open-education-hub / operating-systems

Teaching resources (OER) for Operating Systems
https://open-education-hub.github.io/operating-systems/

software-stack/Library calls vs system calls: Additional details would be needed #193

Open AnghelAndrei28 opened 1 year ago

AnghelAndrei28 commented 1 year ago

In the example that runs ltrace with the arguments -x "malloc" ls, the output shows the calls to malloc made by the ls executable. I tried the same command on an executable from the lab (e.g. call) and these were the results:

student@os:~/.../lab/support/libcall-syscall$ ltrace -x "malloc" ./call
fopen("a.txt", "wt" <unfinished ...>
malloc@libc.so.6(472) = 0x562983d1e2a0
<... fopen resumed> ) = 0x562983d1e2a0
strlen("Hello, world!\n") = 14
fwrite("Hello, world!\n", 1, 14, 0x562983d1e2a0 <unfinished ...>
malloc@libc.so.6(4096) = 0x562983d1e480
<... fwrite resumed> ) = 14
fflush(0x562983d1e2a0) = 0
+++ exited (status 0) +++

I was wondering why other functions are displayed besides malloc. Thank you!

darius-m commented 1 year ago

I think that for local (lab) binaries, the compilation flags also include -z lazy, which makes the linker use lazy binding and lets ltrace work properly: it can see function calls that go through the global offset table (GOT). The default behavior on Ubuntu has changed, so builds default to -z now and/or -z relro instead, which breaks ltrace.

The -x flag is meant to trace calls to external functions, so it probably displays the calls to malloc@libc.so.6 in addition to the normal library calls (malloc is used behind the scenes by fopen, fwrite and others).
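For reference, the trace in the question is consistent with a program along the lines of the sketch below. This is only a hypothetical reconstruction inferred from the ltrace output: the lab's actual call source may differ, write_greeting is a made-up name, and the fclose call is an addition that does not appear in the trace. The allocation sizes mentioned in the comments are the ones from that trace and are specific to that glibc build.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reconstruction of the lab's `call` program, based only on
 * the ltrace output in the question; the real source may differ. */
int write_greeting(const char *path)
{
    const char *msg = "Hello, world!\n";
    size_t written;
    FILE *f;

    assert(path != NULL);

    /* fopen() allocates the FILE object with malloc():
     * this is the malloc(472) nested inside fopen in the trace. */
    f = fopen(path, "wt");
    if (f == NULL)
        return -1;

    /* The first write makes libc allocate the stream buffer:
     * this is the malloc(4096) nested inside fwrite in the trace. */
    written = fwrite(msg, 1, strlen(msg), f);

    fflush(f);
    fclose(f);   /* not in the trace; added here to release the stream */

    return written == strlen(msg) ? 0 : -1;
}
```

Calling write_greeting("a.txt") from main and running the result under ltrace -x "malloc" should then show the library calls with the malloc entries nested inside them, provided ltrace can see the library calls of that binary at all.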

The system binaries are likely compiled using -z now, which makes ltrace unable to properly identify all library function calls. This topic is also discussed in #183.
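One way to check which binding mode a binary was linked with is to look at its dynamic section: -z now is recorded there as DT_BIND_NOW, as the DF_BIND_NOW bit in DT_FLAGS, or as the DF_1_NOW bit in DT_FLAGS_1. The sketch below does this from inside the program itself. It assumes Linux with glibc-style headers and a dynamically linked executable; the function names are made up for illustration, and it does not account for the LD_BIND_NOW environment variable, which forces immediate binding at run time.

```c
#include <assert.h>
#include <link.h>    /* ElfW() and the ELF dynamic-section definitions */
#include <stddef.h>

/* Start of this program's dynamic section, provided by the linker for
 * dynamically linked ELF executables. */
extern ElfW(Dyn) _DYNAMIC[];

/* Returns 1 if the DT_NULL-terminated dynamic section `dyn` requests
 * immediate binding (-z now), 0 otherwise (lazy binding). */
int dyn_requests_bind_now(const ElfW(Dyn) *dyn)
{
    assert(dyn != NULL);

    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_BIND_NOW)
            return 1;
        if (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_BIND_NOW))
            return 1;
        if (dyn->d_tag == DT_FLAGS_1 && (dyn->d_un.d_val & DF_1_NOW))
            return 1;
    }
    return 0;
}

/* Convenience wrapper: inspects the running executable. */
int linked_with_bind_now(void)
{
    return dyn_requests_bind_now(_DYNAMIC);
}
```

Printing linked_with_bind_now() from main and building the program once with gcc -z lazy and once with gcc -z now should give different answers for the two binaries.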

molecula2788 commented 1 year ago

I would say the issue is not caused by -z now, but rather by the fact that the binaries have been compiled with support for Intel Control-flow Enforcement Technology (CET).

This causes the .plt section to be split in two: .plt and .plt.sec. The first call to a function goes through both .plt.sec and .plt, while subsequent calls go only through .plt.sec. However, ltrace sets its breakpoint inside .plt, so it misses every call but the first. (If you take a look at the output, you won't see any function appearing more than once.)

Example:

#include <stdio.h>

int main() {
    printf("Hello\n");
    printf("World\n");
    return 0;
}

Compile with gcc -z ibtplt -o test test.c, then disassemble:

Disassembly of section .plt:

0000000000001020 <.plt>:
    1020:   ff 35 e2 2f 00 00       pushq  0x2fe2(%rip)        # 4008 <_GLOBAL_OFFSET_TABLE_+0x8>
    1026:   f2 ff 25 e3 2f 00 00    bnd jmpq *0x2fe3(%rip)        # 4010 <_GLOBAL_OFFSET_TABLE_+0x10>
    102d:   0f 1f 00                nopl   (%rax)
    1030:   f3 0f 1e fa             endbr64 
    1034:   68 00 00 00 00          pushq  $0x0
    1039:   f2 e9 e1 ff ff ff       bnd jmpq 1020 <.plt>
    103f:   90                      nop

...

Disassembly of section .plt.sec:

0000000000001050 <puts@plt>:
    1050:   f3 0f 1e fa             endbr64 
    1054:   f2 ff 25 bd 2f 00 00    bnd jmpq *0x2fbd(%rip)        # 4018 <puts@GLIBC_2.2.5>
    105b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

...

0000000000001145 <main>:
    1145:   55                      push   %rbp
    1146:   48 89 e5                mov    %rsp,%rbp
    1149:   48 8d 3d b4 0e 00 00    lea    0xeb4(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
    1150:   e8 fb fe ff ff          callq  1050 <puts@plt>
    1155:   48 8d 3d ae 0e 00 00    lea    0xeae(%rip),%rdi        # 200a <_IO_stdin_used+0xa>
    115c:   e8 ef fe ff ff          callq  1050 <puts@plt>
    1161:   b8 00 00 00 00          mov    $0x0,%eax
    1166:   5d                      pop    %rbp
    1167:   c3                      retq

So what the code calls is the entry in .plt.sec, which then jumps either to the stub in .plt (the first time) or directly to the function inside libc (subsequent calls). However, ltrace sets its breakpoint in .plt (note that the breakpoint address ends in 0x030, matching the stub at offset 0x1030), and therefore misses the second call:

$ ltrace ./test
puts("Hello"Hello
)                                                                                                        = 6
World
+++ exited (status 0) +++
$ ltrace -D77 ./test |& grep 'add_breakpoint.*puts'
DEBUG: proc.c:919: proc_add_breakpoint(pid=3486206, puts@0x558ab2ea9030)
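To check whether a given binary has the split PLT described above without reading the whole disassembly, it is enough to look for a section named .plt.sec in its section header table. The sketch below does that by parsing the ELF file directly. It assumes a 64-bit ELF file on Linux (for elf.h), and elf_has_section is a made-up helper name, not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at `path` has a section called `name`,
 * 0 if it does not, and -1 if the file cannot be read as 64-bit ELF. */
int elf_has_section(const char *path, const char *name)
{
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *names = NULL;
    size_t names_len = 0;
    size_t i;
    int found = -1;
    FILE *f;

    assert(path != NULL && name != NULL);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Read and sanity-check the ELF header. */
    if (fread(&eh, sizeof(eh), 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        goto out;

    /* Load the section header table. */
    sh = malloc((size_t)eh.e_shnum * sizeof(*sh));
    if (sh == NULL ||
        fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sh, sizeof(*sh), eh.e_shnum, f) != eh.e_shnum)
        goto out;

    /* Load the string table that holds the section names. */
    names_len = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(names_len + 1);
    if (names == NULL || names_len == 0 ||
        fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, names_len, f) != names_len)
        goto out;
    names[names_len] = '\0';   /* guard against an unterminated table */

    /* Compare every section name against the requested one. */
    found = 0;
    for (i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_name < names_len &&
            strcmp(names + sh[i].sh_name, name) == 0) {
            found = 1;
            break;
        }
    }

out:
    free(names);
    free(sh);
    fclose(f);
    return found;
}
```

With this helper, elf_has_section("./test", ".plt.sec") returning 1 would indicate that the binary uses the split layout, and therefore that an ltrace which only places breakpoints in .plt is likely to miss repeated calls.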

This issue was actually fixed in upstream ltrace. However, the latest stable version is quite old (0.7.3), so the only option is to compile it ourselves.

Here's the output on a manually compiled version of ltrace:

$ /opt/ltrace/bin/ltrace ./test
puts(0x5564d0720004, 0x7ffd16450a48, 0x7ffd16450a58, 0x7f222d1a4718Hello
)                                                 = 6
puts(0x5564d072000a, 0x5564d26402a0, 0, 0x7f222d0c18f3World
)                                                              = 6
+++ exited (status 0) +++

Now both calls to puts are traced, but the arguments of puts are completely messed up (raw values are shown instead of the strings), and I couldn't figure out why. Maybe somebody else can pick it up from here.