rizinorg / rizin

UNIX-like reverse engineering framework and command-line toolset.
https://rizin.re
GNU Lesser General Public License v3.0

analyze dynamically loaded libraries #4578

Open imyxh opened 4 months ago

imyxh commented 4 months ago

I may be doing something wrong, but when I try to debug an application that calls dlopen(), the loaded library is listed under dm, and I can see all the symbols from dmia, but when I look at the disassembly it's not actually decorated with the symbols. afl doesn't list any flags associated with the corresponding symbols. In the disassembly, all the function calls are just addresses, etc.

I've been away from using rizin for a while so I don't remember when/how we parse symbols into flags and functions and whatnot. Would it be reasonable for aa to bring in any new symbols when run?

This issue seems to be equivalent to https://github.com/rizinorg/cutter/issues/2619

Rot127 commented 4 months ago

It works for me. Maybe you have to continue execution until a libc function or other function from the dynlib is called and the dynamic linker performs the patching? This is what I did below and it works fine.

> rizin ../capstone/build/cstool

[0x002971b0]> ood
Process with PID 8931 started...
...

[0x7d112c488040]> dm
0x00005f94de42c000 - 0x00005f94de6c3000 - usr   2.6M s r-- /home/user/repos/capstone/build/cstool /home/user/repos/capstone/build/cstool ; gs_base
0x00005f94de6c3000 - 0x00005f94de85f000 - usr   1.6M s r-x /home/user/repos/capstone/build/cstool /home/user/repos/capstone/build/cstool ; sym._init
0x00005f94de85f000 - 0x00005f94df3cf000 - usr  11.4M s r-- /home/user/repos/capstone/build/cstool /home/user/repos/capstone/build/cstool ; obj._IO_stdin_used
0x00005f94df3cf000 - 0x00005f94df585000 - usr   1.7M s rw- /home/user/repos/capstone/build/cstool /home/user/repos/capstone/build/cstool ; home_user_repos_capstone_build_cstool.rw
0x00007d112c46b000 - 0x00007d112c46c000 - usr     4K s r-- /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r
0x00007d112c46c000 - 0x00007d112c494000 * usr   160K s r-x /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r_x
0x00007d112c494000 - 0x00007d112c49e000 - usr    40K s r-- /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r.7d112c494000
0x00007d112c49e000 - 0x00007d112c4a2000 - usr    16K s rw- /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.rw
0x00007ffe679a1000 - 0x00007ffe679c3000 - usr   136K s rw- [stack] [stack] ; stack_.rw
0x00007ffe679d1000 - 0x00007ffe679d5000 - usr    16K s r-- [vvar] [vvar] ; vvar_.r
0x00007ffe679d5000 - 0x00007ffe679d7000 - usr     8K s r-x [vdso] [vdso] ; vdso_.r_x

[0x7308128ff040]> s sym.usage_0x29749e 
[0x57bfdb79649e]> db
[0x57bfdb79649e]> dc
hit breakpoint at: 0x57bfdb79649e

[0x57bfdb79649e]> dm
0x000057bfdb4ff000 - 0x000057bfdb796000 - usr   2.6M s r-- /home/user/repos/capstone/build/unk0
0x00007308126d7000 - 0x00007308126ff000 - usr   160K s r-- /usr/lib64/libc.so.6 /usr/lib64/libc.so.6
...
# Note how libc is loaded now
...
0x00007308126ff000 - 0x000073081286c000 - usr   1.4M s r-x /usr/lib64/libc.so.6 /usr/lib64/libc.so.6
0x000073081286c000 - 0x00007308128ba000 - usr   312K s r-- /usr/lib64/libc.so.6 /usr/lib64/libc.so.6
0x00007308128ba000 - 0x00007308128be000 - usr    16K s r-- /usr/lib64/libc.so.6 /usr/lib64/libc.so.6
0x00007308128be000 - 0x00007308128c0000 - usr     8K s rw- /usr/lib64/libc.so.6 /usr/lib64/libc.so.6
0x00007308128c0000 - 0x00007308128ca000 - usr    40K s rw- unk1 unk1
0x00007308128e2000 - 0x00007308128e3000 - usr     4K s r-- /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r
0x00007308128e3000 - 0x000073081290b000 - usr   160K s r-x /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r_x
0x000073081290b000 - 0x0000730812915000 - usr    40K s r-- /usr/lib64/ld-linux-x86-64.so.2 /usr/lib64/ld-linux-x86-64.so.2 ; usr_lib64_ld_linux_x86_64.so.2.r.73081290b000
...

[0x57bfdb79649e]> aa
[0x57bfdb79649e]> aflm~printf
    sym.imp.printf
...
    sym.imp.printf
    sym.imp.printf
    sym.imp.printf
    sym.imp.printf
    sym.imp.printf
    sym.cs_snprintf
sym.printFloat_0x2a2d93:
sym.printFloatBang_0x2a2de3:
sym.printf32mem_0x2a3c1c:
sym.printf64mem_0x2a3cae:
sym.printf80mem_0x2a3ce4:
sym.printf128mem_0x2a3d1a:
...

[0x57bfdb79649e]> pdf @ sym.printf64mem_0x2a3cae

┌ sym.printf64mem_0x2a3cae(int64_t arg1, int64_t arg2, int64_t arg3);
│           ; arg int64_t arg1 @ rdi
│           ; arg int64_t arg2 @ rsi
│           ; arg int64_t arg3 @ rdx
│           ; var int64_t var_20h @ stack - 0x20
│           ; var int64_t var_14h @ stack - 0x14
│           ; var int64_t var_10h @ stack - 0x10
│           0x57bfdb7a2cae      push  rbp
│           0x57bfdb7a2caf      mov   rbp, rsp
│           0x57bfdb7a2cb2      sub   rsp, 0x20
│           0x57bfdb7a2cb6      mov   qword [var_10h], rdi             ; arg1
│           0x57bfdb7a2cba      mov   dword [var_14h], esi             ; arg2
│           0x57bfdb7a2cbd      mov   qword [var_20h], rdx             ; arg3
│           0x57bfdb7a2cc1      mov   rax, qword [var_10h]
│           0x57bfdb7a2cc5      mov   byte [rax + 0x328], 8
│           0x57bfdb7a2ccc      mov   rdx, qword [var_20h]
│           0x57bfdb7a2cd0      mov   ecx, dword [var_14h]
│           0x57bfdb7a2cd3      mov   rax, qword [var_10h]
│           0x57bfdb7a2cd7      mov   esi, ecx
│           0x57bfdb7a2cd9      mov   rdi, rax
│           0x57bfdb7a2cdc      call  sym.printMemReference_0x2a5d65   ;[1]
│           0x57bfdb7a2ce1      nop
│           0x57bfdb7a2ce2      leave
└           0x57bfdb7a2ce3      ret

imyxh commented 4 months ago

Hmm, not the same for me. Minimum workable example:

#!/bin/sh

CC=clang

cat > 4578.c <<EOF
#include <dlfcn.h>
int main()
{
    void *h = dlopen("./4578_dl.so", RTLD_NOW | RTLD_GLOBAL);
    void (*f)() = dlsym(h, "hello");
    if (f) f();
    return 0;
}
EOF

cat > 4578_dl.c <<EOF
#include <stdio.h>
void hello()
{
    puts("hello from dl");
}
EOF

$CC -o 4578_dl.so 4578_dl.c -shared
$CC -o 4578 4578.c

I can step all the way past the puts() call and then run aaa without rizin ever realizing that hello() or puts() even exists.

Rot127 commented 4 months ago

Ok, now I can reproduce it as well.

Unfortunately, my linking knowledge is limited, but I suspect Rizin can't detect dlopen() + dlsym() calls. My test above called printf via the PLT; this one just calls a pointer returned by dlsym(). Possibly that is the difference?

I also tried linking it with -fPIE, building with debug symbols, and passing different RTLD_... arguments. Same behavior, but this was just guessing.

My suspicion is that:

  1. the debug symbols of the .so are not loaded on dlopen.
  2. it doesn't analyze the address returned by dlsym() (otherwise it would at least analyze hello() as function).

@thestr4ng3r I think you know way more about the linking code?

wargio commented 4 months ago

What is the output of ldd?

Rot127 commented 4 months ago

4578:
    linux-vdso.so.1 (0x00007ffc16d33000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f2ffd759000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f2ffd964000)

4578_dl.so:
    linux-vdso.so.1 (0x00007fff0b4f6000)
    libc.so.6 => /lib64/libc.so.6 (0x00007da8e4325000)
    /lib64/ld-linux-x86-64.so.2 (0x00007da8e4535000)

wargio commented 4 months ago

So the output is correct. If you add a breakpoint after the lib is loaded, run the binary, and execute dm again, you should see the loaded lib.

imyxh commented 4 months ago

The dm output is fine, but it seems the symbols from the loaded lib aren't getting loaded and analyzed or something. In the example I gave the hello() function doesn't get flagged.

brightprogrammer commented 4 months ago

That is probably because the library file is not analysed when it is loaded; it is only opened for debugging. To see symbols from libraries that get loaded, you first need to load the linked libraries whose symbols you want to see. This must be done before you reopen the binary in debugging mode using ood.
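However the library's symbols end up being loaded, they have to be rebased onto the address where the object is mapped before they can decorate the disassembly. The sketch below shows only that arithmetic and does not describe Rizin's internals. The helper names lib_base and runtime_addr are made up for illustration. It assumes Linux with /proc, and that an object's lowest mapping starts at its load base. It also assumes the hex suffix in sym.printf64mem_0x2a3cae is the function's unrebased address.

```shell
# Sketch: rebase a symbol's offset onto the address where its object is mapped.
# Assumption: the object's lowest mapping starts at its load base, which is the
# usual case for ELF shared objects and PIE executables on Linux.

# lib_base PID PATH: start address of the first mapping of PATH in /proc/PID/maps
# (hypothetical helper; paths containing spaces are not handled).
lib_base() {
    awk -v lib="$2" '$6 == lib { split($1, range, "-"); print "0x" range[1]; exit }' "/proc/$1/maps"
}

# runtime_addr BASE OFFSET: both arguments are hex strings with a 0x prefix.
runtime_addr() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Values from the session earlier in this thread: the first cstool mapping in
# the second dm listing starts at 0x57bfdb4ff000, and the symbol name carries
# the suffix 0x2a3cae.
runtime_addr 0x57bfdb4ff000 0x2a3cae
```

The last line prints 0x57bfdb7a2cae, which matches the address pdf shows for the first instruction of sym.printf64mem_0x2a3cae. For a dlopen()ed library the same sum would apply, with the base taken from the library's mapping (for example via lib_base) and the offset from its symbol table.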