rizinorg / rizin

UNIX-like reverse engineering framework and command-line toolset.
https://rizin.re
GNU Lesser General Public License v3.0

PLT Trampoline functions are not renamed with a name containing the target function name #3233

Open alessandrocarminati opened 1 year ago

alessandrocarminati commented 1 year ago
Questions | Answers
OS/arch/bits (mandatory) | Ubuntu x86 64
File format of the file you reverse (mandatory) | ELF
Architecture/bits of the file (mandatory) | aarch64
rizin -v full output, not truncated (mandatory) | rizin 0.5.0 @ linux-x86-64

Expected behavior

Don't blame me, I'm not even sure this is a bug. It is, however, something I expected and cannot get from rizin. To be fair, I had the same issue with radare2, where I opened a bug (https://github.com/radareorg/radare2/issues/21117).

Let me summarize. I'm automating a few static analysis tasks through the pipe interface. In this particular analysis I ran into an unusual case where a function is both implemented in the binary and requested as an external symbol. The case is malloc in glibc: the library implements it, yet wherever it is needed it is called through the PLT as if it were an external dependency. This is intentional, since it allows the malloc functions to be replaced with another allocator.

That said, here is my problem. When rizin prints the disassembly of a calling function, I expect to see something like sym.imp.malloc, but instead I get something like fcn.00026ee0. Looking inside the trampoline function, I can see that the emulation identified the target as malloc (the information is written in a comment), but since I'm automating the process, a comment is far less usable than a function name.

In the radare issue I saw that the PLT is not tagged properly; in rizin I did not notice the same problem, which is why I'm less confident that this is a bug. Still, naming the trampoline function after its target is so widespread that even plain binutils objdump does it.

$ aarch64-linux-gnu-objdump -dS /tmp/libc.so.6 

/tmp/libc.so.6:     file format elf64-littleaarch64
[...]
000000000006e4f0 <__fopen_internal>:
   6e4f0:   a9bc7bfd    stp x29, x30, [sp, #-64]!
   6e4f4:   910003fd    mov x29, sp
   6e4f8:   a90153f3    stp x19, x20, [sp, #16]
   6e4fc:   2a0203f4    mov w20, w2
   6e500:   a9025bf5    stp x21, x22, [sp, #32]
   6e504:   aa0103f6    mov x22, x1
   6e508:   f9001bf7    str x23, [sp, #48]
   6e50c:   aa0003f7    mov x23, x0
   6e510:   d2803b00    mov x0, #0x1d8                  // #472
   6e514:   97fee273    bl  26ee0 <malloc@plt>
[...]
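
Until the names are resolved by the analysis itself, the objdump listing above could serve as a stopgap for automation. The following is only a minimal sketch under stated assumptions: it parses objdump text for `<name@plt>` call targets and builds one rename command per stub. The helper names (`plt_map_from_objdump`, `rename_commands`) and the `sym.imp.` prefix are illustrative choices, and the generated `afn <name> @ <addr>` strings assume rizin's rename-function command combined with a temporary seek; verify the syntax against your rizin version before sending them through the pipe.

```python
import re

# Matches objdump call sites such as "bl  26ee0 <malloc@plt>".
PLT_CALL = re.compile(r"\bbl\s+([0-9a-f]+)\s+<([^>@]+)@plt>")


def plt_map_from_objdump(text):
    """Return {stub_address: imported_name} for every call to a PLT stub."""
    stubs = {}
    for match in PLT_CALL.finditer(text):
        stubs[int(match.group(1), 16)] = match.group(2)
    return stubs


def rename_commands(stubs, prefix="sym.imp."):
    """Build one rename command string per stub (assumed syntax, see lead-in)."""
    return [
        f"afn {prefix}{name} @ 0x{addr:x}"
        for addr, name in sorted(stubs.items())
    ]


# Two lines copied from the objdump listing in this report.
sample = """
   6e510:   d2803b00    mov x0, #0x1d8                  // #472
   6e514:   97fee273    bl  26ee0 <malloc@plt>
"""

stubs = plt_map_from_objdump(sample)
commands = rename_commands(stubs)
print(commands)
```

This only covers stubs that are actually called with `bl` somewhere in the listing, so it is a workaround for scripts and not a replacement for proper PLT handling.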

Actual behavior

What I got from rizin is:

$ ./build/binrz/rizin/rizin /tmp/libc.so.6
WARNING: No calling convention defined for this file, analysis may be inaccurate.
[0x0002c980]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls
[x] Analyze len bytes of instructions for references
[x] Check for classes
[x] Finding xrefs in noncode section with analysis.in=io.maps
[x] Analyze value pointers (aav)
[x] Value from 0x00000000 to 0x00187ea8 (aav)
[x] 0x00000000-0x00187ea8 in 0x0-0x187ea8 (aav)
[x] Emulate functions to find computed references
[x] Analyze local variables and arguments
[x] Type matching analysis for all functions
[x] Applied 0 FLIRT signatures via sigdb
[x] Propagate noreturn information
[x] Resolve pointers to data sections
[x] Use -AA or aaaa to perform additional experimental analysis.
[0x0002c980]> s sym.__fopen_internal
[0x0006e4f0]> pdf
            ; CODE XREF from sym.fopen @ 0x6e604
┌ sym.__fopen_internal ();
│           ; var int64_t var_10h @ sp-0x30
│           ; var int64_t var_20h @ sp-0x20
│           ; var int64_t var_30h @ sp-0x10
│           0x0006e4f0      stp   x29, x30, [sp, -0x40]!
│           0x0006e4f4      mov   x29, sp
│           0x0006e4f8      stp   x19, x20, [sp, 0x10]
│           0x0006e4fc      mov   w20, w2
│           0x0006e500      stp   x21, x22, [sp, 0x20]
│           0x0006e504      mov   x22, x1
│           0x0006e508      str   x23, [sp, 0x30]
│           0x0006e50c      mov   x23, x0
│           0x0006e510      movz  x0, aav.0x000001d8                   ; "L[\x15"
│           0x0006e514      bl    fcn.00026ee0                         ; fcn.00026ee0
│       ┌─< 0x0006e518      cbz   x0, 0x6e5f8
│       │   0x0006e51c      add   x5, x0, 0xe0                         ; 0x2b8
[...]
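
For a script that consumes this disassembly as text, one possible post-processing step is to substitute the generic `fcn.XXXXXXXX` names with import names taken from an external address-to-name map. This is a rough sketch, not something rizin provides: the `annotate` helper and the `known_stubs` map are hypothetical, and the map would have to come from another source such as the objdump output.

```python
import re

# Generic function names as printed by rizin, e.g. "fcn.00026ee0".
FCN_NAME = re.compile(r"\bfcn\.([0-9a-f]{8})\b")


def annotate(disasm, stubs, prefix="sym.imp."):
    """Replace fcn.<addr> with <prefix><name> when <addr> is a known stub."""
    def swap(match):
        name = stubs.get(int(match.group(1), 16))
        # Leave unknown functions untouched.
        return f"{prefix}{name}" if name else match.group(0)
    return FCN_NAME.sub(swap, disasm)


# Hypothetical map; only the malloc entry is taken from this report.
known_stubs = {0x26ee0: "malloc"}

line = "0x0006e514      bl    fcn.00026ee0                         ; fcn.00026ee0"
annotated = annotate(line, known_stubs)
print(annotated)
```

Names with no entry in the map are left as they are, so the output stays usable even when the map is incomplete.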

Notes

That said, I see unresolved "PLT" issues in both radare and rizin, and I wonder whether the problem I think I'm seeing is related to them.

https://github.com/radareorg/radare2/issues/21117       Dec  1, 2022    PLT entries aren't flagged properly 
https://github.com/radareorg/radare2/issues/17523       Aug 25, 2020    PLT stub names not being resolved properly
https://github.com/radareorg/radare2/issues/17300       Jul 17, 2020    Elf - incorrect information about PLT entries
https://github.com/rizinorg/rizin/issues/153            Dec 10, 2020    PLT stub names not being resolved properly

Anyhow, I would really like to get a comment on this issue.

I attach the glibc file I tested (libc.so.6.gz), but I believe this behavior is visible with any glibc build, including on more common architectures such as x86_64.

ret2libc commented 1 year ago

This is very likely a relocation issue; unfortunately, relocations are not handled consistently yet. We have already looked briefly at these kinds of issues, and we should definitely work on them at some point, as they cause several problems.

Thanks for filing this issue! I'll try to have a better look and maybe work on a solution in the next few days.