radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

SIGSEGV when debugging large binary with DWARF data #22569

Open mcd1992 opened 9 months ago

mcd1992 commented 9 months ago

Environment

Tue Feb  6 01:55:06 PM CST 2024
radare2 5.8.9 31646 @ linux-x86-64
birth: git.5.8.8-1043-g7d8bad5ba1 2024-02-05__14:32:26
commit: 7d8bad5ba11b19e4a3520d3aeaf1796ce6b4efd0
options: gpl -O? cs:5 cl:2 make
Linux x86_64

Also just updated and tested with latest commit below
radare2 5.8.9 31662 @ linux-x86-64
birth: git.5.8.8-1049-g098669591c 2024-02-06__13:58:11
commit: 098669591ca0327619fd2df572ca81d2dfe50ec0
options: gpl -O? cs:5 cl:2 make

Description

When opening a large (2.3G) ELF binary with DWARF symbols, radare2 consumes over 8G of RAM and gets OOM-killed. If I set a soft ulimit and run it again, it hits a SIGSEGV in a memcpy call; see the GDB notes at the bottom. I'm not sure why the debug symbols for libr_util.so aren't showing up; I'm guessing it's something weird with the macros in the hashtable C source? Let me know if there's any extra info I can get from GDB.

Test

The binary is large so I can't easily distribute it, but it can be downloaded/made:

/palworld/Pal/Binaries/Linux/merged: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.10.93, BuildID[xxHash]=a37ec78630b980cc, with debug_info, not stripped

You'll need to get the old binaries for the Palworld dedicated server with something like https://github.com/SteamRE/DepotDownloader, along with the old manifest files: https://steamdb.info/depot/2394012/history/?changeid=M:4603741190199642564

Then eu-unstrip the executable and the .debug ELF together and just run r2 merged.bin

GDB output

pwndbg> bt
#0  0x00007ffff5c491a1 in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff7cff328 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#2  0x00007ffff7cfef84 in internal_ht_grow () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#3  0x00007ffff7cff13c in check_growing () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#4  0x00007ffff7cff334 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#5  0x00007ffff7d08d74 in sdb_ht_insert_kvp () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#6  0x00007ffff7d1470e in sdb_set_internal () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#7  0x00007ffff7d147bf in sdb_set () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#8  0x00007ffff7d13843 in sdb_add () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#9  0x00007ffff5e34f53 in add_sdb_addrline (s=0x555555723060, addr=111037128, file=0x555670fd3190 "Runtime/Core/Public\\Containers/BitArray.h", line=0, column=3, mode=2, 
    print=0x7ffff7e5a7b9 <r_cons_printf>) at dwarf.c:1023
#10 0x00007ffff5e35739 in parse_spec_opcode (bin=0x5555555a5020, 
    obuf=0x7fffc35c6497 "\005\030\006\003\271\016\202\005\b\006<\005\027\006m\005\004\003wX\005\021K\005\v9\005\003\006.\005$\006\003\024.\004\367\001\005/\003\333qf\005\n\006\202\004\221\001\0058\006\003\250\016t\005\027x\005\a\006.\005\032\006\003F<\004\236\001\005\004\003\217~.\006\003\333st\004\221\001\005\032\006\003\226\016\272\004\236\001\005\004\003\217~.\006\003\333sf\004\221\001\005$\006\003\311\016t\005", len=82079570, hdr=0x7fffffffd2c0, regs=0x7fffffffd280, opcode=102 'f', mode=2) at dwarf.c:1156
#11 0x00007ffff5e35f4e in parse_opcodes (bin=0x5555555a5020, obuf=0x7fffc35c6322 "\004\236\001", len=82079570, hdr=0x7fffffffd2c0, regs=0x7fffffffd280, mode=2) at dwarf.c:1326
#12 0x00007ffff5e362fb in parse_line_raw (a=0x5555555a5020, obuf=0x7fffc0533010 "WE\001", len=133014489, mode=2, be=false) at dwarf.c:1400
#13 0x00007ffff5e3a42e in r_bin_dwarf_parse_line (bin=0x5555555a5020, mode=2) at dwarf.c:2603
#14 0x00007ffff77d22a4 in bin_dwarf (core=0x7ffff5a8e010, pj=0x0, mode=2) at cbin.c:1161
#15 0x00007ffff77e1608 in r_core_bin_info (core=0x7ffff5a8e010, action=5263359, pj=0x0, mode=2, va=1, filter=0x0, chksum=0x0) at cbin.c:4750
#16 0x00007ffff77ce9c9 in r_core_bin_set_env (r=0x7ffff5a8e010, binfile=0x55555571f190) at cbin.c:316
#17 0x00007ffff7789f6f in r_core_file_load_for_io_plugin (r=0x7ffff5a8e010, baseaddr=18446744073709551615, loadaddr=0) at cfile.c:450
#18 0x00007ffff778a81a in r_core_bin_load (r=0x7ffff5a8e010, filenameuri=0x555555798480 "/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged", baddr=18446744073709551615) at cfile.c:658
#19 0x00007ffff60f1161 in binload (r=0x7ffff5a8e010, filepath=0x555555798480 "/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged", baddr=18446744073709551615) at radare2.c:547
#20 0x00007ffff60f4651 in r_main_radare2 (argc=2, argv=0x7fffffffdca8) at radare2.c:1488
#21 0x00005555555556fd in main (argc=2, argv=0x7fffffffdca8) at radare2.c:118
#22 0x00007ffff5b18cd0 in ?? () from /usr/lib/libc.so.6
#23 0x00007ffff5b18d8a in __libc_start_main () from /usr/lib/libc.so.6
#24 0x0000555555555135 in _start ()
pwndbg> ctx
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]────────────────────────────────────────────────────────────────────
*RAX  0x28
*RBX  0x5555555a5020 —▸ 0x55555557d3f0 ◂— '/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged'
*RCX  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*RDX  0x28
*RDI  0x28
*RSI  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*R8   0xffffffff
*R9   0x0
*R10  0x55570dc40e50 ◂— 0x0
*R11  0x55570dc41000
*R12  0x0
*R13  0x7fffffffdcc0 —▸ 0x7fffffffe18d ◂— 'SHELL=/bin/bash'
*R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe2d0 —▸ 0x555555554000 ◂— 0x10102464c457f
*R15  0x555555557c58 —▸ 0x5555555551b0 ◂— endbr64 
*RBP  0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
*RSP  0x7fffffffce08 —▸ 0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
*RIP  0x7ffff5c491a1 ◂— vmovdqu ymmword ptr [rdi], ymm0
─────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]─────────────────────────────────────────────────────────────────────────────
 ► 0x7ffff5c491a1    vmovdqu ymmword ptr [rdi], ymm0
   0x7ffff5c491a5    vmovdqu ymmword ptr [rdi + rdx - 0x20], ymm1
   0x7ffff5c491ab    vzeroupper 
   0x7ffff5c491ae    ret    

   0x7ffff5c491af    nop    
   0x7ffff5c491b0    cmp    edx, 0x10
   0x7ffff5c491b3    jae    0x7ffff5c491e2                <0x7ffff5c491e2>
    ↓
   0x7ffff5c491e2    vmovdqu xmm0, xmmword ptr [rsi]
   0x7ffff5c491e6    vmovdqu xmm1, xmmword ptr [rsi + rdx - 0x10]
   0x7ffff5c491ec    vmovdqu xmmword ptr [rdi], xmm0
   0x7ffff5c491f0    vmovdqu xmmword ptr [rdi + rdx - 0x10], xmm1
──────────────────────────────────────────────────────────────────────────────────────────[ STACK ]───────────────────────────────────────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffce08 —▸ 0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
01:0008│-030 0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
02:0010│-028 0x7fffffffce18 ◂— 0xdbb500ffffffff
03:0018│-020 0x7fffffffce20 —▸ 0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
04:0020│-018 0x7fffffffce28 —▸ 0x555641c41cd0 —▸ 0x5556c1107d00 —▸ 0x5555b700f8f0 —▸ 0x5555b3997750 ◂— ...
05:0028│-010 0x7fffffffce30 —▸ 0x5555bb54bed0 —▸ 0x5555726c5cf0 ◂— '0x5dc3f80'
06:0030│-008 0x7fffffffce38 ◂— 0x28 /* '(' */
07:0038│ rbp 0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────────────────────────────────────────────────────────
 ► 0   0x7ffff5c491a1
   1   0x7ffff7cff328 ht_pp_insert_kv+97
   2   0x7ffff7cfef84 internal_ht_grow+230
   3   0x7ffff7cff13c check_growing+42
   4   0x7ffff7cff334 ht_pp_insert_kv+109
   5   0x7ffff7d08d74 sdb_ht_insert_kvp+44
   6   0x7ffff7d1470e sdb_set_internal+1039
   7   0x7ffff7d147bf sdb_set+54
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
pwndbg> frame 1
#1  0x00007ffff7cff328 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
pwndbg> ctx
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]────────────────────────────────────────────────────────────────────
*RAX  0x28
*RBX  0x5555555a5020 —▸ 0x55555557d3f0 ◂— '/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged'
*RCX  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*RDX  0x28
*RDI  0x28
*RSI  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*R8   0xffffffff
*R9   0x0
*R10  0x55570dc40e50 ◂— 0x0
*R11  0x55570dc41000
*R12  0x0
*R13  0x7fffffffdcc0 —▸ 0x7fffffffe18d ◂— 'SHELL=/bin/bash'
*R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe2d0 —▸ 0x555555554000 ◂— 0x10102464c457f
*R15  0x555555557c58 —▸ 0x5555555551b0 ◂— endbr64 
*RBP  0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
*RSP  0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
*RIP  0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
─────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]─────────────────────────────────────────────────────────────────────────────
   0x7ffff7cff323 <ht_pp_insert_kv+92>     call   0x7ffff7c547e0 <memcpy@plt>
 ► 0x7ffff7cff328 <ht_pp_insert_kv+97>     mov    rax, qword ptr [rbp - 0x18]
   0x7ffff7cff32c <ht_pp_insert_kv+101>    mov    rdi, rax
   0x7ffff7cff32f <ht_pp_insert_kv+104>    call   check_growing                <check_growing>

   0x7ffff7cff334 <ht_pp_insert_kv+109>    mov    eax, 1
   0x7ffff7cff339 <ht_pp_insert_kv+114>    jmp    ht_pp_insert_kv+121                <ht_pp_insert_kv+121>

   0x7ffff7cff33b <ht_pp_insert_kv+116>    mov    eax, 0
   0x7ffff7cff340 <ht_pp_insert_kv+121>    leave  
   0x7ffff7cff341 <ht_pp_insert_kv+122>    ret    

   0x7ffff7cff342 <insert_update>          push   rbp
   0x7ffff7cff343 <insert_update+1>        mov    rbp, rsp
   0x7ffff7cff346 <insert_update+4>        sub    rsp, 0x30
──────────────────────────────────────────────────────────────────────────────────────────[ STACK ]───────────────────────────────────────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
01:0008│-028 0x7fffffffce18 ◂— 0xdbb500ffffffff
02:0010│-020 0x7fffffffce20 —▸ 0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
03:0018│-018 0x7fffffffce28 —▸ 0x555641c41cd0 —▸ 0x5556c1107d00 —▸ 0x5555b700f8f0 —▸ 0x5555b3997750 ◂— ...
04:0020│-010 0x7fffffffce30 —▸ 0x5555bb54bed0 —▸ 0x5555726c5cf0 ◂— '0x5dc3f80'
05:0028│-008 0x7fffffffce38 ◂— 0x28 /* '(' */
06:0030│ rbp 0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
07:0038│+008 0x7fffffffce48 —▸ 0x7ffff7cfef84 (internal_ht_grow+230) ◂— add dword ptr [rbp - 0x94], 1
────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────────────────────────────────────────────────────────
   0   0x7ffff5c491a1
 ► 1   0x7ffff7cff328 ht_pp_insert_kv+97
   2   0x7ffff7cfef84 internal_ht_grow+230
   3   0x7ffff7cff13c check_growing+42
   4   0x7ffff7cff334 ht_pp_insert_kv+109
   5   0x7ffff7d08d74 sdb_ht_insert_kvp+44
   6   0x7ffff7d1470e sdb_set_internal+1039
   7   0x7ffff7d147bf sdb_set+54
   8   0x7ffff7d13843 sdb_add+76
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
mcd1992 commented 9 months ago

Note that it doesn't happen immediately, and there are some debug notes a few minutes before it segfaults. I might add some extra R_LOG_DEBUGs to libr/bin/dwarf.c or shlr/sdb/src/ht.inc.c and see where it's happening. If I open with -nn it doesn't happen, but doing an oob afterwards will trigger it.

DEBUG: empty symbol name
DEBUG: Symbol name outside the strtab section
DEBUG: Truncated corrupted section name: conditional<false, const Eigen::CwiseUnaryOp<Eigen::internal::scalar_conjugate_op<double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix
DEBUG: Truncated corrupted section name: conditional<false, const Eigen::CwiseUnaryOp<Eigen::internal::scalar_real_op<double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix<doub
DEBUG: Truncated corrupted section name: conditional<false, Eigen::CwiseUnaryView<Eigen::internal::scalar_real_ref_op<double>, Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix<double, -1
DEBUG: Truncated corrupted section name: conditional<false, Eigen::CwiseUnaryOp<Eigen::internal::scalar_conjugate_op<double>, const Eigen::Transpose<const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<co
trufae commented 9 months ago

Well, this is an out-of-memory problem triggered by your kernel (Debian / Ubuntu?). Although we can optimize the memory usage of the DWARF import, I think it would be better to find alternative solutions.

The assembly you pasted here shows a nullptr + 0x28 delta, so it seems an allocation is failing, but it would be good to know which one it is. And since this depends on the system, I will probably not be able to reproduce it here.

Can you upload that binary somewhere?
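
For illustration, here is a minimal C sketch of how a NULL + 0x28 destination like the one in the register dump could arise, and how a checked append avoids it. This is an assumption about the failure mode, not the actual sdb/ht code: DemoKv, DemoBucket, DemoRealloc and demo_bucket_append are hypothetical names, and the 40-byte record (on LP64) only mirrors the 0x28 copy size seen in RDX. If a bucket already holds one entry and the grow step returns NULL, writing to slot 1 of a NULL array lands at address 0x28; checking the result before computing the slot keeps the bucket intact instead.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value record: 40 bytes (0x28) on LP64 targets. */
typedef struct {
	void *key;
	void *value;
	uint64_t pad[3];
} DemoKv;

/* Hypothetical bucket: a growable array of records. */
typedef struct {
	DemoKv *arr;   /* NULL until the first successful allocation */
	size_t count;  /* number of valid records in arr */
} DemoBucket;

/* Allocator hook so a caller can simulate an out-of-memory condition. */
typedef void *(*DemoRealloc)(void *ptr, size_t size);

/* Stand-in for realloc() under memory pressure: always fails. */
static void *demo_failing_realloc(void *ptr, size_t size) {
	(void)ptr;
	(void)size;
	return NULL;
}

/* Append one record. On allocation failure the bucket is left untouched
 * and false is returned, so no write ever targets NULL + offset. */
static bool demo_bucket_append(DemoBucket *b, const DemoKv *kv, DemoRealloc ra) {
	if (b->count >= SIZE_MAX / sizeof (DemoKv) - 1) {
		return false; /* size computation would overflow */
	}
	DemoKv *grown = ra (b->arr, (b->count + 1) * sizeof (DemoKv));
	if (!grown) {
		return false; /* keep the old array; do not compute a slot from NULL */
	}
	b->arr = grown;
	memcpy (&b->arr[b->count], kv, sizeof (DemoKv));
	b->count++;
	return true;
}
```

Passing demo_failing_realloc makes the append return false and leaves arr and count unchanged; passing the standard realloc makes it succeed.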

mcd1992 commented 9 months ago

It looks like just the split debug file triggers it as well, no need to eu-unstrip. https://gofile.io/d/DNUA0G

This is on Arch's kernel 6.6.10-arch1-1. The whole reason I want to open this, though, is for the DWARF data, so I can generate zignatures / FLIRT for the debug-less versions. I could probably just dump the symbols+address and manually af name them before making zigs.

The same thing happens with a stock Unreal Engine 5 game, the .debug/DWARF file is 2G+ and will cause radare to OOM even on a system with 32G RAM.

trufae commented 9 months ago

I can't reproduce it. On Ubuntu the process gets killed because the kernel is picky and just kills a process when it eats a lot of memory, but I can open this file without issues on macOS after it consumes 50GB of RAM, so this is not a SIGSEGV for me. Btw, after loading all the DWARF info, r2 eats about 16-20GB of RAM. (Screenshot: photo_2024-02-08 02 08 28)

mcd1992 commented 9 months ago

Ah, I haven't tested on a system with more than 32G. Using ulimit -Sv 8000000, or any value smaller than what radare needs to build the full hashtables, should trigger it.

I guess if it's just a side effect of how radare does DWARF parsing, the issue is more about your last 2 bullet points on rewriting the parser/hashtable implementation.
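
A rough in-process equivalent of the ulimit -Sv trick, as a sketch assuming Linux and glibc: bash's ulimit -v sets RLIMIT_AS (in KiB), so lowering the soft RLIMIT_AS from C should make large allocations return NULL in the same way. The helper names (demo_limit_address_space, demo_alloc_fails_under_limit) are made up for this example and are not part of radare2.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Lower the soft address-space limit to `bytes`, saving the previous
 * limits in `saved` so the caller can restore them afterwards. */
static bool demo_limit_address_space(rlim_t bytes, struct rlimit *saved) {
	if (getrlimit (RLIMIT_AS, saved) != 0) {
		return false;
	}
	struct rlimit lim = *saved;
	if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max) {
		bytes = lim.rlim_max; /* the soft limit cannot exceed the hard limit */
	}
	lim.rlim_cur = bytes;
	return setrlimit (RLIMIT_AS, &lim) == 0;
}

/* Return true if malloc(request_bytes) fails while the soft limit is
 * lowered to limit_bytes. The original limits are restored before returning. */
static bool demo_alloc_fails_under_limit(rlim_t limit_bytes, size_t request_bytes) {
	struct rlimit saved;
	if (!demo_limit_address_space (limit_bytes, &saved)) {
		return false;
	}
	/* volatile keeps the compiler from optimizing the malloc/free pair away */
	void *volatile p = malloc (request_bytes);
	bool failed = (p == NULL);
	free (p);
	setrlimit (RLIMIT_AS, &saved);
	return failed;
}
```

For instance, demo_alloc_fails_under_limit ((rlim_t)512 << 20, (size_t)1 << 31) asks for 2 GiB under a 512 MiB limit, which is the kind of failed allocation the hashtable code would need to survive.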

trufae commented 9 months ago

Agreed, the current storage method for DWARF info does not perform well for large files like this, but afaik the crash is not caused by a bug in r2 code. So it will be better to redesign the way this information is stored in r2 instead of depending on a hashtable. Fixing things by throwing more metal at them is not the way to go.
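
To make the idea concrete, one possible alternative to a string-keyed hashtable is a flat array of fixed-size rows sorted by address and queried with a binary search. This is only a sketch of that general technique, not the design planned for r2; DemoLineRow, demo_rows_sort and demo_rows_find are hypothetical names, and file_id assumes file names are interned in a separate table. Each row takes 16 bytes on typical ABIs, with no per-entry key string such as '0x4b6024b'.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One address-to-line mapping. */
typedef struct {
	uint64_t addr;    /* start address of the row */
	uint32_t file_id; /* index into an interned file-name table (assumed) */
	uint32_t line;    /* source line number */
} DemoLineRow;

/* qsort comparator: order rows by ascending address. */
static int demo_row_cmp(const void *a, const void *b) {
	const DemoLineRow *ra = a;
	const DemoLineRow *rb = b;
	return (ra->addr > rb->addr) - (ra->addr < rb->addr);
}

/* Sort the table once after all rows have been appended. */
static void demo_rows_sort(DemoLineRow *rows, size_t n) {
	qsort (rows, n, sizeof (DemoLineRow), demo_row_cmp);
}

/* Return the row with the greatest addr <= target, or NULL if every
 * row starts above target. Requires rows to be sorted. */
static const DemoLineRow *demo_rows_find(const DemoLineRow *rows, size_t n, uint64_t target) {
	size_t lo = 0;
	size_t hi = n;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (rows[mid].addr <= target) {
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return lo ? &rows[lo - 1] : NULL;
}
```

The trade-off is that lookups become O(log n) instead of O(1) on average, in exchange for one contiguous allocation and no per-key heap strings.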

trufae commented 9 months ago

I've introduced a void* to hold private data storage that will replace the current hashtable approach without breaking the ABI promise; this way I can fix this without holding the release any longer. So I'm moving this ticket to 5.9.2 or so :)
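
As a sketch of the general pattern (not the actual r2 structs): once a public struct carries an opaque void* slot, the data behind it can be reshaped in later releases without changing the public layout. DemoBinObj, DemoDwarfPriv and the demo_priv_* helpers below are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Public struct: its layout is what consumers compile against. */
typedef struct {
	int version;
	void *priv; /* opaque slot; callers never dereference it */
} DemoBinObj;

/* Private storage: free to change between releases, because only
 * the library's own code knows this type. */
typedef struct {
	uint64_t *addrs;
	size_t count;
} DemoDwarfPriv;

/* Allocate the private storage and attach it to the public object. */
static bool demo_priv_init(DemoBinObj *o) {
	DemoDwarfPriv *p = calloc (1, sizeof (*p));
	if (!p) {
		return false;
	}
	o->priv = p;
	return true;
}

/* Append one address; returns false if storage is missing or growth fails. */
static bool demo_priv_add(DemoBinObj *o, uint64_t addr) {
	DemoDwarfPriv *p = o->priv;
	if (!p) {
		return false;
	}
	uint64_t *grown = realloc (p->addrs, (p->count + 1) * sizeof (uint64_t));
	if (!grown) {
		return false;
	}
	grown[p->count] = addr;
	p->addrs = grown;
	p->count++;
	return true;
}

/* Number of stored addresses, or 0 when no private storage is attached. */
static size_t demo_priv_count(const DemoBinObj *o) {
	const DemoDwarfPriv *p = o->priv;
	return p ? p->count : 0;
}

/* Release the private storage and clear the slot. */
static void demo_priv_fini(DemoBinObj *o) {
	DemoDwarfPriv *p = o->priv;
	if (p) {
		free (p->addrs);
		free (p);
	}
	o->priv = NULL;
}
```

Swapping DemoDwarfPriv for a different container later would only touch these helpers, which is the property that lets a storage redesign land without an ABI break.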

trufae commented 3 months ago

Moving forward, I've had no time to work on this yet, but I hope I'll be able to do it sooner or later before 6.0 🤞 Help is always welcome btw.