radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

heap inspection in r2 5.7.4 dysfunctional - no dmh* command works as intended #20767

Open ghost opened 2 years ago

ghost commented 2 years ago

Environment

Sat Sep 17 06:51:47 PM CEST 2022
radare2 5.7.4 0 @ linux-x86-64 git.
commit: 5.7.4 build: 2022-09-17__18:09:59
Linux x86_64

Description

The entire suite of dmh* subcommands is useless. I'm sorry I have to say it like this, but really, nothing works as intended. The dmh-related commands have the following issues: 1) They can't find the main arena and complain that "this address is not part of any arenas". I have to check all arenas with dmha, only to be greeted by this:

main_arena @ 0x7f832794f4a0
thread arena @ 0x0
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
<-snip->
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff
thread arena @ 0xffffffffffffffff

I have to scroll all the way up for the main arena address. Sometimes it prints several kilobytes of "thread arena" lines into my terminal, when there are no threads to begin with. 2) dmhb and friends straight up don't work: they show a bunch of "corrupted" bins and chunks when they aren't corrupted. dmh doesn't show any chunks, only the top chunk, but sometimes it shows other chunks; it feels random. 3) "Corrupted" bins after 3 mallocs:

 Bin 015:
 Bin 016:
  double linked list small bin {
    0x7f9fb0cd65e8->fd = 0x0Double linked list corrupted

  }
 Bin 017:
  double linked list small bin {
    0x7f9fb0cd65f8->fd = 0x0Double linked list corrupted

  }
 Bin 018:
 Bin 019:

4) Example of a fast bin chunk of size 100 bytes:

struct malloc_chunk @ 0x55f4de9b2187 {
  prev_size = 0xfffffea1e8c78948,
  size = 0xe8c78948f8458b48,
  flags: |N:0 |M:0 |P:0,
  fd = 0xb8fffffe95,
  bk = 0xec83480000c3c900,
  fd-nextsize = 0xc308c4834808,
  bk-nextsize = 0x0,
}
chunk too big to be displayed
chunk data = 
0x55f4de9b2197  0x000000b8fffffe95  0xec83480000c3c900   .............H..
0x55f4de9b21a7  0x0000c308c4834808  0x0000000000000000   .H..............
0x55f4de9b21b7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b21c7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b21d7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b21e7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b21f7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2207  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2217  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2227  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2237  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2247  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2257  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2267  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2277  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2287  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b2297  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b22a7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b22b7  0x0000000000000000  0x0000000000000000   ................
0x55f4de9b22c7  0x0000000000000000  0x0000000000000000   ................

6) dmhg shows nothing but the top chunk. 7) The main_arena pointer doesn't actually point to the heap, so it looks like a garbage pointer of some sort. 8) You get the picture.
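The chunk shown under 4) can be ruled out with a simple plausibility test: its address ends in 0x7, and its size field is far larger than any heap. What follows is a minimal sketch of such a test, not code from radare2. The helper name `chunk_is_plausible` and the three constants are assumptions of this sketch, chosen for 64-bit glibc on x86-64 (16-byte alignment, 32-byte minimum chunk).

```c
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_FLAG_MASK 0x7ULL  /* low three bits of the size field hold the P, M, N flags */
#define CHUNK_ALIGN     16ULL   /* assumed malloc alignment on x86-64 */
#define CHUNK_MIN_SIZE  32ULL   /* assumed smallest chunk size on x86-64 */

/* Hypothetical helper: returns true if a chunk header at `addr` with raw
 * size field `raw_size` could describe a real chunk that lies entirely
 * inside the heap mapping [heap_start, heap_end). */
bool chunk_is_plausible(uint64_t addr, uint64_t raw_size,
                        uint64_t heap_start, uint64_t heap_end) {
    uint64_t size = raw_size & ~CHUNK_FLAG_MASK;

    if (addr < heap_start || addr >= heap_end) {
        return false;               /* header is outside the heap mapping */
    }
    if (addr % CHUNK_ALIGN != 0) {
        return false;               /* chunk headers are aligned */
    }
    if (size < CHUNK_MIN_SIZE || size % CHUNK_ALIGN != 0) {
        return false;               /* size is too small or misaligned */
    }
    /* Compare against the remaining space; this form cannot overflow. */
    return size <= heap_end - addr;
}
```

With the heap range from the pwndbg vmmap output further down (0x55dabe0e0000 to 0x55dabe101000), the chunks pwndbg lists pass this test, while a header at an address ending in 0x7 fails the alignment check regardless of the heap range.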

I can't use the module/functionality in any useful capacity. The rest of the program works fine. Is there another program I could use to inspect the heap more reliably?

The only real "expected" output I can provide is what you feature in some videos and tutorials, and, for example, output from pwndbg. I am providing output from the pwndbg heap and vmmap commands for comparison:

pwndbg> heap
Allocated chunk | PREV_INUSE
Addr: 0x55dabe0e0000
Size: 0x291

Free chunk (tcache) | PREV_INUSE
Addr: 0x55dabe0e0290
Size: 0x71
fd: 0x55dabe0e0

Allocated chunk | PREV_INUSE
Addr: 0x55dabe0e0300
Size: 0x71

Allocated chunk | PREV_INUSE
Addr: 0x55dabe0e0370
Size: 0x71

Top chunk | PREV_INUSE
Addr: 0x55dabe0e03e0
Size: 0x20c21

pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
    0x55dabcb78000     0x55dabcb79000 r--p     1000 0      /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/heap_spray_learning/a.out
    0x55dabcb79000     0x55dabcb7a000 r-xp     1000 1000   /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/heap_spray_learning/a.out
    0x55dabcb7a000     0x55dabcb7b000 r--p     1000 2000   /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/heap_spray_learning/a.out
    0x55dabcb7b000     0x55dabcb7c000 r--p     1000 2000   /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/heap_spray_learning/a.out
    0x55dabcb7c000     0x55dabcb7d000 rw-p     1000 3000   /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/heap_spray_learning/a.out
    0x55dabe0e0000     0x55dabe101000 rw-p    21000 0      [heap]
    0x7f7742a48000     0x7f7742a4a000 rw-p     2000 0      [anon_7f7742a48]
    0x7f7742a4a000     0x7f7742a72000 r--p    28000 0      /lib64/libc.so.6
    0x7f7742a72000     0x7f7742bdb000 r-xp   169000 28000  /lib64/libc.so.6
    0x7f7742bdb000     0x7f7742c33000 r--p    58000 191000 /lib64/libc.so.6
    0x7f7742c33000     0x7f7742c37000 r--p     4000 1e8000 /lib64/libc.so.6
    0x7f7742c37000     0x7f7742c39000 rw-p     2000 1ec000 /lib64/libc.so.6
    0x7f7742c39000     0x7f7742c43000 rw-p     a000 0      [anon_7f7742c39]
    0x7f7742c5f000     0x7f7742c61000 r--p     2000 0      /lib64/ld-linux-x86-64.so.2
    0x7f7742c61000     0x7f7742c87000 r-xp    26000 2000   /lib64/ld-linux-x86-64.so.2
    0x7f7742c87000     0x7f7742c92000 r--p     b000 28000  /lib64/ld-linux-x86-64.so.2
    0x7f7742c93000     0x7f7742c95000 r--p     2000 33000  /lib64/ld-linux-x86-64.so.2
    0x7f7742c95000     0x7f7742c97000 rw-p     2000 35000  /lib64/ld-linux-x86-64.so.2
    0x7fffecd94000     0x7fffecdb6000 rw-p    22000 0      [stack]
    0x7fffecdcf000     0x7fffecdd3000 r--p     4000 0      [vvar]
    0x7fffecdd3000     0x7fffecdd5000 r-xp     2000 0      [vdso]
0xffffffffff600000 0xffffffffff601000 --xp     1000 0      [vsyscall]

So, I'd expect the radare2 output to be similar to that of pwndbg.
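The pwndbg heap listing above is internally consistent, which makes it a handy reference for checking any other tool. Each size field carries three flag bits in its low bits (here only PREV_INUSE is set, hence 0x291 and 0x71), and adding the real size to a chunk's address gives the next chunk's address. A minimal sketch of that arithmetic follows; the flag values match glibc's malloc.c, while the function names `chunk_size` and `walk_chunks` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define PREV_INUSE     0x1ULL
#define IS_MMAPPED     0x2ULL
#define NON_MAIN_ARENA 0x4ULL
#define SIZE_BITS      (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Real chunk size: the raw size field with the three flag bits cleared. */
uint64_t chunk_size(uint64_t raw_size) {
    return raw_size & ~SIZE_BITS;
}

/* Given the raw size fields of consecutive chunks, starting with the chunk
 * at heap_start, store each chunk's address in out[] and return the address
 * just past the last chunk. */
uint64_t walk_chunks(uint64_t heap_start, const uint64_t *raw_sizes,
                     size_t count, uint64_t *out) {
    uint64_t addr = heap_start;
    for (size_t i = 0; i < count; i++) {
        out[i] = addr;
        addr += chunk_size(raw_sizes[i]);
    }
    return addr;
}
```

Feeding in the five size fields from the listing (0x291, 0x71, 0x71, 0x71, 0x20c21) starting at 0x55dabe0e0000 reproduces the five chunk addresses pwndbg prints, and the walk ends at 0x55dabe101000, the end of the [heap] mapping in the vmmap output.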

The same binary was used as the one attached. It's nothing more than a .c file with 3 mallocs and 3 frees.

Test

1) Compile the attached file. 2) Run it with r2 -d a.out. 3) Step through the program until some mallocs/frees have executed. 4) Run any dmh* command.

cfile.tgz

system

Linux localhost 5.15.41-gentoo-x86_64 #1 SMP Sat Jun 25 12:51:37 CEST 2022 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux

glibc

strings /lib64/libc.so.6 | grep libc.2
glibc 2.35

ghost commented 2 years ago

I also checked with the master branch. Same problem. Also, e dbg.malloc = glibc has been set, as well as e dbg.glibc.demangle = true, and it's still the same thing.
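For context on the demangle setting: since glibc 2.32, tcache and fastbin forward pointers are stored mangled ("safe-linking"), so a tool has to undo the mangling before following them. The sketch below shows the operation as a pair of plain functions. They mirror glibc's PROTECT_PTR and REVEAL_PTR macros, but the function names and the use of plain integers instead of pointers are choices made for this example, and how radare2 applies this internally is not shown in this thread.

```c
#include <stdint.h>

/* Mangle `ptr` for storage at address `pos`: XOR it with the storage
 * address shifted right by 12 bits (same operation as PROTECT_PTR). */
uint64_t protect_ptr(uint64_t pos, uint64_t ptr) {
    return (pos >> 12) ^ ptr;
}

/* Recover the real pointer from the mangled value stored at `pos`
 * (same operation as REVEAL_PTR). XOR is its own inverse. */
uint64_t reveal_ptr(uint64_t pos, uint64_t mangled) {
    return (pos >> 12) ^ mangled;
}
```

This explains the odd-looking `fd: 0x55dabe0e0` in the pwndbg output above: the free tcache chunk sits at 0x55dabe0e0290, its fd field is stored 0x10 bytes later at 0x55dabe0e02a0, and that address shifted right by 12 is 0x55dabe0e0. The real next pointer is therefore NULL, meaning it is the only entry in its tcache bin.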

trufae commented 2 years ago

Thanks for the detailed report. The dmh command isn't actively maintained or tested. It would be good to port the code from pwndbg to support the latest versions of glibc, which I assume is the problem, since the heap structures change over time. Would you like to help with that?

r2frida also has dmh support, but I think it's macOS/iOS specific, as it's not implemented on Linux.

ghost commented 2 years ago

@trufae Yeah, absolutely. I was in the process of writing some heap inspection programs myself anyway, since there are virtually none out there, and the few that exist are too old or don't work with various versions. Give me a basic rundown of what needs to be done and where I should start digging through the sources.

ghost commented 2 years ago

@trufae I have forked the repo and started fixing some of the issues I mentioned. I already fixed the issue with the thread arena prints. If you want to follow along or have any hints or comments, here is the repo link to my devel branch: https://github.com/majorendian/radare2/tree/g0zar_devel
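The actual patch from the fork is not shown in this thread. As background, glibc chains its arenas through a `next` field into a circular list that starts and ends at main_arena, so a single-threaded process has main_arena.next pointing back at main_arena. The following is only a sketch of a walk that terminates cleanly, run against a simulated memory table rather than a live process. `ArenaLink`, `read_next`, `list_thread_arenas` and the `MAX_STEPS` cap are all hypothetical names invented for this example, and treating 0 and all-ones as "stop" values is an assumption based on the dmha output in the report.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_INVALID UINT64_MAX  /* stands in for a failed memory read */
#define MAX_STEPS    1024        /* arbitrary safety cap (assumption) */

/* Simulated memory: the `next` field stored in the arena at `addr`. */
typedef struct {
    uint64_t addr;
    uint64_t next;
} ArenaLink;

/* Look up the `next` field of `arena`; ADDR_INVALID if it is not readable. */
uint64_t read_next(const ArenaLink *mem, size_t mem_len, uint64_t arena) {
    for (size_t i = 0; i < mem_len; i++) {
        if (mem[i].addr == arena) {
            return mem[i].next;
        }
    }
    return ADDR_INVALID;
}

/* Collect the thread arenas reachable from main_arena into out[] and return
 * how many were stored. Stops when the list wraps around to main_arena, when
 * a pointer is 0 or unreadable, or when a cap is reached. */
size_t list_thread_arenas(const ArenaLink *mem, size_t mem_len,
                          uint64_t main_arena, uint64_t *out, size_t cap) {
    size_t n = 0;
    uint64_t cur = read_next(mem, mem_len, main_arena);

    for (size_t steps = 0; steps < MAX_STEPS && n < cap; steps++) {
        if (cur == main_arena || cur == 0 || cur == ADDR_INVALID) {
            break;  /* end of the circular list, or a bogus pointer */
        }
        out[n++] = cur;
        cur = read_next(mem, mem_len, cur);
    }
    return n;
}
```

With this shape, a process without extra threads yields zero thread arenas instead of a run of 0x0 and 0xffffffffffffffff entries.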

ghost commented 2 years ago

I managed to get this working by manually finding the address of the main arena. It turns out radare2 gets the wrong address mapping for libc. Instead of fetching the data segment of libc, it fetches some other segment, depending on how the segments are mapped. It's possible to partially work around this by manually finding the main arena and then running dmhm @ <actuall_address_of_main_arena>, after which the output looks a lot more reasonable. But it periodically breaks again, so I will still need to fix this.

Based on the structures in r_heap_glibc.h, the newest/up-to-date structures seem to be present. So it's very likely that the bug is simply in fetching the wrong segment from the dm mappings for libc. The code in question:

    if (is_debugged) {
        RListIter *iter;
        RDebugMap *map;
        r_debug_map_sync (core->dbg);
        r_list_foreach (core->dbg->maps, iter, map) {
            /* Try to find the main arena address using the glibc's symbols. */
            if ((strstr (map->name, "/libc-") || strstr (map->name, "/libc."))
                    && first_libc && main_arena_sym == GHT_MAX) {
                first_libc = false;
                main_arena_sym = GH (get_main_arena_with_symbol) (core, map);
            }
            if ((strstr (map->name, "/libc-") || strstr (map->name, "/libc."))
                    && map->perm == R_PERM_RW) {
                libc_addr_sta = map->addr;
                libc_addr_end = map->addr_end;
                break;
            }
        }
    } else {

Some extra info: I checked which map gets associated with the main arena. It's an 'unkn' map region right after the libc RW region (the correct one).

[0x7f90d46044a0]> dmha
main_arena @ 0x7f90d46044a0
thread arena @ 0x0
[0x7f90d46044a0]> dm.
0x00007f90d45ff000 - 0x00007f90d4609000 * usr    40K s rw- unk1 unk1
[0x7f90d46044a0]> dm
0x000055fbde276000 - 0x000055fbde277000 - usr     4K s r-- /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out ; loc.imp._ITM_registerTMCloneTable
0x000055fbde277000 - 0x000055fbde278000 - usr     4K s r-x /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out ; map._home_tino_Software_cyberdiskdir_cyber_trainning_research_experiments_first_fit_testing_c_a.out.r_x
0x000055fbde278000 - 0x000055fbde279000 - usr     4K s r-- /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out ; map._home_tino_Software_cyberdiskdir_cyber_trainning_research_experiments_first_fit_testing_c_a.out.r__
0x000055fbde279000 - 0x000055fbde27a000 - usr     4K s r-- /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out ; map._home_tino_Software_cyberdiskdir_cyber_trainning_research_experiments_first_fit_testing_c_a.out.rw_
0x000055fbde27a000 - 0x000055fbde27b000 - usr     4K s rw- /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out /home/tino/Software/cyberdiskdir/cyber/trainning/research/experiments/first_fit_testing/c/a.out ; obj._GLOBAL_OFFSET_TABLE_
0x000055fbdfdd3000 - 0x000055fbdfdf4000 - usr   132K s rw- [heap] [heap]
0x00007f90d440d000 - 0x00007f90d4410000 - usr    12K s rw- unk0 unk0
0x00007f90d4410000 - 0x00007f90d4438000 - usr   160K s r-- /lib64/libc.so.6 /lib64/libc.so.6
0x00007f90d4438000 - 0x00007f90d45a1000 - usr   1.4M s r-x /lib64/libc.so.6 /lib64/libc.so.6
0x00007f90d45a1000 - 0x00007f90d45f9000 - usr   352K s r-- /lib64/libc.so.6 /lib64/libc.so.6
0x00007f90d45f9000 - 0x00007f90d45fd000 - usr    16K s r-- /lib64/libc.so.6 /lib64/libc.so.6
0x00007f90d45fd000 - 0x00007f90d45ff000 - usr     8K s rw- /lib64/libc.so.6 /lib64/libc.so.6
0x00007f90d45ff000 - 0x00007f90d4609000 * usr    40K s rw- unk1 unk1
0x00007f90d4626000 - 0x00007f90d4628000 - usr     8K s r-- /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2
0x00007f90d4628000 - 0x00007f90d464e000 - usr   152K s r-x /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2 ; map._lib64_ld_linux_x86_64.so.2.r_x
0x00007f90d464e000 - 0x00007f90d4659000 - usr    44K s r-- /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2 ; map._lib64_ld_linux_x86_64.so.2.r__
0x00007f90d465a000 - 0x00007f90d465c000 - usr     8K s r-- /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2 ; map._lib64_ld_linux_x86_64.so.2.rw_
0x00007f90d465c000 - 0x00007f90d465e000 - usr     8K s rw- /lib64/ld-linux-x86-64.so.2 /lib64/ld-linux-x86-64.so.2 ; r15
0x00007ffd5d132000 - 0x00007ffd5d154000 - usr   136K s rw- [stack] [stack] ; map._stack_.rw_
0x00007ffd5d16e000 - 0x00007ffd5d172000 - usr    16K s r-- [vvar] [vvar] ; map._vvar_.r__
0x00007ffd5d172000 - 0x00007ffd5d174000 - usr     8K s r-x [vdso] [vdso] ; map._vdso_.r_x
0xffffffffff600000 - 0xffffffffff601000 - usr     4K s --x [vsyscall] [vsyscall] ; map._vsyscall_.__x
[0x7f90d46044a0]> 
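The check done by hand with dm. above can be written down as a small lookup: find the map that contains an address and test whether its name looks like libc. This is a minimal sketch over a hypothetical `Map` struct, not radare2's RDebugMap. Only the name test (`/libc-` or `/libc.`) is taken from the quoted snippet; everything else is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one line of the dm output. */
typedef struct {
    uint64_t start;     /* inclusive */
    uint64_t end;       /* exclusive */
    const char *perm;   /* e.g. "rw-" */
    const char *name;   /* e.g. "/lib64/libc.so.6" or "unk1" */
} Map;

/* Return the map containing `addr`, or NULL if no map does. */
const Map *map_containing(const Map *maps, size_t n, uint64_t addr) {
    for (size_t i = 0; i < n; i++) {
        if (addr >= maps[i].start && addr < maps[i].end) {
            return &maps[i];
        }
    }
    return NULL;
}

/* Same name test as in the quoted snippet. */
int is_libc_map(const Map *m) {
    return strstr(m->name, "/libc-") != NULL || strstr(m->name, "/libc.") != NULL;
}
```

Using the two relevant lines of the listing, the address dmha reports for main_arena (0x7f90d46044a0) resolves to unk1, which the name test does not recognize as libc. The only map that is both named libc and rw- spans 0x7f90d45fd000 to 0x7f90d45ff000, so a range taken from that map alone would not cover the reported address.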

trufae commented 2 years ago

Hey @majorendian sorry for my late reply.

So it seems like the problem is just wrong detection of the region containing glibc when looking up the global. This code has changed a couple of times, mainly because on some distros the library is placed on a different path, so it was impossible to get it 100% right. We should be adding tests for this, but that's not easy, because we would probably need chroots inside privileged containers in ghci to emulate different distros easily.

If you have some fixes for this issue, or maybe a workaround, feel free to submit a PR with the contents of your branch. Feel free to join the IRC, Discord or Telegram chats if you want a faster response. I've been busy this weekend O:)

thanks for digging in!

trufae commented 2 years ago

ping?

ghost commented 2 years ago

@trufae Sorry I have a few things going on at the moment. I might join the discord. Right now I only have a fix for the thread_arena spam.

trufae commented 1 year ago

Can you send a PR with that fix, so the changes are not lost at least? 👍

ghost commented 1 year ago

@trufae I submitted a pull request for the thread_arena spam. Sorry for my late reply.