rizinorg / rizin

UNIX-like reverse engineering framework and command-line toolset.
https://rizin.re
GNU Lesser General Public License v3.0

DWARF: failure to parse ARM structure types #3929

Closed XVilka closed 12 months ago

XVilka commented 1 year ago
rizin -e asm.cpu=cortexm test_app2.elf
[0x00008080]> aaa
[0x00008080]> afv @ dbg.main
var int16_t var_38h @ stack - 0x38
var int16_t var_34h @ stack - 0x34
var int16_t var_30h @ stack - 0x30
var int16_t var_2ch @ stack - 0x2c
var int16_t var_1ch @ stack - 0x1c
var unknown_t *s @ stack - 0x1a
arg int argc @ r0
arg char **argv @ r1
var unknown_t *gg @ r2
var float a @ ...
var float b @ ...
var double c @ ...
[0x000080f4]> afv @ dbg.fn1
arg int a @ r0
arg char *g @ r1
arg unknown_t *gg @ r2
arg unknown_t **out @ r3
arg double q @ composite: [(.0, 32): s0, (.0, 32): s1]

Here gg is typed as unknown_t but should be struct Some; the same applies to out. See the source file for details.

See also the new_some function:

[0x0000813c]> pdf @ dbg.new_some
            ; CALL XREFS from dbg.main @ 0x817c, 0x8184
            ;-- new_some:
╭ dbg.new_some();
│           ; var unknown_t *n @ r4
│           0x0000813c      10b5           push  {r4, lr}
│           0x0000813e      1420           movs  r0, 0x14              ; size_t size
│           0x00008140      00f084f8       bl    malloc                ; sym.malloc ; void *malloc(size_t size)
│           0x00008144      0446           mov   r4, r0
│           0x00008146      1422           movs  r2, 0x14              ; size_t n
│           0x00008148      0021           movs  r1, 0                 ; int c
│           0x0000814a      00f087f8       bl    memset                ; sym.memset ; void *memset(void *s, int c, size_t n)
│           0x0000814e      2046           mov   r0, r4
╰           0x00008150      10bd           pop   {r4, pc}
rizin -e asm.cpu=cortexm test_app2.elf
[0x00008080]> aaa
[0x00008410]> pdf @ sym.strncpy
            ; CALL XREF from dbg.fn1 @ 0x8118
╭ char *strncpy(char *dest, const char *src, size_t  n);
│           ; arg char *dest @ r0
│           0x00008410      10b5           push  {r4, lr}
│           0x00008412      0139           subs  r1, 1
│           0x00008414      0346           mov   r3, r0                ; dest
│      ╭╭─> 0x00008416      32b1           cbz   r2, 0x8426
│      │╎   0x00008418      11f8014f       ldrb  r4, [r1, 1]!
│      │╎   0x0000841c      03f8014b       strb  r4, [r3], 1
│      │╎   0x00008420      013a           subs  r2, 1
│      │╎   0x00008422      002c           cmp   r4, 0
│      │╰─< 0x00008424      f7d1           bne   0x8416
│      ╰──> 0x00008426      1a44           add   r2, r3
│           0x00008428      0021           movs  r1, 0
│           ; CODE XREF from sym.strncpy @ 0x8434
│       ╭─> 0x0000842a      9342           cmp   r3, r2
│      ╭──< 0x0000842c      00d1           bne   0x8430
│      │╎   0x0000842e      10bd           pop   {r4, pc}
│      ╰──> 0x00008430      03f8011b       strb  r1, [r3], 1
╰       ╰─< 0x00008434      f9e7           b     0x842a

It looks like the src and n arguments are missing from the afv output; only dest is listed.

rizin -e asm.cpu=cortexm test_app2.elf
[0x00008080]> aaa
[0x00008080]> pdf @ sym.__do_global_dtors_aux
zsh: segmentation fault  rizin -e asm.cpu=cortexm test_app2.elf
rizin -e asm.cpu=cortexm test_app2.elf
[0x00008080]> aaa
[0x00008080]> aaa
[rizin(10895,0x1dff6d300) malloc: Heap corruption detected, free list is damaged at 0x6000009ba9d0
*** Incorrect guard value: 0
rizin(10895,0x1dff6d300) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort      rizin -e asm.cpu=cortexm test_app2.elf

I recommend creating tests for these cases, especially for the types. You can use this file in the testbins:

test_app2.zip
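
As a starting point for such a type check, here is a minimal sketch that scans afv output for variables still typed with the unknown_t placeholder. It is a standalone illustration, not part of rizin's test suite: the helper name unresolved_vars and the regular expression are assumptions made for this example, and the sample input is the `afv @ dbg.fn1` output quoted above.

```python
import re

# One line of `afv` output, e.g. "arg unknown_t **out @ r3".
# Group "decl" is the type plus name, group "loc" is the storage location.
AFV_LINE = re.compile(r"^(?:var|arg)\s+(?P<decl>.+?)\s+@\s+(?P<loc>.+)$")


def unresolved_vars(afv_output):
    """Return the names of variables whose type is the placeholder unknown_t.

    Hypothetical helper for illustration; it only inspects text.
    """
    names = []
    for line in afv_output.splitlines():
        match = AFV_LINE.match(line.strip())
        if match is None:
            continue  # prompts, blank lines, anything that is not a variable
        # The last token of the declaration is the (possibly starred) name,
        # everything before it is the type.
        *type_tokens, name = match.group("decl").split()
        if "unknown_t" in type_tokens:
            names.append(name.lstrip("*"))
    return names


# Output of `afv @ dbg.fn1` as quoted in this issue.
FN1_AFV = """\
arg int a @ r0
arg char *g @ r1
arg unknown_t *gg @ r2
arg unknown_t **out @ r3
arg double q @ composite: [(.0, 32): s0, (.0, 32): s1]
"""

print(unresolved_vars(FN1_AFV))  # ['gg', 'out']
```

A regression test could then assert that this list is empty for dbg.fn1 once the struct types are resolved.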

imbillow commented 1 year ago

In DWARF there is only a declaration for strncpy, so in rizin it is named sym.strncpy, which means that it does not come from debug information.

The same goes for sym.__do_global_dtors_aux.

imbillow commented 1 year ago

Another point of confusion is that the command apparently behaves correctly when run from a script.

test.rz:

e asm.cpu=cortexm
o test_app2.elf
aaa
pdf @ dbg.new_some
rizin -qi test.rz
ERROR: Cannot determine entrypoint, using 0x00008080.
            ; CALL XREFS from dbg.main @ 0x817c, 0x8184
            ;-- new_some:
┌ some_t * new_some()
│           ; var struct Some *n @ r4
│           0x0000813c      push  {r4, lr}                             ; some_t * new_some()

But when run interactively, the wrong behavior occurs:

> rizin -e asm.cpu=cortexm test_app2.elf
ERROR: Cannot determine entrypoint, using 0x00008080.
 -- Change the registers of the child process in this way: 'dr eax=0x333'
[0x00008080]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls
[x] find and analyze function preludes
[x] Analyze len bytes of instructions for references
[x] Check for classes
[x] Finding xrefs in noncode section with analysis.in=io.maps
[x] Analyze value pointers (aav)
[x] Value from 0x00000000 to 0x00009780 (aav)
[x] 0x00000000-0x00009780 in 0x0-0x9780 (aav)
[x] Emulate functions to find computed references
[x] Analyze local variables and arguments
[x] Type matching analysis for all functions
[x] Applied 0 FLIRT signatures via sigdb
[x] Propagate noreturn information
[x] Integrate dwarf function information.
[x] Resolve pointers to data sections
[x] Use -AA or aaaa to perform additional experimental analysis.
[0x00008080]> afv @ dbg.main
var int16_t var_38h @ stack - 0x38
var int16_t var_34h @ stack - 0x34
var int16_t var_30h @ stack - 0x30
var int16_t var_2ch @ stack - 0x2c
var int16_t var_1ch @ stack - 0x1c
var unknown_t *s @ stack - 0x1a
arg int argc @ r0
arg char **argv @ r1
var unknown_t *gg @ r2
var float a @ ...
var float b @ ...
var double c @ ...
[0x00008080]> pdf @ dbg.new_some
            ; CALL XREFS from dbg.main @ 0x817c, 0x8184
            ;-- new_some:
┌ dbg.new_some();
│           ; var unknown_t *n @ r4
│           0x0000813c      push  {r4, lr}
XVilka commented 1 year ago

@imbillow could be some memory corruption or uninitialized variable?

imbillow commented 1 year ago

> @imbillow could be some memory corruption or uninitialized variable?

But ASan didn't detect it.

XVilka commented 1 year ago

Reopening because of the pdf @ sym.__do_global_dtors_aux crash.

imbillow commented 1 year ago

@XVilka I'm running

e asm.cpu=cortexm
aaa
pdf @ sym.__do_global_dtors_aux
aaa

and it's not crashing. What specific commands are you running?

XVilka commented 1 year ago

> @XVilka I'm running
>
> e asm.cpu=cortexm
> aaa
> pdf @ sym.__do_global_dtors_aux
> aaa
>
> and it's not crashing. What specific commands are you running?

This is what I get in LLDB:

ℤ lldb rizin
(lldb) target create "rizin"
Current executable set to 'rizin' (arm64).
(lldb) run test_app2.elf
Process 29218 launched: '/Users/anton.kochkov/.local/bin/rizin' (arm64)
ERROR: Cannot determine entrypoint, using 0x00008080.
 -- Use rz-run to launch your programs with a predefined environment.
[0x00008080]> e asm.cpu=cortexm
[0x00008080]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls
[x] find and analyze function preludes
[x] Analyze len bytes of instructions for references
[x] Check for classes
[x] Finding xrefs in noncode section with analysis.in=io.maps
[x] Analyze value pointers (aav)
[x] Value from 0x00000000 to 0x00009780 (aav)
[x] 0x00000000-0x00009780 in 0x0-0x9780 (aav)
[x] Emulate functions to find computed references
[x] Analyze local variables and arguments
[x] Type matching analysis for all functions
[x] Applied 0 FLIRT signatures via sigdb
[x] Propagate noreturn information
[x] Integrate dwarf function information.
[x] Resolve pointers to data sections
[x] Use -AA or aaaa to perform additional experimental analysis.
[0x00008080]> pdf @ sym.__do_global_dtors_aux
Process 29218 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000188cd2dc4 libsystem_platform.dylib`_platform_strlen + 4
libsystem_platform.dylib`:
->  0x188cd2dc4 <+4>:  ldr    q0, [x1]
    0x188cd2dc8 <+8>:  adr    x3, #-0xc8                ; ___lldb_unnamed_symbol282
    0x188cd2dcc <+12>: ldr    q2, [x3], #0x10
    0x188cd2dd0 <+16>: and    x2, x0, #0xf
Target 0: (rizin) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000188cd2dc4 libsystem_platform.dylib`_platform_strlen + 4
    frame #1: 0x0000000188b41ca4 libsystem_c.dylib`strdup + 28
    frame #2: 0x00000001007b15d0 librz_parse.0.7.dylib`rz_parse_filter at filter.c:313:18 [opt]
    frame #3: 0x00000001007b1444 librz_parse.0.7.dylib`rz_parse_filter(p=0x0000600001260100, addr=<unavailable>, f=<unavailable>, hint=0x0000600000e78000, data="\U0000001b[38;2;229;233;240mldr\U0000001b[0m\U0000001b[38;2;145;177;208m   \U0000001b[0m\U0000001b[38;2;104;224;226mr0\U0000001b[0m\U0000001b[38;2;145;177;208m, [\U0000001b[0m\U0000001b[38;2;163;190;151m", str="\U0000001b[38;2;136;208;240mcbz\U0000001b[0m\U0000001b[38;2;145;177;208m   \U0000001b[0m\U0000001b[38;2;104;224;226mr3\U0000001b[0m\U0000001b[38;2;145;177;208m, \U0000001b[0m\U0000001b[38;2;163;190;151m0x804a\U0000001b[0m", len=1024, big_endian=<unavailable>) at filter.c:590:2 [opt]
    frame #4: 0x000000010243534c librz_core.0.7.dylib`ds_build_op_str(ds=0x000000011e025200, print_color=<unavailable>) at disasm.c:1045:3 [opt]
    frame #5: 0x0000000102430f28 librz_core.0.7.dylib`rz_core_print_disasm(core=<unavailable>, addr=32824, buf=<unavailable>, len=24, nlines=<unavailable>, state=0x000000016fdfe530, options=<unavailable>) at disasm.c:5480:4 [opt]
    frame #6: 0x000000010248462c librz_core.0.7.dylib`rz_cmd_disassembly_function_handler(core=0x000000011d809a00, argc=<unavailable>, argv=<unavailable>, state=0x000000016fdfe530) at cmd_print.c:3989:21 [opt]
    frame #7: 0x00000001024a3324 librz_core.0.7.dylib`rz_cmd_call_parsed_args at cmd_api.c:766:21 [opt]
    frame #8: 0x00000001024a32e0 librz_core.0.7.dylib`rz_cmd_call_parsed_args [inlined] call_cd(cmd=<unavailable>, cd=0x0000600001748930, args=0x0000600003cec280) at cmd_api.c:803:10 [opt]
    frame #9: 0x00000001024a32e0 librz_core.0.7.dylib`rz_cmd_call_parsed_args(cmd=<unavailable>, args=0x0000600003cec280) at cmd_api.c:821:9 [opt]
    frame #10: 0x0000000102491af0 librz_core.0.7.dylib`handle_ts_arged_stmt [inlined] handle_ts_arged_stmt_internal(state=0x000000016fdfea08, node=TSNode @ 0x000000016fdfe5c0, node_string=<unavailable>) at cmd.c:3693:8 [opt]
    frame #11: 0x000000010249176c librz_core.0.7.dylib`handle_ts_arged_stmt(state=0x000000016fdfea08, node=<unavailable>) at cmd.c:3640:1 [opt]
    frame #12: 0x000000010249dbe0 librz_core.0.7.dylib`handle_ts_stmt(state=0x000000016fdfea08, node=TSNode @ 0x000000016fdfe7f0) at cmd.c:5183:9 [opt]
    frame #13: 0x0000000102492f8c librz_core.0.7.dylib`handle_ts_tmp_seek_stmt [inlined] handle_ts_stmt_tmpseek(state=0x000000016fdfea08, node=<unavailable>) at cmd.c:5200:20 [opt]
    frame #14: 0x0000000102492f6c librz_core.0.7.dylib`handle_ts_tmp_seek_stmt [inlined] handle_ts_tmp_seek_stmt_internal(state=0x000000016fdfea08, node=TSNode @ 0x000000016fdfe770, node_string=<unavailable>) at cmd.c:4037:20 [opt]
    frame #15: 0x0000000102492e80 librz_core.0.7.dylib`handle_ts_tmp_seek_stmt(state=0x000000016fdfea08, node=<unavailable>) at cmd.c:4019:1 [opt]
    frame #16: 0x000000010249dbe0 librz_core.0.7.dylib`handle_ts_stmt(state=0x000000016fdfea08, node=TSNode @ 0x000000016fdfe970) at cmd.c:5183:9 [opt]
    frame #17: 0x000000010249155c librz_core.0.7.dylib`handle_ts_statements at cmd.c:5240:25 [opt]
    frame #18: 0x0000000102491464 librz_core.0.7.dylib`handle_ts_statements(state=0x000000016fdfea08, node=TSNode @ 0x000000016fdfead0) at cmd.c:5205:1 [opt]
    frame #19: 0x0000000102497ff8 librz_core.0.7.dylib`core_cmd_tsrzcmd(core=0x000000011d809a00, cstr=<unavailable>, split_lines=<unavailable>, log=<unavailable>) at cmd.c:5351:9 [opt]
    frame #20: 0x000000010245aea4 librz_core.0.7.dylib`rz_core_cmd(core=<unavailable>, cstr=<unavailable>, log=1) at cmd.c:5399:27 [opt]
    frame #21: 0x0000000102423d68 librz_core.0.7.dylib`rz_core_prompt_loop [inlined] rz_core_prompt_exec(r=0x000000011d809a00) at core.c:2770:12 [opt]
    frame #22: 0x0000000102423d58 librz_core.0.7.dylib`rz_core_prompt_loop(r=0x000000011d809a00) at core.c:2640:14 [opt]
    frame #23: 0x000000010049fd30 librz_main.0.7.dylib`rz_main_rizin(argc=<unavailable>, argv=0x000000016fdff018) at rizin.c:1437:3 [opt]
    frame #24: 0x000000018892d058 dyld`start + 2224
(lldb)