radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

aaa detects far fewer functions than Ghidra #21592

Closed yuzhichang closed 1 year ago

yuzhichang commented 1 year ago

Environment

Wed Apr 12 09:45:47 PM CST 2023 radare2 5.8.5 30228 @ linux-x86-64 git.5.8.4-111-g4968d69f18 commit: 4968d69f1848c2f9f2333d649a845009b59ba2a7 build: 2023-04-10__09:53:17 Linux x86_64

Description

Ghidra detects 15429 functions for stripped ELF 1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so, see the following attached: 1.41.1-310.csv

$ cat 1.41.1-310.csv | wc -l
15430

r2 aaa detects only 6644 functions.

For example, r2 missed 0x00167ab0, which happens to be the one I'm interested in. Ghidra's FID analysis says it's grpc_chttp2_maybe_complete_recv_initial_metadata.


trufae commented 1 year ago

e anal.vars=false;e anal.hasnext=true;afr;aac is probably the best way to go. Don't run aaaa and expect it to be good or fast. It all depends on the target and use case. The default analysis could surely be better, but that requires testing and maintenance, and I don't have much time to handle more things.
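
A minimal sketch of scripting that recipe so the function counts of different analysis strategies can be compared. It assumes a caller-supplied run_cmd callable that sends one r2 command string to an open session and returns the text output (with the r2pipe Python bindings, the cmd method of an opened session should fit). The names RECIPE and count_functions are hypothetical and exist only for illustration.

```python
from typing import Callable

# Analysis recipe quoted from the comment above.
RECIPE = "e anal.vars=false;e anal.hasnext=true;afr;aac"


def count_functions(run_cmd: Callable[[str], str], recipe: str = RECIPE) -> int:
    """Run an analysis recipe, then return the count printed by `aflc`.

    `run_cmd` is an assumption of this sketch: any callable that sends a
    command string to an r2 session and returns its textual output.
    """
    run_cmd(recipe)  # analysis output (warnings etc.) is ignored here
    return int(run_cmd("aflc").strip())
```

Passing a different recipe string (for example "aaa") against a fresh session gives a second count to compare with, which is roughly what the manual sessions in this thread do by hand.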

trufae commented 1 year ago

can you share the sample?

yuzhichang commented 1 year ago

The stripped ELF 1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so is the same one I sent you (pancake@nowsecure.com) on 23 Mar.

e anal.vars=false;e anal.hasnext=true;afr;aac detects more functions, including 0x00167ab0, but still misses some that I'm interested in. Another drawback I noticed is that afr doesn't analyze function arguments. I think argument-size matching could be very helpful for zbr and zb.

zhichyu@ck98:~/grpc_whl$ r2 pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
WARN: run r2 with -e bin.cache=true to fix relocations in disassembly
 -- Use 'zoom.byte=printable' in zoom mode ('z' in Visual mode) to find strings
[0x00075100]> e anal.vars=false;e anal.hasnext=true;afr;aac
WARN: Cannot find basic block for switch case at 0x00405d8f bbdelta = 21
WARN: Cannot find basic block for switch case at 0x004054b4 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x004051c4 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x004049b1 bbdelta = 23
WARN: Cannot find basic block for switch case at 0x005aacd2 bbdelta = 20
WARN: Cannot find basic block for switch case at 0x00176d47 bbdelta = 71
WARN: Cannot find basic block for switch case at 0x001790c6 bbdelta = 35
WARN: Cannot find basic block for switch case at 0x005be384 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x005c1f6e bbdelta = 33
[0x00075100]> aflc
14376
[0x00075100]> afi 0x00167ab0
#
offset: 0x00167ab0
name: fcn.00167ab0
size: 139
is-pure: false
realsz: 128
stackframe: 24
call-convention: amd64
cyclomatic-cost: 46
cyclomatic-complexity: 6
bits: 64
type: fcn [NEW]
num-bbs: 8
num-instrs: 29
edges: 10
minbound: 0x00167ab0
maxbound: 0x00167b3b
is-lineal: false
end-bbs: 2
call-refs: 0x00167b10 J 0x00167b06 J 0x00167b18 J 0x0017d380 C 0x002297d0 C 0x002ac350 C 0x00167ad5 J 0x002ac350 C 0x00167ad5 J
data-refs: 0x00000660 0x00000790 0x000007a0
noreturn: false
in-degree: 0
out-degree: 4
data-xrefs:
locals: 0
args: 0
diff: type: new
[0x00075100]> aaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze function calls (aac)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Type matching analysis for all functions (aaft)
INFO: Propagate noreturn information (aanr)
INFO: Use -AA or aaaa to perform additional experimental analysis
[0x00075100]> afi 0x00178c90
[0x00075100]> aflc
14934

Ghidra says 0x00178c90 is grpc_chttp2_header_parser_parse.


trufae commented 1 year ago

anal.vars is the option in charge of analyzing the function arguments too, so skip e anal.vars=false if you need them.

The warnings show some incorrect cases of jump table analysis. If this function is referenced from a call, aac should be finding it. There are some conservative checks too, but I'll try when I get the file.

You can zip it and attach it to the issue, send it to my personal email (pancake@nopcode.org), or upload it somewhere and send me the link via Telegram, Matrix, Discord, etc.

yuzhichang commented 1 year ago

@trufae I just sent the stripped ELF from yuzhichang@gmail.com to pancake@nopcode.org.

trufae commented 1 year ago

received now

trufae commented 1 year ago

It looks like there is no code at 0x00167ab0. I've loaded the same file in Ghidra and it says the same as r2: this offset is not part of any function. Also, Ghidra loads the binary at a different base address, so if you run r2 -B 0x100000 ... the offsets in Ghidra and r2 are the same. So that works fine.
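
The base-address mismatch described above comes down to simple arithmetic. The following is a rough sketch, assuming r2 mapped the file at base 0 by default (which the need for -B 0x100000 suggests) and that Ghidra used 0x100000. The helper name rebase and both constants are illustrative, not part of either tool.

```python
GHIDRA_BASE = 0x100000  # base Ghidra used for this file, per the comment above
R2_BASE = 0x0           # assumption: r2 loaded the file at base 0 by default


def rebase(addr: int, from_base: int, to_base: int) -> int:
    """Translate an address between two load bases of the same file."""
    offset = addr - from_base
    if offset < 0:
        raise ValueError(f"{addr:#x} lies below the source base {from_base:#x}")
    return to_base + offset


# An address seen in a base-0 r2 session, expressed in Ghidra's address space.
print(hex(rebase(0x167AB0, R2_BASE, GHIDRA_BASE)))
```

Under these assumptions, an address copied from one tool has to be shifted by 0x100000 before it is looked up in the other. Starting r2 with -B 0x100000 removes the need for the conversion.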


The function you are looking for, grpc_chttp2_header_parser_parse, is at 0x271590. If I run aaa in r2 without any special flags, the function is analyzed and found the same way it is shown in Ghidra. So my guess is that you are analyzing a different file in Ghidra.

trufae commented 1 year ago

[0x00271590]> aflc
14934
[0x00271590]> e anal.hasnext
false
[0x00271590]>

yuzhichang commented 1 year ago

The mail title is "stripped ELF of #21592". Here's the file size and checksum:

zhichyu@ck98:~/grpc_whl$ ls -l pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
-rwxr-xr-x 1 zhichyu eoi 8554128 Mar 17 13:13 pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
zhichyu@ck98:~/grpc_whl$ md5sum pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
2d2164047fdf0e73b272bc37cd10dcbc  pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
zhichyu@ck98:~/grpc_whl$ r2 pypi/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so
WARN: run r2 with -e bin.cache=true to fix relocations in disassembly
 -- Step through your seek history with the commands 'u' (undo) and 'U' (redo)
[0x00075100]> e anal.hasnext=true;afr;aac
WARN: Cannot find basic block for switch case at 0x00405d8f bbdelta = 21
WARN: Cannot find basic block for switch case at 0x004054b4 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x004051c4 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x004049b1 bbdelta = 23
WARN: Cannot find basic block for switch case at 0x005aacd2 bbdelta = 20
WARN: Cannot find basic block for switch case at 0x00176d47 bbdelta = 71
WARN: Cannot find basic block for switch case at 0x001790c6 bbdelta = 35
WARN: Cannot find basic block for switch case at 0x005be384 bbdelta = 28
WARN: Cannot find basic block for switch case at 0x005c1f6e bbdelta = 33
[0x00075100]> aflc
14376
[0x00075100]> afi 0x00167ab0
#
offset: 0x00167ab0
name: fcn.00167ab0
size: 139
is-pure: false
realsz: 128
stackframe: 24
call-convention: amd64
cyclomatic-cost: 46
cyclomatic-complexity: 6
bits: 64
type: fcn [NEW]
num-bbs: 8
num-instrs: 29
edges: 10
minbound: 0x00167ab0
maxbound: 0x00167b3b
is-lineal: false
end-bbs: 2
call-refs: 0x0017d380 C 0x002297d0 C 0x002ac350 C 0x002ac350 C 0x00167ad5 J
data-refs:
noreturn: false
in-degree: 0
out-degree: 4
data-xrefs:
locals: 1
args: 0
var int64_t var_fh @ rsp+0xf
diff: type: new

yuzhichang commented 1 year ago

@trufae Any update on this?

trufae commented 1 year ago

No updates. I just pulled the binary from the mail (again) and confirmed the same answers I gave you before:

I wonder what the actual concern is now. Do you want to know why the results differ, which functions r2 is missing, or which functions Ghidra is wrongly identifying as functions?

Giving a more detailed answer requires reviewing every single function. r2 already complains about some jump tables with unlinked basic blocks, but I haven't gone through Ghidra's results or compared every single function to see what the difference is.
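
One way to start that function-by-function comparison is to diff the two address lists. This is only a sketch under stated assumptions: each tool's function start addresses have already been exported as one hex address per line (r2's aflq output is one possible source, and the column layout of the Ghidra CSV is not known here), and the Ghidra addresses use a base of 0x100000. The names parse_addresses and compare are hypothetical.

```python
from typing import Dict, Iterable, Set


def parse_addresses(lines: Iterable[str], base: int = 0) -> Set[int]:
    """Parse one hex address per line and normalise it to a base of 0."""
    addrs = set()
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines
            addrs.add(int(text, 16) - base)
    return addrs


def compare(r2_addrs: Set[int], ghidra_addrs: Set[int]) -> Dict[str, object]:
    """Split function starts into r2-only, Ghidra-only and shared."""
    return {
        "only_r2": sorted(r2_addrs - ghidra_addrs),
        "only_ghidra": sorted(ghidra_addrs - r2_addrs),
        "both": len(r2_addrs & ghidra_addrs),
    }


# Made-up sample data, not taken from the binary discussed in this issue.
r2_list = ["0x00001000", "0x00002000", "0x00003000"]
ghidra_list = ["0x00101000", "0x00102000", "0x00104000"]
report = compare(parse_addresses(r2_list),
                 parse_addresses(ghidra_list, base=0x100000))
print({k: [hex(a) for a in v] if isinstance(v, list) else v
       for k, v in report.items()})
```

The only_ghidra list would be the candidates for functions r2 misses, and only_r2 the candidates for entries that r2 split off (for example, jump-table targets treated as separate functions). Either list still needs manual review.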

So I would say that:

yuzhichang commented 1 year ago

Thanks for clarifying why Ghidra and r2 differ. My last concern is that r2 already complains about some jump tables with unlinked basic blocks. Could r2 miss some functions because of these warnings?

trufae commented 1 year ago

It means that r2 is identifying fewer branch destinations in a jump table. Those destinations are then treated as separate functions instead of basic blocks of the same function, so the resulting function count may be larger.

trufae commented 1 year ago

I consider this issue fixed. Is it OK to close it? You can keep asking if you have questions.

yuzhichang commented 1 year ago

Thanks for answering.