radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0
20.35k stars 2.97k forks source link

Wrong disassembly (because of unreachable code; x86 32bit Windows) #8462

Closed vadimkotov closed 7 years ago

vadimkotov commented 7 years ago

The result of pdf command appears to be wrong

Attached is Windows 7's 32-bit explorer.exe - a binary in which I came across this issue.

Address of the function in question: 0x1042183

I load the binary in a pretty typical way:

$ r2 explorer.exe
...
0x01030efa]> aaa
...
[0x01030efa]> s 0x1042183
[0x01042183]> pdf
...

Consider the following conditional jump found in this function at the offset 0x10422e4

0x010422e4 0f859e020000 jne 0x1042588

However the target address of 0x1042588 haven't been decoded correctly:

  |||||||   0x01042586      0180837d1001   add dword [eax + 0x1107d83], eax
| ========< 0x0104258c      0f852bfeffff   jne 0x10423bd
| |||||||   0x01042592      3905cc240b01   cmp dword [0x10b24cc], eax  ; [0x13:4]=-1 ; 19

As you can see in the snippet, it decoded an instruction at 0x1042586, which is the wrong offset.

It happened because a chunk of the function from 0x1042566 to 0x1042587 seems to be unreachable (see. screenshot taken from binary ninja).

unreachable-code-binja

From the screenshot you can see that the instruction at 0x1042561 is jmp, so the next address, which is 0x1042566 is never reached, the next reachable address is 0x1042588 (it's jumped to from 0x10422e4 as shown in the second snippet).

It should have skipped the bytes at 0x1042566 to 0x1042587 and carried on the disassembly at the correct offset - 0x1042588.

I can only conjecture that radare2 is not using traditional recursive traversal, but rather an altered version to, perhaps, save time or avoid deep recursion. Is this conjecture correct?

explorer.zip

PS I used the freshest version of radare2 updated from github

If you guys could point me to the code responsible for function analysis I might be able to fix it or at least debug the issue in more detail. Thanks!

vadimkotov commented 7 years ago

Tested with IDA, it handles this function correctly.

I do understand that sometimes the situations like this are unavoidable or way too expensive to handle properly (e.g. in obfuscated code specifically designed to confuse disassemblers), but in this case it is genuinely unreachable code.

Maijin commented 7 years ago

| pdf disassemble function

Did you try with ?:

| pdr recursive disassemble across the function graph

vadimkotov commented 7 years ago

Yep, that worked a treat. Didn't know there were multiple disassembly algorithms and that default was not recursive. Thanks. False alarm then.