NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

XOR-related optimizations cause errors in calling convention recovery #6723

Open Muqi-Zou opened 4 months ago

Muqi-Zou commented 4 months ago

I decompiled two builds of the same program, one compiled with "O2" and the other with "O2 -fno-peephole2". In the function evhttp_read_body, the second parameter of the external call to evbuffer_drain is missing in the "O2" binary. In the screenshot, the left side is the Ghidra output for the "O2 -fno-peephole2" binary and the right side is the output for the "O2" binary. [image]

After patching the differing instruction from "XOR esi, esi" (as in the "O2" binary) to "mov esi, 0x0" (as in the "O2 -fno-peephole2" binary), the call to evbuffer_drain is recovered correctly. [image]

I looked into the Ghidra source code and found that the root cause is related to ActionActiveParam. Specifically, when the instruction "XOR esi, esi" is present, the data-flow analysis function ancestorOpUse reaches the external call to evbuffer_readln (address 0x105e0a). You can check how nodes are added to varlist with emplace_back in onlyOpUse: the traversal stops at a varnode that is defined at 0x105e0a by a CPUI_INDIRECT op.
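To illustrate the data-flow argument, here is a minimal toy model in Python. It is not Ghidra's code: the `Op` class, the `trace_hits_indirect` function, and the value names `esi_0`, `esi_1`, `esi_2` are all hypothetical, and the assumption is that the call at 0x105e0a redefines ESI through an INDIRECT op. The sketch only shows why a backward walk from an "XOR esi, esi" result reaches that INDIRECT while a walk from "mov esi, 0x0" does not.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for a defining operation (not a Ghidra class)."""
    opcode: str               # e.g. "COPY", "INT_XOR", "INDIRECT"
    inputs: Tuple[str, ...]   # names of the values this op reads


def trace_hits_indirect(defs: Dict[str, Op], start: str) -> bool:
    """Walk definitions backward from `start` and report whether any
    ancestor is produced by an INDIRECT op (a possible call side effect)."""
    stack, seen = [start], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        op = defs.get(name)
        if op is None:        # constant or function input: nothing to follow
            continue
        if op.opcode == "INDIRECT":
            return True
        stack.extend(op.inputs)
    return False


# Assumed shape of the "O2" case: ESI after the call is an INDIRECT value,
# and "XOR esi, esi" reads it twice.
xor_defs = {
    "esi_1": Op("INDIRECT", ("esi_0",)),
    "esi_2": Op("INT_XOR", ("esi_1", "esi_1")),
}

# Assumed shape of the "O2 -fno-peephole2" case: "mov esi, 0x0" reads a constant.
mov_defs = {
    "esi_1": Op("INDIRECT", ("esi_0",)),
    "esi_2": Op("COPY", ("#0",)),
}

print(trace_hits_indirect(xor_defs, "esi_2"))  # True
print(trace_hits_indirect(mov_defs, "esi_2"))  # False
```

In this model the XOR result is always zero, yet it still carries a syntactic dependency on the value left in ESI by the earlier call, which is what misleads the trace.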

This error reminds me of a bug I reported previously, #6648. Both occur because Ghidra does not properly handle the XOR instruction produced by the -fpeephole2 optimization. An easy fix for both cases would be to add an extra RuleTrivialArith before ActionActiveParam. However, this fix may not be general enough. In any case, I think these two bugs are good examples showing that Ghidra currently cannot properly handle these XOR instructions.
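The idea behind the proposed fix can be sketched as a rewrite pass that runs before any parameter analysis. This is an assumption-laden illustration in Python, not Ghidra's RuleTrivialArith: the tuple encoding of ops and the function name `simplify_trivial_xor` are hypothetical. It replaces an XOR of a value with itself by a copy of the constant zero, which removes the dependency on the earlier definition.

```python
from typing import Dict, Tuple

# Hypothetical encoding: value name -> (opcode, input names).
OpDef = Tuple[str, Tuple[str, ...]]


def simplify_trivial_xor(defs: Dict[str, OpDef]) -> Dict[str, OpDef]:
    """Return a copy of `defs` where every `v XOR v` becomes `COPY #0`.
    All other definitions are left unchanged."""
    result: Dict[str, OpDef] = {}
    for name, (opcode, inputs) in defs.items():
        if opcode == "INT_XOR" and len(inputs) == 2 and inputs[0] == inputs[1]:
            # v ^ v is always 0, so the result no longer depends on v.
            result[name] = ("COPY", ("#0",))
        else:
            result[name] = (opcode, inputs)
    return result


before = {
    "esi_1": ("INDIRECT", ("esi_0",)),          # ESI as left by an earlier call
    "esi_2": ("INT_XOR", ("esi_1", "esi_1")),   # XOR esi, esi
    "eax_1": ("INT_XOR", ("eax_0", "ecx_0")),   # a real XOR, must be kept
}
after = simplify_trivial_xor(before)
print(after["esi_2"])
```

After the rewrite, `esi_2` is defined the same way as it would be for "mov esi, 0x0", so a later backward trace has no path to the INDIRECT value. Whether running such a rule earlier is safe for all cases in Ghidra is exactly the open question raised above.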

Muqi-Zou commented 4 months ago

debug.zip The binaries are included. Note that the file named O0 is the binary compiled with "O2 -fno-peephole2".