NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Decompiler: Simplify comparisons between `INT_OR` and zero. #6578

Open LukeSerne opened 1 month ago

At optimisation level -O1, gcc merges several values that each need to be compared against zero: it combines them with a bitwise OR (INT_OR in p-code) and compares only the combined result against zero. With this rule, the decompiler can break these INT_OR chains apart and simplify the individual links.

Example

As an example, compile the source code below with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 (or follow along on godbolt). Make sure to pass -O1.

#include <stdio.h>

int a = 0;
int b = 0;

int main() {
    if ((a == 0) && (b == 0)) {
        printf("zero!\n");
    }
    return 0;
}

On godbolt, you can see that an or instruction is used along with only one jump instruction. If you're compiling locally, load the resulting binary in Ghidra and let auto analysis run. I tested with GitHub release 11.0.3. The decompiler gives the following output:

undefined8 main(void)

{
  if ((a | b) != 0) {
    return 0;
  }
  puts("zero!");
  return 0;
}

Here, too, the binary OR operator (|) appears. Combined with the != 0 condition, it's not obvious at first glance that this tests whether either a or b is nonzero. In this case a and b are plain variables that cannot be simplified further, but the expressions being OR-ed together might be more complex, and then this structure hinders further simplification.

We can test this PR by compiling decomp_dbg and using this XML file, which I generated with the "Debug function decompilation" menu item. Then we can use the decomp_dbg command line interface to see what the decompiled code looks like with this patch:

[decomp]> restore /path/to/int_or_zero.xml                          
/path/to/int_or_zero.xml successfully loaded: Intel/AMD 64-bit x86
[decomp]> load function main
Function main: 0x00101149
[decomp]> decompile
Decompiling main
Decompilation complete
[decomp]> print C

undefined8 main(void)

{
  if (a != 0 || b != 0) {
    return 0;
  }
  puts("zero!");
  return 0;
}