NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
52.26k stars 5.92k forks source link

Don't change conditions to equivalent expressions during decompilation #558

Open pglizniewicz opened 5 years ago

pglizniewicz commented 5 years ago

Changing conditions to equivalent ones generates a decompiler output which is harder to understand. For example this:

                         LAB_6000207c
    6000207c 83 f9 08        CMP        ECX,0x8
    6000207f 7f 0f           JG         LAB_60002090
    60002081 83 fa 08        CMP        EDX,0x8
    60002084 7e 05           JLE        LAB_6000208b
    60002086 b8 10 27        MOV        result,0x2710
             00 00
                         LAB_6000208b
    6000208b 83 f9 08        CMP        ECX,0x8
    6000208e 7e 08           JLE        LAB_60002098
                         LAB_60002090
    60002090 83 fa 08        CMP        EDX,0x8
    60002093 7f 03           JG         LAB_60002098
    60002095 c1 e0 02        SHL        result,0x2

was decompiled to:

if (iVar1 < 9) {
  if (8 < bpp) {
    result = 10000;
  }
  if (iVar1 < 9) {
    return result;
  }
}
if (bpp < 9) {
  result = result << 2;
}

}

In assembly, the variable was compared to the constant using less-equal, but in the decompile it's compared to the constant+1 using less. It's correct, but the resulting decompile is harder to understand. This code dealt with setting bit-depth of the screen and while a 8-bit bit-depth was meaningful, 9-bit bit-depth feels out of place in this context.

Ghidra should either use the same operation as in assembly or allow us to choose it via something like the "Convert" option for constants in the "Listing" view.

jtwine commented 5 years ago

As another point on clarity/consistency, similar compare instructions decompile into expressions with differing ordering of their operands. In the above example, the first and last CMP EDX have swapped operands in the decompile - one puts the constant on the left of the expression, the other on the right.

There are some that believe any constant of an "if" expression should be on the left, as it helps catch accidental use of assignment instead of equality. But either way, the decompiled output above is inconsistent.

(Figured it might be better to camp onto this bug instead of essentially duplicating most of it.)

aamcintosh commented 5 years ago

Here is a similar issue. Consider the following: typedef struct { unsigned int n; unsigned int **r; }S; extern void init(unsigned int *x); void foo(S *p) { int i; for(i=0; i<p->n; i++) init(p->r[i]); } On Ubuntu 16.04 X86 64 bit, gcc 5.4 -O3 turns the loop into (this from objdump): 18: 49 8b 44 24 08 mov 0x8(%r12),%rax 1d: 83 c3 01 add $0x1,%ebx 20: 48 8b 3c 28 mov (%rax,%rbp,1),%rdi 24: 48 83 c5 08 add $0x8,%rbp 28: e8 00 00 00 00 callq 2d <foo+0x2d> 2d: 41 39 1c 24 cmp %ebx,(%r12) 31: 77 e5 ja 18 <foo+0x18> (The compiler turns the loop into if(p->n!=0)do{...}while(i<p->n);) The Ghidra 9.0.1 decompiler turns the condition of the test/branch at the bottom of the loop into uVar2 <= *puParm1 && *puParm1 != uVar2 (I've dropped the casts.) This amounts to turning the loop condition into i<=p->n && p->n != i. (Here uVar2 is i and puParam1 is p.)

I had to stare at this for a while to convince myself that it was correct.

MrSapps commented 5 years ago

I think a better way to handle this is to allow interactively inverting the if conditions.

MrSapps commented 5 years ago
There are some that believe any constant of an "if" expression should be on the left, as it helps catch accidental use of assignment instead of equality. But either way, the decompiled output above is inconsistent.

I think this should never be done, any modern compiler flags this as an error. Yoda conditions are less readable. Also since this is decompiler output it won't get it wrong anyway...

gyohng commented 6 months ago

I agree, please change -1 < voltage into voltage > -1

tahg commented 2 months ago

I agree that this is still an issue. (Ghidra 11.1)