NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
50.5k stars 5.77k forks source link

Variable increment is incorrectly decompiled #6820

Open ambergorzynski opened 3 weeks ago

ambergorzynski commented 3 weeks ago

Describe the bug There appears to be an odd problem in decompiling a variable increment. This problem only arises when the surrounding program is somewhat complex, so I assume the underlying issue is caused by something else in the program. I have attached the simplest example I could create (although it’s still a somewhat convoluted program).

The problem is that an index variable should be incremented by 1, however it is decompiled to an increment of 2. This causes the control flow of the program to be incorrect.

You can observe the incorrect variable increment on line 37 of my example; this should be incremented by 1 instead of 2. The corresponding assembly is highlighted in green using Ghidra's highlight functionality.

Screenshot 2024-08-16 at 12 04 06

To Reproduce I attach here the simplest example I could identify to illustrate the issue: prog_incorrect_increment.o.zip

  1. Import example_incorrect_increment.o to Ghidra
  2. Analyse and decompile the object file and observe the error in line 37 of the decompiled file

Additional context To illustrate the problem I also include the original program and the Ghidra decompiler output in a recompilable form here: example_incorrect_increment.zip

I partially compile program.cpp using g++ (version 11.4.0) to an x86-64 object file on a Linux machine running Ubuntu 22.04. I then use Ghidra to decompile the object file and export the decompiled output using the analyzeHeadless functionality. I make one minor edit to the program to enable it to recompile by defining __stack_chk_fail(), and save the resulting program as decompiled.c. This can be used to illustrate the issue as follows:

  1. Unzip the attached folder
  2. Run example.sh in a terminal
  3. Observe the output from the original program and compare with the output from the recompiled Ghidra output   Original output:        0, 1, 17, -1, -1, -1, -1, -1   Recompiled output: 0, 1, 17, 5, 1, 17, -1, -1

Changing line 159 in decompiled.c from iVar2 = local_10 + 2; to iVar2 = local_10 + 1 rectifies the issue and results in correct output.

Some interesting observations:

So based on this I think the underlying problem could be related to the switch statements, which are indeed pretty weirdly shaped. But it seems a bit odd that this causes a problem with a simple variable increment.

Expected behavior For prog_incorrect_increment.o, line 37 should increment by 1 instead of 2.

Environment:

LukeSerne commented 3 weeks ago

I extended the provided source code to also print the direction values it was reading, and at which index, for both the original source code and the decompiled code. This made it more obvious what's going wrong. In fact, the recompiled code is reading out-of-bounds of the directions array. It appears that this 17 is coming from the output array.

Original:
switch 1: 0 @ 0
switch 2: 5 @ 1
output: 0 1 17 -1 -1 -1 -1 -1 

Ghidra:
switch 1: 0 @ 0
switch 2: 0 @ 2
switch 1: 0 @ 2
switch 2: 17 @ 4
output: 0 1 17 5 1 17 -1 -1

It seems that the "problematic" INT_ADD(..., #0x2:4) is created by an application of the decompiler rule addmultcollapse:

-0x001000d2:196: EDX(0x001000d2:196) = EDX(0x0010005d:93) + #0x1:4
+0x001000d2:196: EDX(0x001000d2:196) = s0xfffffffffffffff0:4(0x0010003a:316) + #0x2:4

The application of this rule propagates the definition of EDX(0x0010005d:93), which is s0xfffffffffffffff0:4(0x0010003a:316) + #0x1:4. The varnode s0xfffffffffffffff0:4(0x0010003a:316) corresponds to the local_10 variable in the final C code, and to the dir_counter variable in the original C++ code. So while this + 2 might seem wrong, it's correct according to the state of the PCode at that point, since the dir_counter variable did not get updated after it was incremented in the block_1 block.

The added debug prints to the recompiled code make it clear that the index is incremented by 2 between the first and the second read from the directions array. Notice that in the original source code, there are also 2 increments happening, but the second read happens just before the second increment (so at offset +1). In the decompiled code, the second read happens just after the second increment. In other words, the problem isn't so much with the + 2, but instead, the problem is that the resulting value is used to index into the directions array to load the value that the switch statement is based on.

Looking at the final, optimised PCode by the decompiler, we see that this ordering is correct - the LOAD occurs at 0x001000e1:1cb, using EDX(0x0010005d:93) as index. This varnode in turn is defined as s0xfffffffffffffff0:4(0x0010003a:328) + #0x1:4, which basically means dir_counter + 1. Here's a snippet of the final PCode, with irrelevant parts omitted:

...
Basic Block 1 0x0010003a-0x00100071
0x0010003a:319: u0x100000af:4(0x0010003a:319) = u0x1000009f:4(0x00100034:32e) ? u0x100000a3:4(0x0010007f:32f) ? u0x100000a7:4(0x001000ac:330) ? u0x100000ab:4(0x00100108:331)
0x0010003a:316: u0x10000087:4(0x0010003a:316) = u0x10000077:4(0x00100034:324) ? u0x1000007b:4(0x0010007f:325) ? u0x1000007f:4(0x001000ac:326) ? u0x10000083:4(0x00100108:327)
0x0010003a:332: s0xfffffffffffffff4:4(0x0010003a:332) = u0x100000af:4(0x0010003a:319)
0x0010003a:328: s0xfffffffffffffff0:4(0x0010003a:328) = u0x10000087:4(0x0010003a:316)
0x00100041:55: EDX(0x00100041:55) = s0xfffffffffffffff4:4(0x0010003a:332) + #0x1:4
0x00100047:5b: RAX(0x00100047:5b) = SEXT48(s0xfffffffffffffff4:4(0x0010003a:332))
0x0010004d:31e: RAX(0x0010004d:31e) = RDI(i) + RAX(0x00100047:5b)(*#0x4)
0x00100050:8b: *(ram,RAX(0x0010004d:31e)) = #0x1:4
0x0010005d:93: EDX(0x0010005d:93) = s0xfffffffffffffff0:4(0x0010003a:328) + #0x1:4
0x00100063:99: RAX(0x00100063:99) = SEXT48(s0xfffffffffffffff0:4(0x0010003a:328))
0x00100069:31f: RAX(0x00100069:31f) = RSI(i) + RAX(0x00100063:99)(*#0x4)
0x0010006c:c8: u0x0000d680:4(0x0010006c:c8) = *(ram,RAX(0x00100069:31f))
0x0010006e:d0: ZF(0x0010006e:d0) = u0x0000d680:4(0x0010006c:c8) == #0x3:4
0x00100071:d5: goto r0x0010010b:1(free) if (ZF(0x0010006e:d0) != 0)
...
Basic Block 7 0x001000af-0x00100108
0x001000b6:158: EDX(0x001000b6:158) = s0xfffffffffffffff4:4(0x0010003a:332) + #0x2:4
0x001000bc:15e: RAX(0x001000bc:15e) = SEXT48(EDX(0x00100041:55))
0x001000c2:321: RAX(0x001000c2:321) = RDI(i) + RAX(0x001000bc:15e)(*#0x4)
0x001000c5:18e: *(ram,RAX(0x001000c2:321)) = #0x11:4
0x001000d2:196: EDX(0x001000d2:196) = s0xfffffffffffffff0:4(0x0010003a:328) + #0x2:4
0x001000d8:19c: RAX(0x001000d8:19c) = SEXT48(EDX(0x0010005d:93))
0x001000de:322: RAX(0x001000de:322) = RSI(i) + RAX(0x001000d8:19c)(*#0x4)
0x001000e1:1cb: u0x0000d680:4(0x001000e1:1cb) = *(ram,RAX(0x001000de:322))
0x00100108:327: u0x10000083:4(0x00100108:327) = EDX(0x001000d2:196)
0x00100108:32b: u0x10000093:4(0x00100108:32b) = u0x10000083:4(0x00100108:327)
0x00100108:331: u0x100000ab:4(0x00100108:331) = EDX(0x001000b6:158)
0x00100108:335: u0x100000bb:4(0x00100108:335) = u0x100000ab:4(0x00100108:331)
0x00100108:1f0: switch u0x0000d680:4(0x001000e1:1cb)
...

So, the problem does not seem to be with optimising the PCode, but instead with rendering the PCode as C. My guess is that the C printer calculates the "guard" for the switch to be param_2[iVar2] (which is correct just a few statements earlier, at the start of block_17), but because iVar2 is updated just after the start of block_17, it is no longer correct at the switch statement. This confusion probably happens because of the complex control flow before the second switch is reached.


After some playing around with the example code, I managed to get a less complicated function that still causes the same bug - the second read in the decompiled code happens at index 2 instead of 1. It seems that the second read doesn't need to happen in a switch statement. However, the switch statement cannot be changed into a small series of if-statements. The while loop and extra return cases on the switch are also necessary, to keep the control flow sufficiently complex.

void f(int * nums) {
    int i = 0;

    while (1) {
        switch (nums[i++]) {
            case 0: {
                if (nums[i++] == 0)
                    break;
            }
            case 1: return;
            case 2: return;
        }
    }
}

After compiling with gcc 11.4.0 and loading the resulting binary in ghidra, we find that the decompiled C code looks like this:


void f(int *param_1)

{
  int iVar1;
  int iVar2;
  int local_c;

  local_c = 0;
  iVar1 = local_c;
  do {
    while( true ) {
      do {
        local_c = iVar1;
        iVar1 = local_c + 1;
        iVar2 = param_1[local_c];
        if (iVar2 == 2) {
          return;
        }
      } while (2 < iVar2);
      if (iVar2 != 0) break;
      iVar1 = local_c + 2;
      if (param_1[iVar1] != 0) {
        return;
      }
    }
  } while (iVar2 != 1);
  return;
}

Here, local_c + 2 is used to index into the param_1 buffer for the second read, which is wrong. local_c + 1 should be used instead. I have attached a function debug .xml file for this function: postincrement_wrong_order.xml