NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
51.9k stars 5.89k forks source link

Decompiler hangs in certain cases when a sub-datatype does not fit #6991

Closed RootCubed closed 1 month ago

RootCubed commented 1 month ago

Describe the bug If a structure has a sub-datatype that does not fit (this can happen when editing the sub-datatype and forgetting to update the datatype it's contained in), the decompiler gets stuck and does not decompile the function.

To Reproduce

  1. Download the reproduction sample from the attachment and import the toobig_ptrsubundo.o file into Ghidra.
  2. Edit the datatype C and:
  3. Uncheck the "pack" option.
  4. Go to the datatype A and add a new field int y.

The decompiler will now no longer be able to decompile the function.

Expected behavior The decompiler should not hang, like it used to in previous versions of Ghidra.

Screenshots image

Attachments toobig_ptrsubundo.zip

Environment (please complete the following information):

Additional context

This is a regression of 34adcff.

My explanation as to why this is happening - consider the following setup:

Structure A:
  offset 0x0: int x;
  offset 0x4: int y;

Structure B:
  offset 0x0: int z1;
  offset 0x4: int z2;

Structure C:
  offset 0x0: A a1; [Does not fit! The UI shows "TooBig: A needs 8 has 4"]
  offset 0x4: A a2; [Does not fit! The UI shows "TooBig: A needs 8 has 4"]
  offset 0x8: B b;

and assume we have a variable c that is of type C *.

Now, the optimization that splits up pointer offsets into multiple CPUI_PTRSUBs says that the offset ((void *) c) + 12) should be treated as ((void *) c) + 8) + 4) (so c->b.z2, which would be correct). However, RulePtrsubUndo then tests where the +8 should point to. In this case, TypeStruct::getFieldIter actually says that +8 should be ->a2 + 4 instead of ->b, which makes RulePtrsubUndo think that the CPUI_PTRSUB is incorrect and reverts it back. Then, the decompiler later tries to add back the CPUI_PTRSUBs and this continues back-end-forth forever.

A solution could be to correct this case in TypeStruct::getFieldIter, e.g. as follows:

diff --git a/Ghidra/Features/Decompiler/src/decompile/cpp/type.cc b/Ghidra/Features/Decompiler/src/decompile/cpp/type.cc
index d278c19dc8..e23f84a775 100644
--- a/Ghidra/Features/Decompiler/src/decompile/cpp/type.cc
+++ b/Ghidra/Features/Decompiler/src/decompile/cpp/type.cc
@@ -1583,8 +1583,11 @@ int4 TypeStruct::getFieldIter(int4 off) const
     if (curfield.offset > off)
       max = mid - 1;
     else {         // curfield.offset <= off
-      if ((curfield.offset + curfield.type->getSize()) > off)
-   return mid;
+      if ((curfield.offset + curfield.type->getSize()) > off) {
+        if (mid + 1 <= max && field[mid + 1].offset <= off)
+          return mid + 1;
+        return mid;
+      }
       min = mid + 1;
     }
   }

However, maybe it additionally makes sense to output a warning somewhere that the struct is not well-formed and could thus lead to issues (or directly throw a decompiler error).

Bixilon commented 1 month ago

Having the same issue, downgrading to 11.1 "fixes" this