GregoryMorse / GhidraDec

Ghidra Decompiler Plugin for IDA Pro
MIT License
161 stars 29 forks source link

Infinite recursion in plugin force closes IDA #5

Open biggestsonicfan opened 2 years ago

biggestsonicfan commented 2 years ago

Greetings,

As I'm still getting the hang of using IDA and Ghidra together like this, it appears that certain functions I am sending to the IDA decompiler are force closing IDA. A dumpfile of 0 bytes is created, however, but that's not of much use.

The functions can consist of a lot of calls and branches, and perhaps trying to gather all the necessary information in IDA to transfer to Ghidra is causing an issue. Loading the binary in Ghidra itself seemingly allows the decompile of the same address function to work just fine, so I'm not sure how to diagnose this issue.

GregoryMorse commented 2 years ago

There is likely a bug in the interface. The protocol is complicated enough as well as IDAs database information to make the whole thing non-trivial.

Usually the best option is to 1) Open the project with Visual Studio 2019 and make a Debug build. 2) Copy the debug build to the IDA folder. 3) Run IDA and open the database which has Ghidra configured. 4) Attach to process in Visual Studio, by choosing IDAs process idag.exe. 5) Hit F5 to run the decompilation and some exception which is not gracefully handled will come back which should give a useful stack trace and exact location in the code, and some intuition about what is happening. Some unexpected XML field from Ghidra or what have you is common.

If you have a publicly available sample to reproduce the issue and I have some time I can also look into it.

biggestsonicfan commented 2 years ago

I don't have the native IDA decompiler, so I use Ctrl+G here instead of F5.

Doing all this yielded a not so fruitful result: Unhandled exception at 0x00007FFBB83E95A0 (dbgcore.dll) in ida.exe: 0xC0000005: Access violation writing location 0x00000005FA200FD0.

I'm going to begin testing conditions with a new database built from the same binary but with no pre-existing IDA information to see if the problem is still reproducible. The rabbit hole could get quite deep if it involves something in my already existing IDA database but if the problem can't be reproduced from a clean start it will be much harder to diagnose.

EDIT: As seemingly expected, I can't reproduce the problem with a fresh analysis on the binary. Worse yet, IDA seems to crash on disassembling certain addresses where my previous version (7.2) were fine. This may take more investigation on my end until I can reproduce the issue from a fresh start. Once I can get there, I will get back to this issue. If I eventually can't, I will close this issue when it becomes stale as a non-issue.

GregoryMorse commented 2 years ago

Sorry I meant Ctrl+G. The exception is not helpful, but did you get a stacktrace? The stacktrace is the key. Its probably a bug you found, some error or protocol issue not being gracefully handled. If we knew the line of code, it would be easy to fix. It is strange that it doesn't reproduce though. Certainly the code is not perfect, but it does work well in at least a reasonably large amount of cases, which is incredibly useful from within IDA rather than having to deal with Ghidra's Java UI. If its an IDA version issue that is also something to note.

biggestsonicfan commented 2 years ago

Ah yes, terribly sorry. I was using msys2 with cmake to build (Visual Studio gives errors for some reason) and forgot the install part does not copy the .pdb and .lib files. My mistake.

The break seems to happen here: https://github.com/GregoryMorse/GhidraDec/blob/1ff85b326c624f49c30d2d037814fadb001275fd/decompiler.cpp#L827

And offset in the Autos window seems to be ridiculously high at 14757395258967641292

As I am not entirely familiar with stack trace logs in Visual Studio, a bit of googling allows me to suspect it's the copy of the displayed output in the Call Stack window: stacktrace.log

GregoryMorse commented 2 years ago

That offset is >>> hex(14757395258967641292) '0xcccccccccccccccc'

All 'c's is usually in a debug build for uninitialized memory. But its an infinite recursion that is causing the issues.

Line 900 -> 976 -> 900 -> 976 -> etc. Its a bug due to offset 0 occurring when it was chasing a pointer chain.

GregoryMorse commented 2 years ago

Line 976 should change (!checkPointer(offset, typeChain, deps)) to pass an extra parameter, bool dataChecked. Then on line 900 it should do if (!dataChecked) lookupData()...

Such an infinite recursion should not be possible. Apparently offset 0 is a valid pointer and has some type of fixup but Ive never seen or handled this case before. But regardless infinite recursion must be fixed. Thanks for the report.

biggestsonicfan commented 7 months ago

So here's the thing, offset 0 is used in my IDA disassembly because I output as an ASM file and am able to refine it a bit to something that can be reassembled with gnu's assembler. The pointer is needed to differentiate MEMA and MEMB instruction formats for the assembler, however, it isn't needed just to browse the disassembly. I can simply change the pointer to the literal value 0 and this should not cause the infinite recursion.