NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Timeout doesn't work when infinite loop happens during auto-analyze #4296

Closed noamzhitomirsky closed 2 months ago

noamzhitomirsky commented 2 years ago

Describe the bug
Timeout doesn't work when an infinite loop happens during the auto-analysis of a specific binary. This happens in both headless mode and GUI mode. In headless mode, the timeout option didn't stop the process. I think the problem is in the "Non-Returning Functions - Discovered" analyzer, because when I turned it off everything worked fine. Unfortunately I can't share the problematic binary, but the timeout issue can probably be solved without it.

To Reproduce
Steps to reproduce the behaviour:

  1. Run: analyzeHeadless -import <some binary that can cause infinite loop in "Non-Returning Functions - Discovered" analyzer> -analysisTimeoutPerFile 300
  2. Wait 5 minutes and see that the analyzer is still working. It will keep running until you kill the process.
  3. The last log line is: "INFO ANALYZING all memory and code: (HeadlessAnalyzer)"

Expected behaviour
Analysis will stop at the time specified in the timeout. I understand that fixing the bug that causes the infinite loop may be complicated without the binary.

Environment (please complete the following information):
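
Until the analyzer honours the timeout, one possible stopgap is to enforce a wall-clock limit from outside the process. The sketch below is only an illustration and not part of Ghidra: `run_with_hard_timeout` is a hypothetical helper name, and it assumes that killing the launcher's whole process group is acceptable for your setup (a half-written project may be left behind).

```python
import os
import signal
import subprocess


def run_with_hard_timeout(argv, timeout_s):
    """Run argv and kill it if it exceeds timeout_s seconds.

    Returns a tuple (returncode, timed_out).
    """
    posix = os.name == "posix"
    # On POSIX, start_new_session=True gives the child its own process group,
    # so a launcher script and any JVM it spawns can be killed together.
    proc = subprocess.Popen(argv, start_new_session=posix)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        if posix:
            os.killpg(proc.pid, signal.SIGKILL)
        else:
            proc.kill()  # fallback: kills only the direct child process
        return proc.wait(), True


# Example call (the project directory, project name and binary path are placeholders):
# run_with_hard_timeout(
#     ["analyzeHeadless", "/tmp/proj", "Demo", "-import", "sample.bin",
#      "-analysisTimeoutPerFile", "300"],
#     timeout_s=360,
# )
```

Setting the outer limit a little above `-analysisTimeoutPerFile` gives the built-in timeout a chance to fire first, so the hard kill only happens when the analyzer really is stuck.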
astrelsky commented 2 years ago

In my experience you are much better off always disabling that analyzer. It does way more harm than good and usually ends up marking 90% of functions "no return". (That may be a bit exaggerated.)

emteere commented 2 years ago

If you pull the file up into the Ghidra GUI, does the same thing occur? Can you cancel the "Non-Returning Functions - Discovered" analyzer after 6 minutes in Ghidra? If you let it run, does it ever finish? If you turn the analyzer off, let it analyze, and then run the FixupNoReturnFunctions script, does it find a lot of non-returning functions? Are there other flow issues, a bad load address, import issues, or bad relocations? Is this possibly an ARM binary that uses BL, normally a call, to do long jumps?

Regardless we should fix the issue so it can cancel with a timeout. If you can get it to repeat, and then run:

As @astrelsky noted, sometimes once a non-returning function has been discovered, it may need to fix a lot of code. The analysis tries to do this earlier rather than later, so there isn't a lot of damage that takes time to fix. If there are other flow issues in the code it can get tripped up, especially on certain binaries. However, on binaries that have non-returning functions that are not known/labeled, the code will be a mess if you don't use it: code not disassembled, bad flow, bad switch recovery, one routine that falls into another until the end of the code because of exceptions, etc.

As an initial triage, if you are running on malformed binaries, especially malware binaries that are known to have bad flow, it could be turned off.
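
For headless runs, turning the analyzer off could be done from a pre-script. The following is a rough sketch of how such a command line might be assembled. `build_headless_argv` and the script name `DisableNoReturn.py` are hypothetical, and the pre-script body shown in the comment assumes the `setAnalysisOption` helper available to Ghidra scripts and the option name quoted in this issue.

```python
def build_headless_argv(launcher, project_dir, project_name, binary,
                        timeout_s, pre_script=None, script_dir=None):
    """Assemble an analyzeHeadless argument list as a list of strings."""
    argv = [launcher, project_dir, project_name, "-import", binary]
    if pre_script:
        if script_dir:
            # Directory in which the headless analyzer looks for the script.
            argv += ["-scriptPath", script_dir]
        # Pre-scripts run before auto-analysis, so they can change its options.
        argv += ["-preScript", pre_script]
    argv += ["-analysisTimeoutPerFile", str(timeout_s)]
    return argv


# Assumed content of the hypothetical pre-script DisableNoReturn.py
# (it runs inside Ghidra, not as a standalone Python program):
#
#   setAnalysisOption(currentProgram,
#                     "Non-Returning Functions - Discovered", "false")
```

The returned list can be passed directly to `subprocess.run` or a similar launcher. Keeping the timeout flag in place means the run is still bounded if some other analyzer stalls.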

emteere commented 2 years ago

Any luck duplicating the issue or trying the above suggestions? If not, we'll close the issue soon.

homes410 commented 2 years ago

@noamzhitomirsky I think you can reproduce the issue by compiling an open-source project from source with ollvm (obfuscator-llvm) flags applied. Then you can post the resulting binary as an attachment to this issue without triggering a DMCA complaint notice. If you have a working ollvm-resistant disassembler as I do, I'm sure you won't have to deal with this issue anymore.