redballoonsecurity / ofrak

OFRAK: unpack, modify, and repack binaries.
https://ofrak.com

angr disassembly backend creates different Basic Blocks than other backends #308

Open EdwardLarson opened 1 year ago

EdwardLarson commented 1 year ago

What is the problem? (Here is where you provide a complete Traceback.)

angr's CFG differs slightly from the CFGs of other backends, because it treats a function call as the end of a basic block: https://docs.angr.io/en/latest/appendix/faq.html#why-is-angr-s-cfg-different-from-ida-s

(Regarding the second bullet point at that link: OFRAK does pass the normalize=True option to angr by default.)

The link mentions IDA specifically, but Ghidra and Binary Ninja also don't break basic blocks at function calls. The result is that angr sometimes unpacks different basic blocks for a function, compared to the ofrak_ghidra or ofrak_binary_ninja backends.

Please provide some information about your environment. At minimum we would like the following information on your platform and Python environment:

Happens on any environment.

If you've discovered it, what is the root cause of the problem?

angr's analysis ends a basic block at each function call, while other backends typically do not.

How often does the issue happen?

What are the steps to reproduce the issue? Ideally, give us a short script that reproduces the issue.

How would you implement this fix?

The AngrComplexBlockUnpacker could check whether each basic block it gets from angr ends with a function call and, if so, combine that basic block with the next one. One complication (there may be others) is that in some cases there really are two basic blocks as other backends would define them, and the first one just happens to end with a call instruction. That is, another basic block might jump to the instruction right after the call instruction. For full consistency, we would want to check for that case before merging two basic blocks.
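A minimal sketch of that merge step, to make the jump-target check concrete. It is plain Python and does not use OFRAK or angr APIs: the Block class, the merge_call_split_blocks function, and the jump_targets set are all hypothetical names for illustration. It assumes the unpacker already knows, for each block, whether its last instruction is a call, and has collected the addresses targeted by jumps within the function.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a basic block; not an OFRAK or angr type."""

    start: int  # virtual address of the first instruction
    size: int  # size of the block in bytes
    ends_with_call: bool  # True if the last instruction is a call

    @property
    def end(self) -> int:
        return self.start + self.size


def merge_call_split_blocks(
    blocks: Iterable[Block], jump_targets: Set[int]
) -> List[Block]:
    """Merge each block ending in a call with the block that follows it,
    unless a jump elsewhere in the function targets that following block."""
    merged: List[Block] = []
    for block in sorted(blocks, key=lambda b: b.start):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.ends_with_call
            and prev.end == block.start  # block directly follows the call
            and block.start not in jump_targets  # nothing else jumps here
        ):
            # Extend the previous block; the merged block ends however
            # the absorbed block ended, so chains of calls keep merging.
            merged[-1] = Block(
                prev.start, prev.size + block.size, block.ends_with_call
            )
        else:
            merged.append(block)
    return merged


# Example: 0x1000 and 0x1008 both end in calls; 0x100C is a jump target.
blocks = [
    Block(0x1000, 8, True),
    Block(0x1008, 4, True),
    Block(0x100C, 6, False),
    Block(0x1012, 4, False),
]
result = merge_call_split_blocks(blocks, jump_targets={0x100C})
```

Here the first two blocks are merged into one block at 0x1000, while the block at 0x100C stays separate because another block jumps to it. The sketch also assumes every call returns to the instruction that follows it; calls to non-returning functions would need separate handling.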

Are there any (reasonable) alternative approaches?

Are you interested in implementing it yourself?