decomp / exp

Throw-away prototypes for experimental tools and libraries related to decompilation.

cmd/bin2ll: Figure out how to represent function stack frames #1

Open mewmew opened 7 years ago

mewmew commented 7 years ago

To support instructions such as push and pop, and stack relative memory references, we need to figure out how to represent function stack frames when lifting machine code to LLVM IR.

This concept is sometimes called a shadow stack.

Note that we may decide to go in one of several directions. One is to mimic the hardware as closely as possible, i.e. update the stack pointer and base pointer and use them to reference into a byte array representing memory, which may be either global to the program or local to the function. Another is to lift to a higher-level representation from the start (similar to how instruction intrinsics are lifted). With a type recovery oracle (i.e. perfect type analysis), it should be possible to represent each stack-relative memory reference as a reference to a specific local variable of a given type, rather than a reference into a byte array representing the stack.
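To make the two directions concrete, here is a minimal Go sketch. It is not bin2ll code: the names `Frame`, `Locals`, and `Resolve` are hypothetical, and it assumes 32-bit x86 conventions (4-byte slots, little-endian, stack growing towards lower addresses). `Frame` models the hardware-mimicking approach with a function-local byte array, while `Locals` models what a lifter could do if type analysis had already recovered the variables.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Frame models the low-level approach: a function-local byte array plus
// emulated stack and base pointers (assumption: 32-bit x86 conventions).
type Frame struct {
	mem []byte // backing storage for this function's stack frame
	esp uint32 // emulated stack pointer, an offset into mem
	ebp uint32 // emulated base pointer, an offset into mem
}

// NewFrame returns a frame of the given size with esp and ebp at the top,
// since the x86 stack grows towards lower addresses.
func NewFrame(size uint32) *Frame {
	return &Frame{mem: make([]byte, size), esp: size, ebp: size}
}

// Push emulates a 32-bit push: decrement esp by 4, then store the value.
func (f *Frame) Push(v uint32) {
	f.esp -= 4
	binary.LittleEndian.PutUint32(f.mem[f.esp:], v)
}

// Pop emulates a 32-bit pop: load the value at esp, then increment esp by 4.
func (f *Frame) Pop() uint32 {
	v := binary.LittleEndian.Uint32(f.mem[f.esp:])
	f.esp += 4
	return v
}

// Load32 emulates a base-pointer-relative read such as `mov eax, [ebp-8]`.
func (f *Frame) Load32(disp int32) uint32 {
	addr := uint32(int32(f.ebp) + disp)
	return binary.LittleEndian.Uint32(f.mem[addr:])
}

// Local models the higher-level approach: a recovered local variable with
// a name and an LLVM IR type, as a type recovery oracle might provide.
type Local struct {
	Name string
	Type string // LLVM IR type, e.g. "i32"
}

// Locals maps ebp-relative displacements to recovered local variables.
type Locals map[int32]Local

// Resolve returns the variable denoted by a reference of the form
// [ebp+disp], or an error if no variable is known at that displacement.
func (ls Locals) Resolve(disp int32) (Local, error) {
	l, ok := ls[disp]
	if !ok {
		return Local{}, fmt.Errorf("no local variable at ebp%+d", disp)
	}
	return l, nil
}
```

With a 16-byte frame, pushing two values and then calling `Load32(-8)` reads back the second value pushed, which is the byte-array view of `[ebp-8]`. In the second view, the same displacement is looked up once in `Locals` and every later reference uses the named, typed variable, which is the form an optimizer can reason about most easily.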

One of the objectives of bin2ll is to facilitate optimizations. As such, it generally prefers local over global representations for CPU registers, stack references, etc., even if they require more work from the lifter.

Several approaches will have to be evaluated. Anyone with ideas on the topic, feel free to join the discussions.

mewmew commented 7 years ago

/cc @7i

We started discussing how to represent stack frames today at our fika. If you have any further ideas not yet documented above, feel free to add them here so that we may also invite others to join the discussion and brainstorming.

Have a most lovely hike in the mountains!

Hugs /u