fay59 / fcd

An optimizing decompiler
http://zneak.github.io/fcd

simplifycfg pass eliminates entire main function implementation #27

Closed Lukas-Dresel closed 8 years ago

Lukas-Dresel commented 8 years ago

In my fork of fcd, on the pipelinetesting branch, pipeline/testing/test.c is a very small, very simple program I wrote that XORs a user-given string with 0x42 and outputs it. Alongside it is the output of the print-module calls after each LLVM pass completes. Comparing pipeline/testing/011_post_gvn.ll with 012_post_simplify_cfg.ll shows that the entire implementation of the main function has been stripped away.

P.S. By the way, do you want me to directly link to the files in my fork or reference them like this?

Lukas-Dresel commented 8 years ago

This also happens when attempting to decompile the challenge at https://github.com/ctfs/write-ups-2016/tree/master/su-ctf-2016/reverse/dmd-50 with fcd -p -e 0x00400e8d dMd

It is again the simplifycfg pass that causes the main function implementation to vanish.

However, this challenge also produces errors when decompiled with a plain fcd dMd, but that is a different issue.

Lukas-Dresel commented 8 years ago

In the LLVM documentation I found the following section: http://llvm.org/docs/Passes.html#lint-statically-lint-checks-llvm-ir

It says that the instcombine pass often turns instructions with undefined behavior into unreachable. This might be the cause here, as the block ends right where the first instruction involving undef is encountered:

https://github.com/Lukas-Dresel/fcd/blob/pipelinetesting/pipeline/testing/011_post_gvn.ll#L234
https://github.com/Lukas-Dresel/fcd/blob/pipelinetesting/pipeline/testing/012_post_simplifycfg.ll#L203

Lukas-Dresel commented 8 years ago

This would then be related to issue #26, as the undefs could also be considered unreachable by simplifycfg.

fay59 commented 8 years ago

I haven't looked very hard, but the simplest explanation is that the stack cookie is causing this.

Lukas-Dresel commented 8 years ago

Can confirm: when compiled with -fno-stack-protector, it decompiles correctly. So this is in fact the same problem as #26, where the segment registers break it.

fay59 commented 8 years ago

951f05a solves the segment problem by using special functions to access pointers into the fs and gs segments, like __fs_ptr_64(40) to read at fs:[40]. I'm closing this issue as that solves this specific problem, but there are other issues that prevent the examples that you've listed from decompiling (namely, with stack recovery and CFG recovery).

The changes are relatively small. If you're interested in how the emulator interacts with code generation, that's a simple one to look at.
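To illustrate the idea only, here is a rough C model of what routing fs:[40] through a helper function looks like. Everything in it is an assumption made for the sketch: fake_fs_segment stands in for the memory that fs points to, model_fs_ptr_64 stands in for __fs_ptr_64 (whose real signature in fcd is not shown here), and checked_sum is a made-up stack-protected function.

```c
#include <stdint.h>

/* Hypothetical stand-in for the block of memory that the fs segment points to. */
uint64_t fake_fs_segment[8];

/* Stand-in for __fs_ptr_64(offset): turns fs:[offset] into an ordinary pointer.
   Assumes offset is a multiple of 8 and smaller than 64. */
uint64_t *model_fs_ptr_64(uint64_t offset)
{
    return &fake_fs_segment[offset / 8];
}

/* Rough shape of a stack-protected function once the cookie load
   is expressed as a normal memory read through the helper. */
int checked_sum(int a, int b)
{
    uint64_t cookie = *model_fs_ptr_64(40);  /* prologue: read fs:[40] */

    int result = a + b;

    if (cookie != *model_fs_ptr_64(40))      /* epilogue: compare again */
        return -1;  /* a real binary would call __stack_chk_fail here */

    return result;
}
```

With the cookie read modeled as a load through an opaque function, the optimizer no longer sees an undefined value at that point, so the code that follows it has no reason to be treated as unreachable.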

fay59 commented 8 years ago

Another possible solution would be to co-opt LLVM's address space support, and assign an address space to each segment.

fay59 commented 7 years ago

If this is of any interest: stack frame recovery still fails on the executable, but that pass can be turned off, so fcd can produce output for the program. Type inference will be my next target once the whole structurization thing is over.