trailofbits / binrec-tob

BinRec: Dynamic Binary Lifting and Recompilation

Recovered binaries init_array is incorrect #23

Open hbrodin opened 2 years ago

hbrodin commented 2 years ago

On the Henrik/s2esubmodule branch, recovering a binary that has entries in its init_array (for example, one that uses std::cout) loses those entries somewhere in the translation.

The function to be called seems to be present in recovered.ll, but its address is not added to the init_array section.

This issue is the primary reason why most C++ samples fail, including simple samples.

hbrodin commented 2 years ago

The effect of this is that binaries segfault when run.
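
A minimal sketch of why a lost init_array entry can surface as a segfault. It is not taken from the fibpp sample; `Counter`, `g_counter`, `read_counter`, and `check_counter` are hypothetical names. The namespace-scope object needs a dynamic initializer, which ELF toolchains typically register as a startup entry in init_array. If that entry is dropped, the object keeps its zero-initialized state, so `slot` stays null and the first read dereferences a null pointer.

```cpp
#include <cassert>

// Hypothetical type whose constructor must run before first use.
struct Counter {
    int* slot;
    Counter() : slot(new int(42)) {}
    ~Counter() { delete slot; }
};

// Namespace-scope object with a dynamic initializer. On typical ELF
// toolchains the compiler emits a startup function for it and places that
// function's address in init_array.
static Counter g_counter;

// If the startup entry was lost, g_counter.slot is still the null pointer
// from zero-initialization and this dereference crashes.
int read_counter() {
    return *g_counter.slot;
}

// Call from main() to confirm the constructor ran before main started.
void check_counter() {
    assert(read_counter() == 42);
}
```

In a correctly built binary, calling `check_counter()` from `main` passes, because the constructor has already run by then.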

hbrodin commented 2 years ago

One thing to check before investigating further is whether it works in the original one. I've used the fibpp binary to test.

hbrodin commented 2 years ago

recovered.ll.txt is the recovered fibpp binary. The address of the function Func__ZN12_GLOBAL__N_13fibEi is what is supposed to go into the init_array, but the final binary 'recovered' does not have this function's address in its init_array.

It is not clear to me how it would end up there. So far I haven't found any code that handles this. There might be something in the linker script but I don't know...

ameily commented 2 years ago

From peter:

by constructors do you mean things in .init_array type of thing?

you'll want to lift those out and put those into the llvm.global_ctors array

in mcsema, we do that, but then we also have an "early init" thing

where I think we do this: we make a function that does some random mcsema runtime initialization, then manually calls all the original ctors from the binary

and we put the mcsema function into the llvm.global_ctors array

this was because we had some random init to do, though

as in, it wasn't safe to enter a lifted function if this initialization hadn't been done
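
The "early init" pattern described above can be sketched in plain C++. This is an illustration only, not McSema or BinRec code. Every name here (`EarlyInit`, `kOriginalCtors`, `lifted_ctor_a`, `lifted_ctor_b`, `check_startup_order`) is hypothetical. A single startup object does the runtime setup first and then calls the original constructors by hand, in their original order. Compilers such as Clang lower the startup object's initializer to an entry in the LLVM IR array `llvm.global_ctors`, so only that one function has to be registered.

```cpp
#include <cassert>
#include <cstddef>

using InitFn = void (*)();

// Plain constant-initialized state, valid before any constructor runs.
static bool g_runtime_ready = false;
static int g_call_log[4];
static std::size_t g_call_count = 0;

static void record(int id) {
    if (g_call_count < 4) g_call_log[g_call_count++] = id;
}

// Hypothetical stand-ins for constructors lifted from the original binary's
// init_array. Each records a negative id if it runs before runtime setup.
static void lifted_ctor_a() { record(g_runtime_ready ? 1 : -1); }
static void lifted_ctor_b() { record(g_runtime_ready ? 2 : -2); }

// The original constructors, kept in their original order.
static constexpr InitFn kOriginalCtors[] = {lifted_ctor_a, lifted_ctor_b};

// The only object that needs a startup entry: it performs the runtime
// initialization, then manually calls every original constructor.
struct EarlyInit {
    EarlyInit() {
        g_runtime_ready = true;
        for (InitFn fn : kOriginalCtors) fn();
    }
};
static EarlyInit g_early_init;

// Call from main() to confirm both constructors ran, in order, after setup.
void check_startup_order() {
    assert(g_call_count == 2);
    assert(g_call_log[0] == 1 && g_call_log[1] == 2);
}
```

Routing everything through one entry point keeps the ordering explicit: no lifted constructor can run before the runtime state it depends on is ready.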

michaelbrownuc commented 2 years ago

A partial solution to this was provided in binrec-uci.