avast / retdec

RetDec is a retargetable machine-code decompiler based on LLVM.
https://retdec.com/
MIT License

vtable_finder cannot detect vtables for abstract classes #616

Open astrelsky opened 5 years ago

astrelsky commented 5 years ago

I was looking through the code and came across the comment on line 112. I happened to know the answer from research I did while building a GNU RTTI analyzer for Ghidra, so I decided to run a quick test.

The issue is due to the following in vtable_finder.cpp line 112:

// All items in vtable must be unique (really???).
//
if (items.find(ptr) != items.end())
{
    LOG << "\t\t\t" << a << " @ !unique" << std::endl;
    return false;
}

Compiling the following will reproduce the issue. The naming is poor; it is a snippet from junk code I was using to figure out how virtual bases are handled during class construction, so that I could determine how to reconstruct them. If all symbols and DWARF information are left in the binary, the JSON output shows that the vtable was not detected, even though the symbol _ZTV1F is indeed present.

#include <stddef.h>  // ptrdiff_t
#include <stdlib.h>  // abs

class F {
  public:
    virtual void f_foo() = 0;
    virtual void abstract_1() = 0;
    virtual void abstract_2() = 0;
    virtual void abstract_3() = 0;
    ptrdiff_t offset_of(int &data) { return abs((ptrdiff_t)this - (ptrdiff_t)&data); }

    int f_data;
};

class G : virtual public F {
  public:
    virtual void f_foo() {}
    virtual void abstract_1() {}
    virtual void abstract_2() {}
    virtual void abstract_3() {}
    template <typename T>
    ptrdiff_t offset_of(int &data) { return abs((ptrdiff_t)this - (ptrdiff_t)&data); }

    int g_data;
};

With regard to the comment on uniqueness: that is not actually the case. Some entries can be duplicated. The only function pointer that can be duplicated is the pointer to __cxxabiv1::__cxa_pure_virtual. Null pointers may appear within the first two slots of the function pointer array, which is more common with newer builds of GCC. In construction vtables I have also seen offset_to_top == offset_to_base. That can only occur for a virtual base, so the ptrdiff_t value must be less than 0.
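The rules above could be sketched roughly as follows. This is only an illustration of a relaxed uniqueness check, not RetDec's actual code: `Address`, `slotsLookLikeVtable`, `duplicateOffsetAllowed`, and the `pureVirtual` parameter are hypothetical names, and the sketch assumes the address of `__cxa_pure_virtual` has already been resolved from the symbols or imports (0 meaning unknown).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical address type for this sketch; RetDec uses its own types.
using Address = std::uint64_t;

// Checks the function-pointer slots of a vtable candidate.
// - The pointer to __cxa_pure_virtual (passed in as pureVirtual) may repeat.
// - Null pointers are tolerated only in the first two slots.
// - Every other function pointer must be unique.
bool slotsLookLikeVtable(const std::vector<Address>& slots, Address pureVirtual)
{
    std::set<Address> seen;
    for (std::size_t i = 0; i < slots.size(); ++i)
    {
        const Address ptr = slots[i];
        if (ptr == 0)
        {
            if (i < 2)
                continue;  // null allowed in slot 0 or 1
            return false;  // null anywhere else ends the candidate
        }
        if (pureVirtual != 0 && ptr == pureVirtual)
            continue;      // pure virtual stub may appear many times
        if (!seen.insert(ptr).second)
            return false;  // any other duplicate is rejected
    }
    return true;
}

// Checks whether two equal header offsets are acceptable, following the
// observation that offset_to_top == offset_to_base has only been seen for
// virtual bases in construction vtables, where the value is negative.
bool duplicateOffsetAllowed(std::ptrdiff_t offsetToTop, std::ptrdiff_t offsetToBase)
{
    return offsetToTop == offsetToBase && offsetToTop < 0;
}
```

With this check, a table for an abstract class such as F, whose slots all point at the pure virtual stub, would pass, while a table that repeats an ordinary function pointer would still be rejected.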

s3rvac commented 5 years ago

@PeterMatula Can you please take a look?

astrelsky commented 5 years ago

I put up a repository of the code I use for testing along with a release of binaries for different architectures. It can be found here