NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

How to manually type the vtables? #6577

Nemoumbra closed 1 month ago

Nemoumbra commented 1 month ago

My case is (simplified and abstracted) as follows... The source code likely declares a base class and 2 derived classes in the following manner:

class Base {
    virtual ~Base() {};
    virtual void impl(/* args */) = 0;
};

class FirstChild: public Base {
    virtual ~FirstChild() {};
    virtual void impl(/* args */) {
        this->func(/* args */);
    }
    virtual void func(/* args */) { ... }

    /* fields */
};
class SecondChild: public Base {
    virtual ~SecondChild() {};
    virtual void impl(/* args */) {
        this->func(/* args */);
    }
    virtual void another() { ... }
    virtual void func(/* args */) { ... }

    /* fields */
};

The vtables are (I trimmed the RTTI because it's unavailable):

-----------------------------------------------------------------------------------
0x0:  Base::~Base;
0x4:  nullptr; // pure virtual function with no implementation
-----------------------------------------------------------------------------------
0x0:  FirstChild::~FirstChild;
0x4:  FirstChild::impl; // derived class introduces the implementation
0x8:  FirstChild::func;
-----------------------------------------------------------------------------------
0x0:  SecondChild::~SecondChild;
0x4:  SecondChild::impl; // derived class introduces the implementation
0x8:  SecondChild::another;
0xC:  SecondChild::func;
-----------------------------------------------------------------------------------

Next... I've got a struct Holder that contains a pointer to an instance of one of the derived classes: sometimes it is assigned a FirstChild's address and sometimes a SecondChild's.

Now the questions...


1) I wanted to type the field as Base*. Then Base would be

struct Base {
    BaseVtable* vtable;
};

The question is... what's BaseVtable then? I guess, it should look like this:

struct BaseVtable {
    ??? dtor;
    ??? impl;
};

...but I don't know what types to use for the entries. I could use void*, but that erases the info about the arguments => theoretically Ghidra might show

(**(code**)holder->instance->vtable->impl)();

instead of

(**(code**)holder->instance->vtable->impl)(holder->instance, /* args */);

The latter is definitely better => how to type the entries?
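One way to picture typed entries is to give every slot a function pointer type that carries the `this` parameter explicitly. The sketch below is plain C rather than anything Ghidra-specific; as far as I know, Ghidra can store function signatures as data types and use pointers to them as structure fields, so the same shape should carry over. The names `Base_dtor_fn`, `Base_impl_fn`, `Base_call_impl` and `demo_dispatch` are made up for illustration, and the single `int arg` is an assumption standing in for the elided `/* args */`.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */

struct Base;  /* forward declaration so the signatures can name it */

/* Hypothetical signatures: `int arg` stands in for the elided arguments. */
typedef void (*Base_dtor_fn)(struct Base *self);
typedef void (*Base_impl_fn)(struct Base *self, int arg);

struct BaseVtable {
    Base_dtor_fn dtor;  /* slot 0 */
    Base_impl_fn impl;  /* slot 1; null in Base's own table (pure virtual) */
};

struct Base {
    const struct BaseVtable *vtable;
};

/* impl sits exactly one pointer-sized slot after dtor. */
static_assert(offsetof(struct BaseVtable, impl) == sizeof(Base_dtor_fn),
              "impl is slot 1");

/* A call through the typed slot carries both `this` and the argument. */
void Base_call_impl(struct Base *obj, int arg) {
    obj->vtable->impl(obj, arg);
}

/* Small self-check with stand-in implementations. */
static int last_arg;
static void demo_dtor(struct Base *self) { (void)self; }
static void demo_impl(struct Base *self, int arg) { (void)self; last_arg = arg; }

int demo_dispatch(int arg) {
    static const struct BaseVtable vt = { demo_dtor, demo_impl };
    struct Base obj = { &vt };
    Base_call_impl(&obj, arg);
    return last_arg;
}
```

With entries typed this way, a call site reads as `obj->vtable->impl(obj, arg)` instead of an untyped call with no arguments.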


2) Even if we manage to propagate the types into the BaseVtable, what are we supposed to do with the derived vtables?

struct FirstChild {
    FirstChildVtable* vtable;
    /* fields */
};

... and then

struct FirstChildVtable {
    ??? dtor;
    ??? impl;
    ??? func;
};

Assuming we don't use void* for the entries, it looks like there is just no way to type that... Because we can no longer do this:

void FirstChild_impl(FirstChild* this, /* args */);

We are forced to do

void FirstChild_impl(Base* this, /* args */);

... but that doesn't work, because BaseVtable doesn't have an entry for FirstChild::func.

Maybe I'm just confused and must use void* for the functions; please tell me if that's so.
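A possible way out, sketched in plain C: each class gets its own vtable struct whose entries take that class's own `this` type, and code that only holds a `Base*` casts to the concrete type where the target is known. This is an illustration under assumptions, not a statement about how Ghidra resolves the types: the `int arg` parameter, the `field` member and `demo_first_child` are hypothetical. The first two slots of `FirstChildVtable` sit at the same offsets as those of `BaseVtable`, which is what keeps the two views compatible.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */

struct Base;
struct FirstChild;

struct BaseVtable {
    void (*dtor)(struct Base *self);
    void (*impl)(struct Base *self, int arg);
};
struct Base { const struct BaseVtable *vtable; };

/* Same first two slots, but typed with the derived `this`. */
struct FirstChildVtable {
    void (*dtor)(struct FirstChild *self);
    void (*impl)(struct FirstChild *self, int arg);
    void (*func)(struct FirstChild *self, int arg);
};
struct FirstChild {
    const struct FirstChildVtable *vtable;
    int field;  /* hypothetical stand-in for the class fields */
};

static_assert(offsetof(struct FirstChildVtable, impl) ==
              offsetof(struct BaseVtable, impl), "shared slots line up");

struct Holder { struct Base *instance; };

static void FirstChild_dtor(struct FirstChild *self) { (void)self; }
static void FirstChild_func(struct FirstChild *self, int arg) { self->field = arg; }
static void FirstChild_impl(struct FirstChild *self, int arg) {
    self->vtable->func(self, arg);  /* func is reachable: `this` is FirstChild* */
}

static const struct FirstChildVtable FirstChild_vtable = {
    FirstChild_dtor, FirstChild_impl, FirstChild_func
};

int demo_first_child(int arg) {
    struct FirstChild fc = { &FirstChild_vtable, 0 };
    struct Holder holder = { (struct Base *)&fc };  /* stored as Base* */
    /* Cast back to the concrete type before using the derived table. */
    struct FirstChild *typed = (struct FirstChild *)holder.instance;
    typed->vtable->impl(typed, arg);
    return fc.field;
}
```

The cost of this layout is the explicit cast wherever a `Base*` is known to point at a `FirstChild`; the benefit is that `FirstChild_impl` keeps its natural `FirstChild*` signature.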

EmosewaMC commented 1 month ago

Have you tried using a union of vftables and then manually selecting the vftable type based on the target? That seems to be what you are going for here.
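To make the union suggestion concrete, here is a rough C sketch of what such a type could look like. It is one reading of the idea, not a confirmed recipe: `Object`, `AnyVtable`, the `field` member, the `int arg` parameter and the `demo_second_child` helper are all hypothetical names. The object's first member is a union of pointers to the three vtable layouts, and the code picks the member that matches the actual target.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */

struct Object;
typedef void (*dtor_fn)(struct Object *self);
typedef void (*impl_fn)(struct Object *self, int arg);
typedef void (*another_fn)(struct Object *self);

struct BaseVtable        { dtor_fn dtor; impl_fn impl; };
struct FirstChildVtable  { dtor_fn dtor; impl_fn impl; impl_fn func; };
struct SecondChildVtable { dtor_fn dtor; impl_fn impl; another_fn another; impl_fn func; };

/* One pointer-sized slot viewed as any of the three vtable layouts. */
union AnyVtable {
    const struct BaseVtable        *base;
    const struct FirstChildVtable  *first;
    const struct SecondChildVtable *second;
};

struct Object {
    union AnyVtable vtable;
    int field;  /* hypothetical stand-in for the class fields */
};

/* func lives in a different slot in each child, hence the need to select. */
static_assert(offsetof(struct FirstChildVtable, func) !=
              offsetof(struct SecondChildVtable, func), "func slot differs");

static void SecondChild_dtor(struct Object *self) { (void)self; }
static void SecondChild_another(struct Object *self) { (void)self; }
static void SecondChild_func(struct Object *self, int arg) { self->field = arg; }
static void SecondChild_impl(struct Object *self, int arg) {
    self->vtable.second->func(self, arg);  /* select the SecondChild view */
}

static const struct SecondChildVtable SecondChild_vtable = {
    SecondChild_dtor, SecondChild_impl, SecondChild_another, SecondChild_func
};

int demo_second_child(int arg) {
    struct Object obj = { { .second = &SecondChild_vtable }, 0 };
    obj.vtable.second->impl(&obj, arg);
    return obj.field;
}
```

Calls that only need `dtor` or `impl` could go through the `base` member regardless of the concrete class, since those two slots sit at the same offsets in all three layouts.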

BhaaLseN commented 1 month ago

You might also want to take a look at #516 which deals with the general topic of vtables. Maybe you can find some inspiration over there.