chaoticgd / ghidra-emotionengine-reloaded

An extension for Ghidra that adds support for the PlayStation 2.
Apache License 2.0
118 stars 11 forks source link

`typedef`s are substituted for their underlying types upon importing STABS + bitfield issue #45

Closed abelbriggs1 closed 8 months ago

abelbriggs1 commented 8 months ago

First, thanks for all your work in maintaining CCC and this extension - both have been very helpful!

I'm currently analyzing a binary ([SCUS-97101] Twisted Metal: Black (NTSC)) which contains many debug symbols (functions, globals, C structs). There are quite a few C structs which are typedefd to be a different name. An example looks like:

typedef _hierstack HierStack;

struct _hierstack { // 0x30
    /* 0x00 */ FVECTOR3 eo;
    /* 0x0c */ HierHead **node;
    /* 0x10 */ uint16 matIdx;
    /* 0x12 */ uint16 eoCnt;
    /* 0x14 */ uint32 numKids;
    /* 0x18 */ uint32 *gsCtx;
    /* 0x1c */ float fade;
    /* 0x20 */ HierAnimCharInstance *animCharInst;
    /* 0x24 */ uint32 unused1;
    /* 0x28 */ uint32 unused2;
    /* 0x2c */ uint32 unused3;
};

typedef _hierstackinfo HierStackInfo;

struct _hierstackinfo { // 0x60
    /* 0x00 */ HierStack curStack;
    /* 0x30 */ HierStack *stack;
    /* 0x34 */ int stackIdx;
    /* 0x38 */ CS *gCs;
    /* 0x3c */ int32 fovFlag;
    /* 0x40 */ uint32 *lightDir;
    /* 0x44 */ uint32 texLastUsedCnt;
    /* 0x48 */ uint32 *texBinE;
    /* 0x4c */ int32 lastEoCnt;
    /* 0x50 */ int32 unused;
    /* 0x54 */ int32 matIdx;
};

However, when I import the binary into Ghidra and run the STABS analyzer on the binary, many of these typedefs are substituted for their underlying types, and the typedefs are completely lost (and not imported into Ghidra).

image

It seems like CCC's stdump is correctly parsing the typedefs, since stdump json SCUS_971.01 produces JSON with the typedefs included:

{
    "descriptor": "type_name",
    "name": "HierStack",
    "storage_class": "typedef",
    "stabs_type_number": 933,
    "files": [232, 309],
    "source": "cross_reference",
    "type_name": "_hierstack"
}

Would you happen to know what's causing this behavior and how it can be mitigated? I'm not particularly familiar with the code, but I can try to submit a patch if I can get a general idea of what might be happening.

Thanks!

chaoticgd commented 8 months ago

I was just thinking about this yesterday, odd. I never got around to implementing support for importing typedefs in the Ghidra importer, but I could certainly have a go now.

chaoticgd commented 8 months ago

I've implemented the feature.

Can you help me test it by trying out this unstable build? https://github.com/chaoticgd/ghidra-emotionengine-reloaded/releases/download/unstable/ghidra_10.4_PUBLIC_20231020_ghidra-emotionengine-reloaded.zip

chaoticgd commented 8 months ago

In case anyone is interested: The main reason I didn't implement support for typedefs originally is just that typedefs aren't a first-class citizen in STABS (or C/C++ ASTs for that matter), they're just a storage class modifier. In Ghidra typedefs actually have their own DataTypeImpl subclass.

abelbriggs1 commented 8 months ago

I've implemented the feature.

Can you help me test it by trying out this unstable build? https://github.com/chaoticgd/ghidra-emotionengine-reloaded/releases/download/unstable/ghidra_10.4_PUBLIC_20231020_ghidra-emotionengine-reloaded.zip

Thanks, this does indeed allow typedefs to be imported with no issue!

As a quick aside, while verifying that this worked, I noticed that bit fields also are not imported. I'm currently taking a stab at implementing that, and it seems to be going alright - I have basic bit fields like this importing without issue:

struct _hierhead { // 0x4
    /* 0x0:0 */ unsigned int opcode : 6;
    /* 0x0:6 */ unsigned int isRelocated : 1;
    /* 0x0:7 */ unsigned int id2 : 11;
    /* 0x2:2 */ unsigned int id1 : 14;
};

I'm only having some issues with a couple of very strange bitfields, where some members don't have the bitfield tag for some reason:

// warning: multiple differing types with the same name (#379, size not equal)
typedef union { // 0x8
    /* 0x0 */ FLO_type value;
    /* 0x0 */ fractype value_raw;
    /* 0x0 */ halffractype words[2];
    /* 0x0 */ struct { // 0x8
        /* 0x0 */ fractype fraction;
        /* 0x6:4 */ unsigned int exp : 11;
        /* 0x7:7 */ unsigned int sign : 1;
    } bits;
} FLO_union_type;
JSON dump ```json { "descriptor": "union", "name": "FLO_union_type", "storage_class": "typedef", "size_bits": 64, "conflict": true, "stabs_type_number": 38, "files": [ 379 ], "fields": [ { "descriptor": "type_name", "name": "value", "relative_offset_bytes": 0, "absolute_offset_bytes": 0, "size_bits": 64, "source": "reference", "type_name": "FLO_type", "referenced_file_index": 379, "referenced_stabs_type_number": 30 }, { "descriptor": "type_name", "name": "value_raw", "relative_offset_bytes": 0, "absolute_offset_bytes": 0, "size_bits": 64, "source": "reference", "type_name": "fractype", "referenced_file_index": 379, "referenced_stabs_type_number": 28 }, { "descriptor": "array", "name": "words", "relative_offset_bytes": 0, "absolute_offset_bytes": 0, "size_bits": 64, "element_type": { "descriptor": "type_name", "source": "reference", "type_name": "halffractype", "referenced_file_index": 379, "referenced_stabs_type_number": 29 }, "element_count": 2 }, { "descriptor": "struct", "name": "bits", "relative_offset_bytes": 0, "absolute_offset_bytes": 0, "size_bits": 64, "base_classes": [], "fields": [ { "descriptor": "type_name", "name": "fraction", "relative_offset_bytes": 0, "absolute_offset_bytes": 0, "size_bits": 52, "source": "reference", "type_name": "fractype", "referenced_file_index": 379, "referenced_stabs_type_number": 28 }, { "descriptor": "bitfield", "name": "exp", "relative_offset_bytes": 6, "absolute_offset_bytes": 6, "size_bits": 11, "bitfield_offset_bits": 4, "underlying_type": { "descriptor": "type_name", "source": "reference", "type_name": "unsigned int", "referenced_file_index": 379, "referenced_stabs_type_number": 4 } }, { "descriptor": "bitfield", "name": "sign", "relative_offset_bytes": 7, "absolute_offset_bytes": 7, "size_bits": 1, "bitfield_offset_bits": 7, "underlying_type": { "descriptor": "type_name", "source": "reference", "type_name": "unsigned int", "referenced_file_index": 379, "referenced_stabs_type_number": 4 } } ], "member_functions": [] } ], "member_functions": [] }, ```

But I don't think this is CCC or ghidra-emotionengine-reloaded's fault - more likely a bug in GCC which didn't set the bit field tag on those members.

chaoticgd commented 8 months ago

Glad to hear the initial issue has been resolved.

As for the bitfields, the problem is that STABS doesn't even have a "bit field tag"; it has to be inferred from the offset in bits/size in bits compared to that of the underlying type. It's not immediately clear to me if it would be possible to improve the heuristic ccc uses for detecting bitfields, but since it currently works in almost all cases I think it's perfectly fine to keep it as it is.

chaoticgd commented 8 months ago

If it's helpful the current logic for detecting bitfields is here: https://github.com/chaoticgd/ccc/blob/b873d5511d8c9bd3d093f74c55fd4455fe29bd1f/ccc/ast.cpp#L318

Since I know STABS syntax can be a bit hard to read if you're not familiar with it, I've also manually formatted the relevant stab strings if that's at all useful:

unsigned int:t4=r1;000000000000000000000000;000000000000037777777777;
USItype:t26=4
fractype:t28=26
FLO_union_type:t38=39=
    u4
        value:30,0,32;
        value_raw:28,0,32;
        bits:40=
            s4
                fraction:28,0,23;
                exp:4,23,8;
                sign:4,31,1;
            ;
            ,0,32;
    ;
chaoticgd commented 8 months ago

Okay I've figured out what the bug was. The bitfield code wasn't resolving stabs references properly. I'll probably put out a v1.3 release of stdump so we can include it with ghidra-emotionengine-reloaded releases shortly (edit: after this is all resolved, obviously).

chaoticgd commented 8 months ago

https://github.com/chaoticgd/ccc/pull/121

abelbriggs1 commented 8 months ago

Thanks - that fix to CCC did the trick!

I've opened a PR for bitfield import support - feel free to review at your leisure.

chaoticgd commented 8 months ago

PR merged.