binsync / libbs

A library for writing plugins in any decompiler: includes API lifting, common data formatting, and GUI abstraction!
BSD 2-Clause "Simplified" License

Feat: Support Typedefs #98

Closed mahaloz closed 1 month ago

mahaloz commented 1 month ago


Binja Incomplete

Until the problems associated with are fixed, typedefs that reference primitive types won't work.

mahaloz commented 1 month ago

Discovered a segfault in IDA when I run this code:

for ord_num in range(ida_typeinf.get_ordinal_qty(idati)):
    tif = ida_typeinf.tinfo_t()
    success = tif.get_numbered_type(idati, ord_num)
    if not success:
        continue

    if not tif.is_typeref():
        continue

    name = tif.get_type_name()
    type_name = tif.get_next_type_name()
    if not name:
        continue

    if type_name is None:
        # try again, but with ordinal
        backup_tif = ida_typeinf.tinfo_t()
        backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF, True, True)
        real_type_val = backup_tif.get_realtype()
        try:
            real_type = ida_typeinf.tinfo_t(real_type_val)
        except Exception:
            continue

        type_name = str(real_type)

    if not name or not type_name or name == type_name:
        continue

All I'm actually trying to do is grab the name of the base type a typedef is referencing. For instance, typedef int my_int should return int.

Maybe @arizvisa knows the real solution or why this segfaults? Am I not allowed to just do this? Tested on binary

For reference, tif.get_next_type_name() is almost always empty.

arizvisa commented 1 month ago

Discovered a segfault in IDA when I run this code:

This is crashing because typedefs are actually complex types. Complex types aren't basic types; instead, they're represented by a bytestream following the conditions at

        # try again, but with ordinal
        backup_tif = ida_typeinf.tinfo_t()
        backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF, True, True)
        real_type_val = backup_tif.get_realtype()
        try:
            real_type = ida_typeinf.tinfo_t(real_type_val)
        except Exception:
            continue

        type_name = str(real_type)

So, here you're creating a BTF_TYPEDEF, getting its "realtype" in real_type_val (which is always going to be BT_COMPLEX), and then constructing a tinfo_t from that "real type" result. However, the byte for each type is actually a combination of TYPE_MODIF_MASK (0xC0), TYPE_FLAGS_MASK (0x30), and TYPE_BASE_MASK (0x0F). The issue is that tinfo_t.get_realtype only returns the TYPE_BASE_MASK bits, i.e. the lower 4 bits of the type, and completely excludes the flags and modifier bits. When you construct real_type from real_type_val, you're also dropping the rest of the information (the actual reference) that makes up backup_tif.
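To make the masking concrete, here is a minimal sketch in plain Python that splits a type byte into the three fields named above. The mask values are the ones quoted in this comment; the BT_COMPLEX (0x0D) and BTMT_TYPEDEF (0x30) values are assumed from the IDA SDK headers and should be checked against ida_typeinf in your IDA version. split_type_byte is a hypothetical helper, not an IDA API.

```python
# Mask values as quoted in the comment above.
TYPE_BASE_MASK = 0x0F
TYPE_FLAGS_MASK = 0x30
TYPE_MODIF_MASK = 0xC0

# Assumed values (verify against ida_typeinf in your IDA version).
BT_COMPLEX = 0x0D
BTMT_TYPEDEF = 0x30
BTF_TYPEDEF = BT_COMPLEX | BTMT_TYPEDEF


def split_type_byte(value):
    """Hypothetical helper: return (base, flags, modifiers) of a type byte."""
    return (
        value & TYPE_BASE_MASK,
        value & TYPE_FLAGS_MASK,
        value & TYPE_MODIF_MASK,
    )


base, flags, modif = split_type_byte(BTF_TYPEDEF)
# Keeping only the base bits drops the typedef flag, leaving a bare
# BT_COMPLEX byte with nothing that says what it refers to.
print(hex(base), hex(flags), hex(modif))
```

If get_realtype behaves as described, only the first element of that tuple survives, which is why the reconstructed tinfo_t has nothing valid to decode.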

Rendering real_type to a string causes IDA to decode its first byte, see that it's BT_COMPLEX, expect another byte defining what "kind" of complex type it is, and then decode out of bounds (pretty sure, anyway). The truth behind tinfo_t is really exposed via the tinfo_t.serialize() method. Typerefs are literally BT_COMPLEX|BTMT_TYPEDEF, followed by a byte representing whether to use a string or ordinal, then followed by the actual name/ordinal. This is the difference between a "named type" and a "numbered type".

For reference, tif.get_next_type_name() is almost always empty.

What you actually need to do to get the target of a typeref is to distinguish whether the type is a named or ordinal (numbered) type. Once you have that, then you use get_type_name or get_ordinal (respectively) to get the target type. Then use that result to query the type library via get_named_type or get_numbered_type (respectively).

In later versions of IDA, you can use replace_ordinal_typerefs to always convert types to a named type, with the downside being that looking into the type library by name is slower (algorithmically) than by ordinal. I'm using to always get an ordinal regardless of the type being numbered or named.

All I'm actually trying to do is grab the name of the base type a typedef is referencing. For instance, typedef int my_int should return int.

Since your example is always using get_numbered_type, you can assume that you'll always be dealing with a numbered type and you'll only need to use tinfo_t.get_ordinal or tinfo_t.get_final_ordinal to figure out what type to look up next in the type library (

        # try again, but with ordinal
        backup_tif = ida_typeinf.tinfo_t()
        backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF, 

Also, it's worth checking the results of get_numbered_type and get_named_type. They return booleans that tell you whether they've successfully modified the tinfo_t you gave them. If you have a tinfo_t that you're unsure of (when serializing/deserializing them), I've been using tinfo_t.get_size as a smoke test to ensure that the type isn't damaged (tinfo_t.get_size() != BADSIZE). Maybe there's a proper way to do this, but this way doesn't seem to crash for me.
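A minimal sketch of those two checks, assuming the tinfo_t methods behave as described above. load_numbered_type is a hypothetical helper; the tinfo object, type library, and BADSIZE value are passed in so that nothing here depends on a particular IDA version. Inside IDA you would presumably pass ida_typeinf.tinfo_t(), the result of get_idati(), and the BADSIZE constant (assumed to be exposed by ida_typeinf).

```python
def load_numbered_type(tif, til, ordinal, badsize):
    """Hypothetical helper: fill `tif` from `ordinal`, or return None.

    `tif` is expected to behave like ida_typeinf.tinfo_t.
    """
    # get_numbered_type reports whether it actually modified `tif`.
    if not tif.get_numbered_type(til, ordinal):
        return None
    # Smoke test from the comment above: a damaged type reports BADSIZE.
    if tif.get_size() == badsize:
        return None
    return tif
```

Types whose size is simply unknown may also report BADSIZE, so this is a conservative filter rather than a precise validity check.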

mahaloz commented 1 month ago

What you actually need to do to get the target of a typeref is to distinguish whether the type is a named or ordinal

That was actually why I was trying to use just the normal type construction of IDA with the tinfo_t, but yeah, that segfaults. Is there another way, given a tinfo_t, to find out if it is an ordinal type?

arizvisa commented 1 month ago

What you actually need to do to get the target of a typeref is to distinguish whether the type is a named or ordinal

That was actually why I was trying to use just the normal type construction of IDA with the tinfo_t, but yeah, that segfaults. Is there another way, given a tinfo_t, to find out if it is an ordinal type?

Maybe you can check if tinfo_t.get_ordinal returns > 0, and then confirm that it's within bounds of get_ordinal_qty?
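That check could look like the sketch below. has_valid_ordinal is a hypothetical helper, tif is assumed to behave like a tinfo_t, and ordinal_qty is assumed to be the value returned by get_ordinal_qty for the same type library. The strict upper bound assumes valid ordinals run from 1 to get_ordinal_qty() - 1.

```python
def has_valid_ordinal(tif, ordinal_qty):
    """Hypothetical helper: True if `tif` reports an in-range ordinal."""
    ordinal = tif.get_ordinal()
    # 0 is assumed to mean "no ordinal"; anything at or past the quantity
    # is treated as out of bounds.
    return 0 < ordinal < ordinal_qty
```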

mahaloz commented 1 month ago

Maybe you can check if tinfo_t.get_ordinal

That gives the ordinal of the typedef type, but not the type it points to. I haven't yet found a way to get a reference to the base type without get_realtype.

arizvisa commented 1 month ago

Ah, my bad.

Python>db.types.add('fuck', 'typedef int fuck')

# listing to make sure it was actually added.
Python>db.types.list(index=range(idaapi.get_ordinal_qty(None)-0x5, 0x100))
[82] +0x10 : L--S : $::$41B0E947727B04B281BBEA4B1896A8BE::$4B29161E04CAD4BCDD788B201A5E8E5E : struct {void *_call_addr;int _syscall;unsigned int _arch;}
[83]     ? : ?T-- : std::nothrow_t                                                          : struct 
[84]     ? : ?T-- : std::__detail::_Prime_rehash_policy                                     : struct 
[85]     ? : ?T-- : std::exception                                                          : struct 
[86]  +0x4 : L-I- : fuck                                                                    : int

<class 'ida_typeinf.tinfo_t'>

# this just gets the ordinal of the type that was added

# this is the goal

## now back to ida-speak

# serialize type out of the type library
Python>idaapi.get_numbered_type(None, 0x56)
(b'\x07', None, None, None, 0x0)

# deserialize it back into a type (the output from get_numbered_type/get_named_type may have different types than tinfo_t.deserialize)
Python>ti.deserialize(idaapi.get_idati(), b'\x07', None or b'', None or b'')


So you need to deserialize your tinfo_t from whatever get_numbered_type/get_named_type returns.
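Put together, that step might look like the sketch below. deserialize_numbered is a hypothetical helper. It assumes the five-element tuple seen in the session above is laid out as (type bytes, fields, comment, field comments, storage class), and it passes the type, fields, and field comments to tinfo_t.deserialize, replacing None with b'' as the session does. raw is whatever idaapi.get_numbered_type returned, and tif is a fresh tinfo_t.

```python
def deserialize_numbered(raw, tif, til):
    """Hypothetical helper: rebuild a type from idaapi.get_numbered_type output.

    Assumes `raw` is (type_bytes, fields, cmt, field_cmts, sclass) or None.
    Returns `tif` on success, otherwise None.
    """
    if raw is None:
        return None
    type_bytes, fields, _cmt, field_cmts, _sclass = raw
    # Missing pieces come back as None; the session above passes b'' instead.
    ok = tif.deserialize(til, type_bytes, fields or b"", field_cmts or b"")
    return tif if ok else None
```

The tuple shown for idaapi.get_named_type has an extra leading status code and a trailing value, so under the same layout assumption you would pass raw[1:6] instead.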

(edited) ...and for get_named_type.

Python>idaapi.get_named_type(None, 'fuck', idaapi.NTF_TYPE)
(0x1, b'\x07', None, None, None, 0x0, 0x56)

Python>ti.deserialize(idaapi.get_idati(), b'\x07', b'', b'')

(edited again) using tinfo_t.get_ordinal to get the input for get_numbered_type

Python>ti = idaapi.tinfo_t()
Python>ti.get_numbered_type(None, 0x56)

Flushing buffers, please wait...ok

using tinfo_t.get_name to get the input for get_named_type.

Python>ti = idaapi.tinfo_t()
Python>ti.get_named_type(None, 'fuck')
