Closed mahaloz closed 1 month ago
Discovered a segfault in IDA when I run this code:
import idaapi
import ida_typeinf

idati = idaapi.get_idati()  # the local type library

for ord_num in range(ida_typeinf.get_ordinal_qty(idati)):
    tif = ida_typeinf.tinfo_t()
    success = tif.get_numbered_type(idati, ord_num)
    if not success:
        continue
    if not tif.is_typeref():
        continue
    name = tif.get_type_name()
    type_name = tif.get_next_type_name()
    if not name:
        continue
    if type_name is None:
        # try again, but with ordinal
        backup_tif = ida_typeinf.tinfo_t()
        backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF, True, True)
        real_type_val = backup_tif.get_realtype()
        try:
            real_type = ida_typeinf.tinfo_t(real_type_val)
        except Exception:
            continue
        type_name = str(real_type)
    if not name or not type_name or name == type_name:
        continue
All I'm actually trying to do is grab the name of the base type a typedef is referencing. For instance, typedef int my_int should return int.
Maybe @arizvisa knows the real solution or why this segfaults? Am I not allowed to just do this? Tested on binary https://github.com/binsync/libbs/blob/main/tests/binaries/fauxware
For reference, tif.get_next_type_name() is almost always empty.
> Discovered a segfault in IDA when I run this code:
This is crashing because typedefs are actually complex types. Complex types aren't a basic type; instead, they're represented by a bytestream following the conditions at https://hex-rays.com//products/ida/support/sdkdoc/group__tf__complex.html#ga86c5e589737e005ba4741423fd2ca5c6.
> # try again, but with ordinal
> backup_tif = ida_typeinf.tinfo_t()
> backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF, True, True)
> real_type_val = backup_tif.get_realtype()
> try:
>     real_type = ida_typeinf.tinfo_t(real_type_val)
> except Exception:
>     continue
> type_name = str(real_type)
So, here you're creating a BTF_TYPEDEF, getting its "realtype" in real_type_val (which is always going to be BT_COMPLEX), and then constructing a tinfo_t with the "real type" result. However, it's relevant to note that the byte for each type is actually a combination of TYPE_MODIF_MASK (0xC0), TYPE_FLAGS_MASK (0x30), and TYPE_BASE_MASK (0x0F). So, the issue is that tinfo_t.get_realtype only returns the TYPE_BASE_MASK bits, or the lower 4 bits of the type, and completely excludes the flags and modifier bits. When you construct real_type from real_type_val, you're also excluding the rest of the information (the actual reference) that composes backup_tif.
Rendering real_type to a string causes IDA to decode its first byte, see that it's BT_COMPLEX, expect another byte defining what "kind" of complex type it is, and then decode out of bounds (pretty sure, anyways). The truth behind tinfo_t is really exposed via the tinfo_t.serialize() method. Typerefs are literally BT_COMPLEX|BTMT_TYPEDEF, followed by a byte representing whether to use a string or ordinal, then followed by the actual name/ordinal. This is the difference between a "named type" and a "numbered type".
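To make the masking concrete, here is a small standalone sketch (not from the original thread) that splits a type byte into its three parts. The constant values are copied by hand from the IDA SDK's typeinf.hpp so that the snippet runs outside IDA, and split_type_byte is a hypothetical helper name. Inside IDA you would take the constants from ida_typeinf instead.

```python
# Values copied by hand from the IDA SDK (typeinf.hpp); treat them as an
# assumption and prefer ida_typeinf.<NAME> when running inside IDA.
TYPE_BASE_MASK = 0x0F   # the basic type, e.g. BT_INT or BT_COMPLEX
TYPE_FLAGS_MASK = 0x30  # type-specific flags, e.g. BTMT_TYPEDEF
TYPE_MODIF_MASK = 0xC0  # const/volatile modifiers

BT_INT = 0x07
BT_COMPLEX = 0x0D
BTMT_TYPEDEF = 0x30
BTF_TYPEDEF = BT_COMPLEX | BTMT_TYPEDEF


def split_type_byte(t):
    """Hypothetical helper: return the (base, flags, modifier) bits of a type byte."""
    return t & TYPE_BASE_MASK, t & TYPE_FLAGS_MASK, t & TYPE_MODIF_MASK


base, flags, modif = split_type_byte(BTF_TYPEDEF)
print(hex(BTF_TYPEDEF), hex(base), hex(flags), hex(modif))
```

Keeping only the base bits of BTF_TYPEDEF leaves BT_COMPLEX and drops the BTMT_TYPEDEF flag, and nothing in that single byte carries the name or ordinal that the typeref points at. That is the information loss described above.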
> for reference tif.get_next_type_name() almost always is empty.
What you actually need to do to get the target of a typeref is to distinguish whether the type is a named or an ordinal (numbered) type. Once you have that, then you use get_type_name or get_ordinal (respectively) to get the target type. Then use that result to query the type library via get_named_type or get_numbered_type (respectively).
In later versions of IDA, you can use replace_ordinal_typerefs to always convert types to a named type, with the downside being that looking into the type library by name is slower (algorithmically) than by ordinal. I'm using https://github.com/arizvisa/ida-minsc/blob/persistence-refactor/misc/interface.py#L5428 to always get an ordinal regardless of the type being numbered or named.
> All I'm actually trying to do is grab the name of the base type a typedef is referencing. For instance typedef int my_int should return int.
Since your example is always using get_numbered_type, you can assume that you'll always be dealing with a numbered type and you'll only need to use tinfo_t.get_ordinal or tinfo_t.get_final_ordinal to figure out what type to look up next in the type library (https://github.com/arizvisa/ida-minsc/blob/persistence-refactor/misc/interface.py#L6185).
> # try again, but with ordinal
> backup_tif = ida_typeinf.tinfo_t()
> backup_tif.get_named_type(idaapi.get_idati(), name, ida_typeinf.BTF_TYPEDEF,
Also, it's worth checking the result of get_numbered_type and get_named_type. They return booleans that tell you whether they've successfully modified the tinfo_t you give them. If you have a tinfo_t that you're unsure of (when serializing/deserializing them), I've been using tinfo_t.get_size as a smoke test to ensure that the type isn't damaged (tinfo_t.get_size() != BADSIZE). Maybe there's a proper way to do this, but this way doesn't seem to crash for me.
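As a rough sketch of those two checks (not code from the thread), the helper below refuses to hand back a tinfo_t unless the lookup reported success and the size looks sane. The name load_numbered_type and the api parameter are hypothetical: api stands in for the idaapi module and is passed in so the function can be defined and exercised outside IDA. It assumes api exposes tinfo_t and BADSIZE.

```python
def load_numbered_type(api, ordinal, til=None):
    """Hypothetical helper: return a checked tinfo_t for `ordinal`, or None.

    `api` is expected to be the idaapi module (assumption: it exposes
    tinfo_t and BADSIZE); `til` of None means the local type library.
    """
    tif = api.tinfo_t()
    # The boolean result says whether tif was actually filled in.
    if not tif.get_numbered_type(til, ordinal):
        return None
    # Smoke test from the comment above: a damaged type reports BADSIZE.
    if tif.get_size() == api.BADSIZE:
        return None
    return tif
```

Inside IDA this would be called as load_numbered_type(idaapi, ord_num). One caveat: types with no known size (for example incomplete structs) would presumably also be rejected by the size check, so it may be too strict as a general filter.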
> What you actually need to do to get the target of a typeref is to distinguish whether the type is a named or ordinal
That was actually why I was trying to use just the normal type construction of IDA with the tinfo_t, but yeah, that segfaults. Is there another way, given a tinfo_t, to find out if it is an ordinal type?
Maybe you can check if tinfo_t.get_ordinal returns > 0, and then confirm that it's within bounds of get_ordinal_qty?
> Maybe you can check if tinfo_t.get_ordinal
That gives the ordinal of the typedef type, but not the type it points to. I haven't yet found a way to get a reference to the base type without get_realtype.
Ah, my bad.
Python>db.types.add('fuck', 'typedef int fuck')
fuck
# listing to make sure it was actually added.
Python>db.types.list(index=range(idaapi.get_ordinal_qty(None)-0x5, 0x100))
[82] +0x10 : L--S : $::$41B0E947727B04B281BBEA4B1896A8BE::$4B29161E04CAD4BCDD788B201A5E8E5E : struct {void *_call_addr;int _syscall;unsigned int _arch;}
[83] ? : ?T-- : std::nothrow_t : struct
[84] ? : ?T-- : std::__detail::_Prime_rehash_policy : struct
[85] ? : ?T-- : std::exception : struct
[86] +0x4 : L-I- : fuck : int
Python>db.types.by('fuck')
fuck
Python>type(db.types.by('fuck'))
<class 'ida_typeinf.tinfo_t'>
# this just gets the ordinal of the type that was added
Python>db.types.ordinal('fuck')
0x56
# this is the goal
Python>db.types.get(0x56)
int
## now back to ida-speak
# serialize type out of the type library
Python>ti=idaapi.tinfo_t()
Python>idaapi.get_numbered_type(None, 0x56)
(b'\x07', None, None, None, 0x0)
# deserialize it back into a type (the output from get_numbered_type/get_named_type may have different types than tinfo_t.deserialize)
Python>ti.deserialize(idaapi.get_idati(), b'\x07', None or b'', None or b'')
True
Python>ti
int
So you need to deserialize your tinfo_t from whatever get_numbered_type/get_named_type returns.
(edited) ...and for get_named_type.
Python>idaapi.get_named_type(None, 'fuck', idaapi.NTF_TYPE)
(0x1, b'\x07', None, None, None, 0x0, 0x56)
Python>ti.deserialize(idaapi.get_idati(), b'\x07', b'', b'')
True
Python>ti
int
(edited again)
Using tinfo_t.get_ordinal to get the input for get_numbered_type:
Python>ti = idaapi.tinfo_t()
Python>ti.get_numbered_type(None, 0x56)
True
Python>ti
fuck
Flushing buffers, please wait...ok
Python>ti.get_ordinal()
0x56
Using tinfo_t.get_type_name to get the input for get_named_type:
Python>ti = idaapi.tinfo_t()
Python>ti.get_named_type(None, 'fuck')
True
Python>ti
fuck
Python>ti.get_type_name()
'fuck'
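Pulling the session above together, a minimal sketch of "look up the typedef by ordinal, then deserialize what comes back" might look like this. The name typedef_target and the api parameter are hypothetical: api stands in for the idaapi module so the function can be defined outside IDA. The five-element tuple layout and the None result for a missing ordinal are assumptions based on the output shown in the session, not on documented guarantees.

```python
def typedef_target(api, ordinal):
    """Hypothetical helper: return a tinfo_t for what numbered type `ordinal` is defined as.

    `api` is expected to be the idaapi module. Returns None if the ordinal
    is missing or the serialized bytes cannot be deserialized.
    """
    serialized = api.get_numbered_type(None, ordinal)
    if not serialized:
        return None
    # Assumed layout, matching the session: (type, fields, cmt, fieldcmts, sclass).
    type_bytes, fields, _cmt, fieldcmts, _sclass = serialized
    target = api.tinfo_t()
    # Same fallback as in the session: replace None with empty bytes.
    ok = target.deserialize(api.get_idati(), type_bytes, fields or b'', fieldcmts or b'')
    return target if ok else None
```

Inside IDA, str(typedef_target(idaapi, tif.get_ordinal())) would then be the name being asked for in the original question, following the session where the ordinal 0x56 deserializes to int.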
TODO
Add angr support (DNE!)
Binja Incomplete
Until the problems associated with https://github.com/Vector35/binaryninja-api/issues/4552 are fixed, typedefs that reference primitive types won't work.