binsync / libbs

A library for writing plugins in any decompiler: includes API lifting, common data formatting, and GUI abstraction!
BSD 2-Clause "Simplified" License

Types setting/getting/callbacks no longer work in IDA 9 #62

Open mahaloz opened 4 months ago

mahaloz commented 4 months ago

In version 9 of IDA Pro, they changed how structs and enums work in the backend. All of them now use a centralized system called "local types." They also literally removed ida_struct as an importable module.

This is a problem for three reasons:

  1. LocalTypes have always existed in the hooks, but we don't hook them. Now we have to.
  2. How we previously got structs and enums is broken. The old APIs return $324235235..., which will get dropped by BinSync.
  3. Backwards compatibility is no longer possible in a sane way for versions below 9. Once we switch to 9, we need to drop support for all lesser versions.
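One way to cope with a module that exists in one IDA version and not another is to probe for it before importing. This is only a minimal sketch, not what libbs actually does; `module_available` and `HAS_LEGACY_STRUCT_API` are hypothetical names introduced here for illustration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be imported in this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken parent packages or odd sys.modules entries
        return False


# Expected to be True inside IDA 8.x; per this issue, ida_struct is gone in IDA 9.
# Outside IDA it is simply False.
HAS_LEGACY_STRUCT_API = module_available("ida_struct")
```

Code that needs both code paths could branch on the flag once at import time instead of scattering try/except blocks around every struct access.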

This has been partially addressed in #101, which at least adds support for listing structs in IDA 9.

EDIT: updated issue to be targeted at IDA 9.

arizvisa commented 1 month ago

Something to keep in mind (for a timeline) is that the old API is supposed to be deprecated in the next major version according to support@. They also plan on updating the frame API to use local types too, if I'm understanding them correctly. Somewhat enraging...

mahaloz commented 1 month ago

Oh I am well aware... @arizvisa lol. Yeah it's not ideal.

> They also plan on updating the frame API to use local types too

Wait, like stack variables?

arizvisa commented 1 month ago

> Wait, like stack variables?

Yessir.

(quoted directly from email)

We're planning to remove the legacy API (including for the stack frame, which is the last holdout) and transition to the tinfo_t based API in the next major release of IDA. Please try it and let us know if there are any use scenarios we've missed (I'm afraid I didn't quite get your long explanation... )

I've been incrementally tracking all the things that'll be broken in my own project at https://github.com/arizvisa/ida-minsc/discussions/158#discussioncomment-10091588 as a result of it. Literally all the interval and structure/union/frame arithmetic (things that I use every day for laying out structs and frames), slice assignments, struct and member xrefs (recently refactored), and serialization/deserialization between databases (should be using libbs for this, tho) will likely remain broken in my plugin until some point after the next major version.

It is what it is, though. Just kind of praying the hooks will be remotely similar to the legacy hooks so that I can still track things without being too crazy. It'd just suck to have to process the entirety of a tinfo_t struct/union/enum every time one of its member names/types is changed or commented. Heh.
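For illustration, the fallback being dreaded here (re-reading a whole type and working out what changed) amounts to diffing two snapshots of its members. A rough sketch in plain Python, assuming the members have already been extracted into a `{name: (offset, type_string)}` dict; that snapshot format and the name `diff_members` are hypothetical, not part of any IDA API.

```python
def diff_members(old: dict, new: dict) -> dict:
    """Compare two {member_name: (offset, type_string)} snapshots of one type."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # Present in both snapshots, but the offset or the type differs
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return {"added": added, "removed": removed, "changed": changed}


before = {"size": (0, "int"), "buf": (4, "char *")}
after = {"size": (0, "unsigned int"), "buf": (4, "char *"), "next": (8, "void *")}
result = diff_members(before, after)
```

Note that a rename shows up as one removal plus one addition in this scheme, which is part of why event-level hooks are preferable.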

mahaloz commented 1 month ago

Great... welp. Looks like binsync is going to be broken during that time period as well until I port everything. It is already not working perfectly anymore in 8.4, so I'll likely have to do all these changes before 9.0 drops. However, it also sucks since it means older versions of IDA will be left behind.

arizvisa commented 1 month ago

"Creation — that is the great salvation from suffering, and life's alleviation. But for the creator to appear, suffering itself is needed, and much change."

arizvisa commented 1 month ago

(from leaked 9.0 beta sdk)

new frame events:

+    frame_created,          ///< A function frame has been created.
+                            ///< \param func_ea (::ea_t)
+                            ///< \ref idb_event::frame_deleted
+
+    frame_udm_created,      ///< Frame member has been added.
+                            ///< \param func_ea (::ea_t)
+                            ///< \param udm     (::const udm_t *)
+
+    frame_udm_deleted,      ///< Frame member has been deleted.
+                            ///< \param func_ea (::ea_t)
+                            ///< \param udm_tid (::tid_t)
+                            ///< \param udm     (::const udm_t *)
+
+    frame_udm_renamed,      ///< Frame member has been renamed.
+                            ///< \param func_ea (::ea_t)
+                            ///< \param udm     (::const udm_t *)
+                            ///< \param oldname (::const char *)
+
+    frame_udm_changed,      ///< Frame member has been changed.
+                            ///< \param func_ea (::ea_t)
+                            ///< \param udm_tid (::tid_t)
+                            ///< \param udmold  (::const udm_t *)
+                            ///< \param udmnew  (::const udm_t *)
+
+    frame_expanded,         ///< A frame type has been expanded/shrank.
+                            ///< \param func_ea (::ea_t)
+                            ///< \param udm_tid (::tid_t) the gap was added/removed before this member
+                            ///< \param delta (::adiff_t) number of added/removed bytes

new type events:

+    lt_udm_created,         ///< local type udt member has been added
+                            ///< \param udtname (::const char *)
+                            ///< \param udm     (::const udm_t *)
+                            ///< \note udm_t::offset may not be calculated yet except of the fixed udt
+
+    lt_udm_deleted,         ///< local type udt member has been deleted
+                            ///< \param udtname (::const char *)
+                            ///< \param udm_tid (::tid_t)
+                            ///< \param udm     (::const udm_t *)
+
+    lt_udm_renamed,         ///< local type udt member has been renamed
+                            ///< \param udtname (::const char *)
+                            ///< \param udm     (::const udm_t *)
+                            ///< \param oldname (::const char *)
+
+    lt_udm_changed,         ///< local type udt member has been changed
+                            ///< \param udtname (::const char *)
+                            ///< \param udm_tid (::tid_t)
+                            ///< \param udmold  (::const udm_t *)
+                            ///< \param udmnew  (::const udm_t *)
+                            ///< \note udm_t::offset may not be calculated yet except of the fixed udt
+
+    lt_udt_expanded,        ///< A structure type has been expanded/shrank.
+                            ///< \param udtname (::const char *)
+                            ///< \param udm_tid (::tid_t) the gap was added/removed before this member
+                            ///< \param delta (::adiff_t) number of added/removed bytes
+

All former structure and enum event types have also been deleted.
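A rough sketch of what consuming the new local type events could look like from Python. It assumes that IDAPython exposes these events as `ida_idp.IDB_Hooks` methods with the same names and parameter order as the SDK comments above; that is an assumption to check against the actual 9.0 bindings. `LocalTypeHooks` and its `changes` list are hypothetical names, and the stub base class exists only so the sketch can be exercised outside IDA.

```python
from types import SimpleNamespace

try:
    import ida_idp  # only importable inside IDA
    _HooksBase = ida_idp.IDB_Hooks
except ImportError:
    class _HooksBase:
        """Stand-in base class so the sketch runs outside IDA."""

        def hook(self):
            return True

        def unhook(self):
            return True


class LocalTypeHooks(_HooksBase):
    """Record member-level changes to local types (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.changes = []

    # Parameter order follows the SDK comments quoted above (assumed to carry over to Python).
    def lt_udm_created(self, udtname, udm):
        self.changes.append(("created", udtname, udm.name))

    def lt_udm_deleted(self, udtname, udm_tid, udm):
        self.changes.append(("deleted", udtname, udm.name))

    def lt_udm_renamed(self, udtname, udm, oldname):
        self.changes.append(("renamed", udtname, oldname, udm.name))

    def lt_udm_changed(self, udtname, udm_tid, udmold, udmnew):
        self.changes.append(("changed", udtname, udmnew.name))


# Outside IDA, a fake udm object is enough to drive a handler.
hooks = LocalTypeHooks()
hooks.lt_udm_renamed("my_struct", SimpleNamespace(name="length"), "field_4")
```

Inside IDA one would call `hooks.hook()` to start receiving events. The frame events listed above have the same shape, with `func_ea` in place of `udtname`.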

mahaloz commented 2 weeks ago

@arizvisa I got into more of the internals today, and my god, this change is brutal in 9. Not only did they kill all the old events as they did in 8.4, but they completely deleted ida_struct... the way we set and get structs is just gone.

This will be a time-consuming refactor for 9.0 support.

mahaloz commented 1 week ago

More info: https://docs.hex-rays.com/pre-release/developer-guide/idapython/idapython-porting-guide-ida-9. Looks like we can keep a lot of the same functionality... hopefully

mahaloz commented 3 days ago

@arizvisa I started doing some porting this weekend, and man, wow, the freaking change of frames for functions is pretty fucked up. I need to rewrite all the stackvar related code I have :(

arizvisa commented 2 days ago

> ...and man, wow, the freaking change of frames for functions is pretty fucked up.

Yeah, that was my biggest concern :-/. I use my plugin for doing all sorts of structure arithmetic, like finding fields in overlapped frames or laying out structures/frames contiguously. (Sometimes using op_stroff to notate an operand interacting with a frame variable for an entirely different function). These capabilities were literally why I started my plugin like a decade ago... structure/frame layout is pretty damned important imo...

I also use tags to describe how I want the debugger to render structures/frames. This is done by maintaining an index using the "address/id" to track modifications to things in the database. I'm just hoping that in IDA9, I can still assume that _every_ frame/structure/union member will still always have a "tid" so that I can continue to store metadata for things being notated in the idb. However, the general idea of tinfo_t properties being optional, combined with my pessimism (and tinfo_t being somewhat flakey), leads me to believe that everything having an ID isn't something I can rely on anymore.
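To make the concern concrete: if a member might not have a tid, a metadata index needs some other key to fall back on. A hedged sketch in plain Python, not how ida-minsc stores tags; `MemberTags` is a hypothetical class, and `owner` stands for whatever identifies the containing type (a struct name or a function address).

```python
class MemberTags:
    """Store tags per member, keyed by tid when known, else by (owner, offset)."""

    def __init__(self):
        self._by_tid = {}
        self._by_location = {}

    def set(self, owner, offset, key, value, tid=None):
        if tid is not None:
            slot = self._by_tid.setdefault(tid, {})
        else:
            # Fallback key: breaks if the member is later moved or its owner renamed
            slot = self._by_location.setdefault((owner, offset), {})
        slot[key] = value

    def get(self, owner, offset, key, tid=None, default=None):
        if tid is not None and tid in self._by_tid:
            return self._by_tid[tid].get(key, default)
        return self._by_location.get((owner, offset), {}).get(key, default)


tags = MemberTags()
tags.set("my_struct", 8, "render", "hex", tid=0x1234)  # member with an ID
tags.set("my_struct", 12, "render", "decimal")         # member without one
```

The location-keyed entries are exactly the fragile part: they have to be rewritten whenever a member moves, which a stable tid would avoid.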

> I need to rewrite all the stackvar related code I have :(

Heh... Thinking about that, the way I'm handling structure alignment will need to be revisited too, since tinfo_t has alignment calculated per-field... So, translating a frame member offset to determine the default member name (var_XX, arg_XX) consistently between the disassembler's stackvars and Hex-Rays' lvar_t will likely remain damaged. Really, all the Hexrays-specific stuff that I had planned on getting off my plate this year will need to be refactored... if I can find the motivation again.
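As a rough illustration of the offset-to-name translation mentioned here, the sketch below assumes the simple legacy frame layout of locals, then saved registers, then return address, then arguments. It is a simplification that ignores alignment and architecture quirks; `default_frame_member_name` and its parameters are hypothetical names, not an IDA API.

```python
def default_frame_member_name(offset, locals_size, saved_regs_size, retaddr_size):
    """Guess a var_XX/arg_XX name for a member at `offset` from the frame start."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    args_start = locals_size + saved_regs_size + retaddr_size
    if offset < locals_size:
        # Locals are named by their distance below the saved registers
        return "var_%X" % (locals_size - offset)
    if offset >= args_start:
        # Arguments are named by their distance past the return address
        return "arg_%X" % (offset - args_start)
    # Saved registers and the return address have no var_/arg_ name
    return None


# 0x10 bytes of locals, 4 bytes of saved registers, 4-byte return address
local_name = default_frame_member_name(0xC, 0x10, 4, 4)
first_arg_name = default_frame_member_name(0x18, 0x10, 4, 4)
```

With per-field alignment in tinfo_t, the `offset` fed into a function like this is the part that can no longer be taken for granted.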

Anyways, the other major qualm I had was the idea of the FF_BYTE, FF_WORD, FF_DWORD, etc. flags being removed from types. To avoid a user having to specify the correct flag combination to apply to a field, I map pythonic things like (int, 2) to FF_WORD, [(float,8), 0x10] to an array, or (str, 2, 2) to STRLYT_PASCAL2|STRWIDTH_2B for all of the native disassembler types. This simplifies translating an IDA structure/frame/union into something like ctypes/Construct/etc.

Now that these flags are gone for structures/frames/unions, I'll need to come up with a way to map these semantics to tinfo_t. This'll probably consist of looking up the correct compiler type any time one of these pythonic types is being applied to a field. But if they're removing these flags from tinfo_t, who's to say that they don't just do the same for all flags throughout the database (eventually)?
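One possible shape for that lookup is to translate the pythonic spec into a C type string, which could then be handed to whatever type parser is available. This is only a sketch under that assumption: the table and `to_c_type` are hypothetical, they cover signed integers, floats, and one-level arrays, and string layouts such as `(str, 2, 2)` are deliberately left out.

```python
_SCALARS = {
    (int, 1): "__int8",
    (int, 2): "__int16",
    (int, 4): "__int32",
    (int, 8): "__int64",
    (float, 4): "float",
    (float, 8): "double",
}


def to_c_type(spec):
    """Translate (type, size) or [(type, size), count] into a C type string."""
    if isinstance(spec, list):
        element, count = spec
        # Only one level of array nesting is handled in this sketch
        return "%s[%d]" % (to_c_type(tuple(element)), count)
    try:
        return _SCALARS[spec]
    except KeyError:
        raise ValueError("no mapping for %r" % (spec,)) from None


word_type = to_c_type((int, 2))
array_type = to_c_type([(float, 8), 0x10])
```

Anything the table does not know about raises `ValueError`, so unsupported specs fail loudly instead of silently picking a wrong type.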

Thank you for fighting with these changes first btw. I'm super grateful for that.