williballenthin / python-idb

Pure Python parser and analyzer for IDA Pro database files (.idb).
Apache License 2.0
455 stars, 73 forks

Coding style, transparency to idapython plugins and redundant code #65

Open invano opened 5 years ago

invano commented 5 years ago

Hi,

I've always used python-idb for quick tasks, and it works great. However, I'd like to run some IDAPython scripts without the bottleneck of opening and closing IDA. I'm implementing some missing idaapi and idc routines, but I need a few clarifications before I open a pull request.

The operand-type constants from op_t that the ida_ua module exposes, such as o_near, o_imm and o_reg, can also be accessed through both idaapi and idc. In other words, a hypothetical class ida_ua would define o_imm = 5, and class idaapi would then set o_imm = self.api.ida_ua.o_imm. The same goes for class idc.
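A minimal sketch of that layout, to make the redundancy concrete. The class shapes and the `Api` container are assumptions for illustration, not python-idb's actual code; the constant values follow the operand types in the IDA SDK.

```python
class ida_ua:
    """Hypothetical 'most specific' module: the constants are declared here."""

    o_reg = 1
    o_imm = 5
    o_near = 7

    def __init__(self, db, api):
        self.db = db
        self.api = api


class idaapi:
    """Hypothetical umbrella module: re-exports instead of redefining."""

    def __init__(self, db, api):
        self.db = db
        self.api = api
        self.o_imm = self.api.ida_ua.o_imm


class idc:
    """Same re-export pattern, repeated a second time."""

    def __init__(self, db, api):
        self.db = db
        self.api = api
        self.o_imm = self.api.ida_ua.o_imm


class Api:
    """Hypothetical container that wires the modules together."""

    def __init__(self, db=None):
        # ida_ua must exist before the modules that re-export from it.
        self.ida_ua = ida_ua(db, self)
        self.idaapi = idaapi(db, self)
        self.idc = idc(db, self)


api = Api()
print(api.ida_ua.o_imm, api.idaapi.o_imm, api.idc.o_imm)
```

Every constant that should be visible in three places needs two extra assignments like the ones above, which is where the redundancy comes from.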

This is just one example of a more general question: what is the coding rule for this kind of situation? If we mirror everything from IDAPython almost 1:1, we get full transparency for IDAPython users and scripts at the price of redundant code. On the other hand, if we want the code to be a bit more polished, we might break compatibility with existing IDAPython scripts lying around.

invano commented 5 years ago

Well, I'm slowly realising that the introduction of the modern IDAPython modules has taken all this redundancy one level further. Maybe it's better to stick with the modern interface and expect IDAPython scripts to get "modernised" as well.

williballenthin commented 5 years ago

hey @invano

yeah, you've highlighted a dark corner of this library that i've avoided thinking about too much :-) i often find myself cringing while implementing the IDAPython API, because there are things i'd like to change, but can't for API compatibility (e.g. returning None vs raising exceptions, etc).

in any case, i think we should generally try to stick to "modern" IDAPython scripting, and primarily place constants within the "most specific module" (e.g. avoid declaring them in idc). i'm ok with re-importing into idc and friends for compatibility.

the existing code probably doesn't do this too well - sorry about that!

side note: crazy idea: we should be able to programmatically enumerate all constants across all IDAPython modules and use that to generate code. maybe this is better than manually importing constants as necessary.
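a rough sketch of what that enumeration could look like. the helper names (`collect_int_constants`, `render_constants`) and the stand-in module are made up for illustration; in practice this would have to run inside IDA against the real modules.

```python
import types


def collect_int_constants(module, prefix=""):
    """Return {name: value} for public integer attributes of `module`."""
    found = {}
    for name, value in vars(module).items():
        if name.startswith("_") or not name.startswith(prefix):
            continue
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            found[name] = value
    return found


def render_constants(module_name, constants):
    """Emit Python source lines declaring the collected constants."""
    lines = ["# generated from %s" % module_name]
    for name in sorted(constants):
        lines.append("%s = %d" % (name, constants[name]))
    return "\n".join(lines)


# Stand-in for a real IDAPython module (assumption: no IDA available here).
fake_ida_ua = types.ModuleType("ida_ua")
fake_ida_ua.o_reg = 1
fake_ida_ua.o_imm = 5
fake_ida_ua.print_operand = lambda ea, n: ""  # functions are skipped

generated = render_constants("ida_ua", collect_int_constants(fake_ida_ua, "o_"))
print(generated)
```

the generated text could be checked into the repository, so the constants stay in the "most specific module" and nothing has to be imported by hand.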

invano commented 5 years ago

The same holds for methods and so on. I'm digging a bit inside IDAPython, and it seems each module gets its own constants/methods; then there is idaapi, mostly acting as a huge wrapper around the inner modules, and finally idc, randomly wrapping stuff coming from idaapi again...

Consider get_item_end(ea) as a quick example, defined in ida_bytes but also present in idaapi and idc:

Python>print idaapi.__dict__["get_item_end"]
<function get_item_end at 0x10c685c08>
Python>print ida_bytes.__dict__["get_item_end"]
<function get_item_end at 0x10c685c08>
Python>print idc.__dict__["get_item_end"]
<function get_item_end at 0x10e11cde8>
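
The matching addresses suggest that idaapi re-exports the very same function object as ida_bytes, while idc holds a separate one. A small sketch of how that check could be automated; the `fake_*` modules below are stand-ins built for illustration, not the real IDAPython modules.

```python
import types


def classify_shared_names(origin, other):
    """Split names found in both modules into re-exports and distinct objects."""
    reexported, distinct = [], []
    for name, obj in vars(origin).items():
        if name.startswith("_") or name not in vars(other):
            continue
        if vars(other)[name] is obj:
            reexported.append(name)  # same object, just imported again
        else:
            distinct.append(name)  # same name, different implementation
    return sorted(reexported), sorted(distinct)


def get_item_end(ea):
    # Placeholder body; the real function consults the database.
    return ea + 1


fake_ida_bytes = types.ModuleType("ida_bytes")
fake_ida_bytes.get_item_end = get_item_end

fake_idaapi = types.ModuleType("idaapi")
fake_idaapi.get_item_end = fake_ida_bytes.get_item_end  # plain re-export

fake_idc = types.ModuleType("idc")
fake_idc.get_item_end = lambda ea: fake_ida_bytes.get_item_end(ea)  # wrapper

print(classify_shared_names(fake_ida_bytes, fake_idaapi))
print(classify_shared_names(fake_ida_bytes, fake_idc))
```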

I agree that manually importing everything would be a mess. Your crazy idea actually makes sense to me!

About the modern/ancient API and code, there is also the API transition to IDA >= 7.0 to consider (https://www.hex-rays.com/products/ida/7.0/docs/api70_porting_guide.shtml). How do you usually handle this? For example, I implemented ida_ua.print_operand() and ida_ua.get_operand_type(), which were previously called idc.GetOpnd() and idc.GetOpType() respectively and are still exposed by idc for compatibility. Similarly, idc.find_func_end() is also exposed as idc.FindFuncEnd().
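
One possible way to keep the old names without duplicating bodies is a thin alias helper. This is only a sketch: `legacy_alias` is a hypothetical helper, the method body is a placeholder, and emitting a DeprecationWarning is a design choice of this example rather than something IDAPython is known to do.

```python
import warnings


def legacy_alias(old_name, new_name):
    """Build a method that forwards an old-style name to the modern one."""

    def wrapper(self, *args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s" % (old_name, new_name),
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self, new_name)(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


class idc:
    def find_func_end(self, ea):
        # Placeholder: a real implementation would query the database.
        return ea + 0x10

    # Pre-7.0 spelling kept for scripts that have not been ported yet.
    FindFuncEnd = legacy_alias("FindFuncEnd", "find_func_end")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = idc().FindFuncEnd(0x1000)

print(hex(result), len(caught))
```

The modern name stays the single implementation, and the list of aliases could itself be generated from the porting guide's rename table.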

XVilka commented 4 years ago

@invano I saw your code in https://github.com/williballenthin/python-idb/compare/master...bjchan9an:master. Maybe it's ready for a pull request?