Open arizvisa opened 6 years ago
This is actively being worked on by deprecating the `interface.reftype_t` type in favor of a better implementation, `interface.access_t`, as part of #158. The new type allows one to interact with the access type of a reference, and makes the attribute mutable so that accesses can be used in various places and combined. From this, you'll be able to identify specifically what an operand is doing, which can then be combined with information from references or even Hex-Rays (that second one might be a pipe dream though; it's possible, but I haven't found a use for something like that yet).
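To illustrate what "combining" access types could look like, here is a minimal sketch. It does not use the real `interface.access_t`; the `Access` class, its member names, and the `describes` helper are hypothetical stand-ins built on Python's standard `enum.Flag`, and they only model the combining and testing of accesses, not the mutability.

```python
import enum

class Access(enum.Flag):
    """Hypothetical stand-in for the idea behind an access type."""
    READ = enum.auto()
    WRITE = enum.auto()
    EXECUTE = enum.auto()

# Accesses can be combined into a single value with the `|` operator.
read_write = Access.READ | Access.WRITE

def describes(access, wanted):
    """Return True if every flag in `wanted` is also present in `access`."""
    return (access & wanted) == wanted

print(describes(read_write, Access.WRITE))
print(describes(read_write, Access.EXECUTE))
```

With something of this shape, checking what an operand is doing becomes a membership test against the combined value rather than a comparison against a fixed set of reference types.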
The "persistence-refactor" branch is currently using `interface.access_t` to track both the modifications of an operand and any of the references. For operands, this is done by returning an `interface.opref_t`, which includes the access as the third element of its tuple. Similarly, this is done with `interface.ref_t` by using its second parameter to store the access.
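The two layouts described above can be sketched with plain named tuples. Only the position of the access (third for the operand reference, second for the reference) comes from the description; the field names, the use of strings such as `'r'` and `'w'` for the access, and the `access_of` helper are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical layouts: only the position of the access is taken from the text.
opref_t = namedtuple('opref_t', ['address', 'opnum', 'access'])
ref_t = namedtuple('ref_t', ['address', 'access'])

def access_of(item):
    """Return the access of either tuple type, wherever it happens to be stored."""
    return item.access

operand = opref_t(0x401000, 1, 'w')   # operand 1 of the instruction is written to
reference = ref_t(0x401000, 'r')      # the address is referenced by a read

print(operand[2], reference[1])
print(access_of(operand), access_of(reference))
```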
Both of the tuple types can be treated as a container of `interface.access_t`, which allows container operations to be used on them. Integer operations can then be used to act on whatever address is being contained. This allows both references and operands to be treated the same way (with the same functions), and lets you use a function if you want to "dereference" them.
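As a rough sketch of that idea (not the actual implementation), the classes below make membership tests act on the access and integer operations act on the address. The class names `accessed`, `ref`, and `opref`, the access characters, and the `writers` helper are all hypothetical.

```python
class accessed(object):
    """Hypothetical base: an address paired with a set of access characters."""
    def __init__(self, address, access):
        self.address = address
        self.access = frozenset(access)

    def __contains__(self, flag):
        # Container operations act on the access.
        return flag in self.access

    def __int__(self):
        # Converting to an integer "dereferences" the item to its address.
        return self.address

    def __add__(self, offset):
        # Integer arithmetic acts on the contained address.
        return self.address + offset

class ref(accessed):
    """Hypothetical reference: an address and its access."""

class opref(accessed):
    """Hypothetical operand reference: also remembers the operand number."""
    def __init__(self, address, access, opnum=0):
        super().__init__(address, access)
        self.opnum = opnum

def writers(items):
    """Return the addresses of every item that includes a write."""
    return [int(item) for item in items if 'w' in item]

items = [ref(0x1000, 'r'), opref(0x2000, 'rw', opnum=1)]
print([hex(ea) for ea in writers(items)])
```

Because both classes share the same protocol, `writers` does not need to know whether it was handed references, operands, or a mix of the two.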
Currently, the `database.address.nextreg` and `database.address.prevreg` functions still retain their original functionality. However, a couple more functions have been added to the `instruction` module which allow you to filter all of an instruction's operands using either their type or access. I find this a lot easier to use in one-liners, since filters can be used for checking membership. Currently this is being used to distinguish an immediate branch from a loaded branch, operand writes from operand stores, and operand assignments from operand loads.
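The kind of filtering described above might look roughly like the following. This is a self-contained model rather than the functions added to the `instruction` module: the `operand` tuple, the `ops_matching` and `is_loaded_branch` names, and the type and access strings are all assumptions.

```python
from collections import namedtuple

# Hypothetical operand description: its number, its type, and its access.
operand = namedtuple('operand', ['opnum', 'optype', 'access'])

def ops_matching(operands, optype=None, access=None):
    """Return the operand numbers that match the requested type and/or access."""
    result = []
    for op in operands:
        if optype is not None and op.optype != optype:
            continue
        if access is not None and access not in op.access:
            continue
        result.append(op.opnum)
    return result

def is_loaded_branch(operands):
    """True if the branch target is executed after being loaded from memory."""
    return bool(ops_matching(operands, optype='memory', access='x'))

# Modeled after `call qword ptr [rax+8]`: the target is read from memory.
loaded = [operand(0, 'memory', 'rx')]
# Modeled after `call 0x401000`: the target is an immediate.
immediate = [operand(0, 'immediate', 'x')]

print(is_loaded_branch(loaded), is_loaded_branch(immediate))
```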
Currently the semantics of register matching are based on what IDA thinks an operand is doing, due to register matching's usage of `instruction.op_state`. In IDA, an operand is either read from, written to, or both. And so register matching (`database.address.nextreg`, `database.address.prevreg`, `function.chunk.register`, `function.block.register`, etc.) is based on what IDA thinks the operand is doing.

Unfortunately this is wrong, because things such as operand "phrases" are not actually writing to their registers. The operand as a whole is written to, but the registers themselves are actually "read from". A hack for this was originally in place as the `instruction.ir` namespace, but this was deprecated and eventually removed because it was a terrible Intel-only hack.

The `regmatch` helper in `internal.interface` should be re-implemented so that it properly identifies whether a register is actually being read from or modified in some way. This means that phrases (or the symbols within the phrase, really) need to be checked to see if they're referencing a register, and if so then the `regmatch` helper should terminate, return true, or whatever it does.

Once this is fixed, `database.address.nextreg` and `database.address.prevreg` can probably be modified to terminate when trying to locate an instruction that reads from a register which was overwritten by a prior instruction (the register has gone out of scope). Unfortunately, without being 100% certain that you're in a function and have a flow chart of how the code is to be executed, there isn't a reliable way to figure this out. Building the control flow graph on each call is obviously out of the question, and caching it is kind of extreme. It'd be nice if we could do this properly without being in a function.

Maybe a better way to determine a register value's scope (rather than changing the semantics of these two functions) would be to expose a general combinator that a user can pass as a predicate. I think I have one in a database somewhere that does this already, but it would need a good intuitive name and would then need to be tested properly, as it's such a weird thing to do when you don't have a flow chart for what it is you're matching for.
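To make the scope idea concrete, here is a minimal sketch of such a predicate combinator. It is not the combinator mentioned above and does not touch the real `regmatch` helper: the `insn` tuple, the `in_scope` name, and the hand-written listing are hypothetical. It assumes straight-line code with no control flow, and that the registers inside a phrase have already been classified as reads.

```python
from collections import namedtuple

# Hypothetical instruction model: the registers it reads and the ones it writes.
# Registers inside a phrase such as [rax+rcx*4] count as reads, even when the
# operand as a whole is being written to.
insn = namedtuple('insn', ['address', 'reads', 'writes'])

def in_scope(register):
    """Build a predicate matching reads of `register` until it is overwritten."""
    state = {'alive': True}

    def predicate(item):
        if not state['alive']:
            return False
        matched = register in item.reads
        # Any write ends the scope for every instruction that follows.
        if register in item.writes:
            state['alive'] = False
        return matched

    return predicate

listing = [
    insn(0x10, {'rbx'}, {'rax'}),                # mov rax, [rbx]
    insn(0x13, {'rax', 'rcx', 'edx'}, set()),    # mov [rax+rcx*4], edx
    insn(0x16, set(), {'rax'}),                  # mov rax, 0
    insn(0x1b, {'rax', 'rcx'}, {'rcx'}),         # add rcx, rax
]

# Start after the instruction that assigned rax and look for its uses.
uses_rax = in_scope('rax')
found = [item.address for item in listing[1:] if uses_rax(item)]
print([hex(ea) for ea in found])
```

The store at 0x13 is reported because `rax` is only read while computing the address, and the read at 0x1b is skipped because `rax` was overwritten at 0x16. Since the predicate is stateful, a fresh one has to be built for each walk, and without a flow chart it cannot account for branches.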