immunant / ibresolver

A QEMU TCG plugin for resolving indirect branches.
BSD 3-Clause "New" or "Revised" License

Remove need to pass in indirect callsites #3

Closed ayrtonm closed 2 years ago

ayrtonm commented 3 years ago

It'd be nice to have this tool find indirect callsites automatically instead of requiring a list of callsites to be passed in. The two options are:

  1. Shell out to objdump and grep for indirect branches like in findindirect*.sh. The grepping could probably be done from the plugin in the post-install initialization, but using objdump would require cross-compiling binutils for the target arch, which isn't as user-friendly. I'd also need to expand the ARM regex since it's missing some indirect jumps (e.g. ldr pc, Rn).
  2. In the translate block callback, pattern match each instruction against the target arch's indirect branches. For x86-64 this is reasonable: only unconditional jumps/calls can be indirect, so there are only a few patterns to check. I'm not sure how involved this would be for ARM (i.e. how many patterns we'd have to match against), but insn sizes are limited to 2 or 4 bytes and the instruction encoding manual is much easier to follow than Intel's.
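For x86-64, the pattern matching in option 2 mostly comes down to opcode 0xFF with a ModRM reg field of 2 through 5 (near/far call and jmp through a register or memory operand). Below is a minimal sketch, assuming the bytes start at an instruction boundary. The function name is hypothetical and not part of the plugin, and `ret` is deliberately not counted as an indirect branch here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the plugin): returns true if `bytes` encodes
// an x86-64 indirect call/jmp, i.e. opcode 0xFF with ModRM.reg in 2..5.
bool is_indirect_branch_x86_64(const std::uint8_t *bytes, std::size_t len) {
    assert(bytes != nullptr || len == 0);

    // Skip legacy prefixes: segment overrides (0x3e doubles as notrack),
    // operand/address size overrides, lock, and repne/rep (0xf2 doubles as bnd).
    std::size_t i = 0;
    for (; i < len; ++i) {
        const std::uint8_t b = bytes[i];
        const bool legacy_prefix =
            b == 0x26 || b == 0x2e || b == 0x36 || b == 0x3e ||
            b == 0x64 || b == 0x65 || b == 0x66 || b == 0x67 ||
            b == 0xf0 || b == 0xf2 || b == 0xf3;
        if (!legacy_prefix) break;
    }

    // Skip a single REX prefix (0x40..0x4f), needed for r8..r15 operands.
    if (i < len && (bytes[i] & 0xf0) == 0x40) ++i;

    // We need both the opcode byte and the ModRM byte.
    if (i + 1 >= len) return false;
    if (bytes[i] != 0xff) return false;

    // ModRM.reg selects the operation in the 0xFF group:
    // /2 call near, /3 call far, /4 jmp near, /5 jmp far.
    const unsigned reg = (bytes[i + 1] >> 3) & 7u;
    return reg >= 2 && reg <= 5;
}
```

In the plugin, the byte pointer and length would presumably come from `qemu_plugin_insn_data` and `qemu_plugin_insn_size` inside the translate block callback. An ARM version would need its own set of patterns.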

Fixing this issue also means we don't have to pass in the binary name twice, as explained here, so the plugin would only require one arg for the output file.

ayrtonm commented 3 years ago

A more practical approach suggested by @thedataking is to use Binary Ninja (binja) to detect indirect branches. This should probably be done from the translate block callback like in option 2, but using binja instead of pattern matching the insn bytes. Detecting indirect branches ahead-of-time would have lower overhead, but it would rely on binja's code discovery capabilities, which might miss cases on x86-64 (though it'd probably be ok on ARM). Doing this just-in-time would also be simpler to implement since the interaction with binja would be minimal.

To implement this, we'd pass the output of qemu_plugin_insn_data to binja. We can also use binja directly from C++, like in this example.

ayrtonm commented 3 years ago

Another idea suggested by @travitch is to make an interface for using different backends to check if a given instruction is an indirect branch or not. This might look like this

where init_backend returns a boolean indicating if the architecture is supported. Binja is probably the way to go for now, but it'd be good to avoid integrating too tightly with binja and stick to this simple interface.
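The backend interface described above can be sketched as follows. Only `init_backend` and its boolean return come from the discussion. The `Backend` class, the `is_indirect_branch` method, the `SimpleX64Backend` toy implementation, and the `"x86_64"` architecture string are all assumptions made for illustration. The toy backend only recognizes unprefixed `FF /2` through `FF /5` encodings, and a binja-backed implementation would replace its body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical interface: each backend answers one question per instruction.
class Backend {
public:
    virtual ~Backend() = default;
    // Returns true if this backend supports the given architecture.
    virtual bool init_backend(const std::string &arch) = 0;
    // Returns true if the instruction bytes encode an indirect branch.
    virtual bool is_indirect_branch(const std::uint8_t *insn_data,
                                    std::size_t insn_size) = 0;
};

// Toy backend for illustration only: matches unprefixed x86-64
// `call`/`jmp` through a register or memory operand (opcode 0xFF, reg 2..5).
class SimpleX64Backend : public Backend {
public:
    bool init_backend(const std::string &arch) override {
        // Assumed architecture name; the real plugin may use a different string.
        supported_ = (arch == "x86_64");
        return supported_;
    }

    bool is_indirect_branch(const std::uint8_t *insn_data,
                            std::size_t insn_size) override {
        assert(supported_ && "init_backend must succeed before use");
        if (insn_data == nullptr || insn_size < 2) return false;
        if (insn_data[0] != 0xff) return false;
        const unsigned reg = (insn_data[1] >> 3) & 7u;
        return reg >= 2 && reg <= 5;
    }

private:
    bool supported_ = false;
};
```

With an interface like this, the plugin would call `init_backend` once at install time and bail out if it returns false. The translate block callback would then only depend on `is_indirect_branch`, not on any particular backend.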