A simple loader which allows us to grab small portions of a binary to load and patch, without having BAP do the heavy lifting of processing the whole binary.
This should also allow us to load "raw" binaries without symbol table information, but that is untested for now.
A rough overview of the patch
We create and register a new loader for BAP, which mostly involves creating an OGRE file which gives ELF-like data about the binary, enabling BAP to carry out the disassembly phase and produce IR.
We mostly copy how it's done in the raw plugin, implemented here: https://github.com/BinaryAnalysisPlatform/bap/blob/master/plugins/raw/raw_main.ml
The main difference is that we get our data from the config.json file rather than from command-line input. The input mainly specifies which bytes to disassemble: an architecture, a base address, an offset and a length.
It's worth noting that even small mistakes in the input parameters can give confusing errors, e.g. missing function symbols (resulting in "function not found" errors), or bogus IR if, say, the specified length is too small.
We haven't yet added a way to specify function symbols, though this should be relatively easy, and is possible through other means.
The loader is only registered if the options are present in the config file, and is only invoked if it is registered.
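Since small input mistakes lead to confusing errors downstream, it may help to sanity-check the parameters before they reach BAP. The following is only an illustrative sketch in Python, not the code in this patch: the `"loader"` key and the field names `arch`, `base`, `offset` and `length` are hypothetical, chosen to mirror the four inputs described above. It also mirrors the registration rule: when the options are absent it returns `None`, which is the case where nothing would be registered.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoaderRegion:
    arch: str    # architecture name, passed through unchecked
    base: int    # address at which the bytes are mapped
    offset: int  # file offset of the first byte to disassemble
    length: int  # number of bytes to disassemble


def parse_loader_region(config_text: str, file_size: int) -> Optional[LoaderRegion]:
    """Return the region described by the config, or None if it has none.

    The "loader" key and its field names are hypothetical.
    """
    config = json.loads(config_text)
    options = config.get("loader")
    if options is None:
        # No loader options: the custom loader would not be registered.
        return None

    missing = [k for k in ("arch", "base", "offset", "length") if k not in options]
    if missing:
        raise ValueError(f"missing loader fields: {', '.join(missing)}")

    region = LoaderRegion(
        arch=str(options["arch"]),
        base=int(options["base"]),
        offset=int(options["offset"]),
        length=int(options["length"]),
    )
    if region.base < 0 or region.offset < 0:
        raise ValueError("base and offset must be non-negative")
    if region.length <= 0:
        raise ValueError("length must be positive")
    if region.offset + region.length > file_size:
        raise ValueError("region extends past the end of the file")
    return region
```

A check like this cannot detect a length that is valid but too small to cover the code of interest, so the bogus-IR case mentioned above would still need to be diagnosed by hand.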
This patch should allow for a dramatic speedup when the binary is large but the patch "area" is small. Currently we only allow a single region, which must cover every patch point plus the length of its patch (this should be fixable as well).
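To make the single-region constraint concrete, here is a small Python sketch of how one might compute the smallest region that covers a set of patches. It is an illustration under assumptions, not part of this patch: the function name `covering_region` and the representation of a patch as an `(address, size)` pair are both hypothetical.

```python
from typing import Iterable, Tuple


def covering_region(patches: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (base, length) of the smallest region containing every patch.

    Each patch is a hypothetical (address, size_in_bytes) pair.
    """
    patches = list(patches)
    if not patches:
        raise ValueError("at least one patch point is required")
    start = min(addr for addr, _ in patches)
    end = max(addr + size for addr, size in patches)
    return start, end - start


# Two patches: 8 bytes at 0x1000 and 4 bytes at 0x1040.
base, length = covering_region([(0x1000, 8), (0x1040, 4)])
print(hex(base), length)  # 0x1000 68
```

In practice the region would probably need to be wider than this minimum, for example to start and end on instruction or function boundaries, and the file offset would have to be chosen to match the resulting base address.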