BinaryAnalysisPlatform / bap

Binary Analysis Platform
MIT License

Adds the `libraries` parameter to the `disassemble` command #1547

Closed: bmourad01 closed this 2 years ago

bmourad01 commented 2 years ago

It may be useful to link two BIR programs together, especially when running dynamically linked binaries with Primus. In particular, this makes it possible to execute external library code without writing Primus Lisp stubs when the exact behavior of that code is relevant to the analysis task at hand.

The `libraries` parameter allows the user to load any libraries that may be dynamically linked at runtime. Each of these binaries is loaded as a `Theory.Unit.t` object into the Knowledge Base, disassembled, and then linked with the others into a final program term. We can then reuse the functionality of the stub-resolver to redirect stubs to their library implementations.
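The linking and stub-redirection steps described above can be modeled with a small, self-contained sketch. This is a toy in plain Python and not BAP's API: the `link` function, the dictionary layout, and the `is_stub`/`calls` fields are all hypothetical, chosen only to show the idea of replacing a stub in the main program with a same-named implementation from a library.

```python
# Toy model only: every name and the data layout here are hypothetical,
# not part of BAP. A "program" maps a subroutine name to a record with
# 'is_stub' (bool) and 'calls' (list of callee names).

def link(main, libs):
    """Merge library subroutines into the main program.

    A stub in the linked program is replaced by the first non-stub
    subroutine with the same name found in a library. Returns the linked
    program and a map from each redirected stub to the providing library.
    """
    linked = {name: dict(sub) for name, sub in main.items()}
    redirects = {}
    for lib_name, lib in libs.items():
        for name, sub in lib.items():
            existing = linked.get(name)
            if existing is None:
                # New symbol (possibly itself an unresolved stub): carry it over.
                linked[name] = dict(sub)
            elif existing["is_stub"] and not sub["is_stub"]:
                # Redirect the stub to the library implementation.
                linked[name] = dict(sub)
                redirects[name] = lib_name
            # Otherwise keep the definition that is already present.
    return linked, redirects


main = {
    "main": {"is_stub": False, "calls": ["malloc"]},
    "malloc": {"is_stub": True, "calls": []},
}
libc = {
    "malloc": {"is_stub": False, "calls": ["brk"]},
    "brk": {"is_stub": True, "calls": []},
}

linked, redirects = link(main, {"libc.so.6": libc})
print(redirects)
```

In this model, `malloc` is redirected to the library's version, while `brk` remains a stub because no loaded unit implements it. Matching by name is an assumption of the sketch; the real stub-resolver may use other criteria.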

Example usage:

bap /bin/ls --libraries libc.so.6

To do this, we had to rework the `Project.t` data structure under the hood so that it refers to the "main" binary as well as the "libs" that were loaded in relation to this binary. This distinction is (or should be) made explicit in the documentation as well as in the naming of some new APIs in `bap.mli`.

bmourad01 commented 2 years ago

There are some questions to consider:

  1. Would it be a problem if mybinary and mylibrary have overlapping virtual address spaces? For example, function foo in mybinary and function bar in mylibrary could both have the address 0x40000.
  2. The API/ABI resolver passes look like they have the autorun attribute, but they only run once, when we load mybinary prior to linking. Currently this information does not seem to get resolved for the rest of the units that we lift (such as mylibrary). Should we override this behavior so that we can run these passes again?

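
To make the first question concrete, here is a minimal sketch of how an address-space collision between two units could be detected. It is plain Python with hypothetical names and made-up ranges, and it says nothing about how BAP actually handles the situation. The `rebase` helper shows one possible mitigation (shifting the library by a fixed offset) purely as an assumption for illustration.

```python
# Hypothetical illustration: each unit maps a function name to a
# half-open address range (start, end). None of this is BAP's API.

def overlaps(a, b):
    """Two half-open ranges overlap if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def find_collisions(main_unit, lib_unit):
    """Return (main_function, lib_function) pairs whose ranges overlap."""
    return [
        (m_name, l_name)
        for m_name, m_range in main_unit.items()
        for l_name, l_range in lib_unit.items()
        if overlaps(m_range, l_range)
    ]


def rebase(unit, offset):
    """Shift every range in a unit by a fixed offset (assumed mitigation)."""
    return {name: (s + offset, e + offset) for name, (s, e) in unit.items()}


mybinary = {"foo": (0x40000, 0x40040)}
mylibrary = {"bar": (0x40000, 0x40020)}

collisions = find_collisions(mybinary, mylibrary)
after_rebase = find_collisions(mybinary, rebase(mylibrary, 0x100000))
print(collisions, after_rebase)
```

With these made-up ranges, `foo` and `bar` collide at 0x40000, and the collision disappears once the library is shifted to a disjoint region.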
bmourad01 commented 2 years ago

Updated with the new approach after some discussion.