maurer / libbap

C Bindings for BAP

libbap is not self contained #4

Open maurer opened 7 years ago

maurer commented 7 years ago

Currently, libbap.so and, optionally, bap.h are not all you need in order to use libbap; you must also have ocamlfind and a variety of BAP plugins installed. It should be possible, pending future interaction with the BAP project, to link this all into one binary object, rather than doing OCaml dynamic loading.

maurer commented 7 years ago

Blocked on BinaryAnalysisPlatform/bap#570

ivg commented 7 years ago

It should be possible, pending future interaction with the BAP project, to link this all into one binary object, rather than doing OCaml dynamic loading.

It should be, theoretically, but this is not what we are going to do. You should understand one particular thing: BAP is not a library but a framework, i.e., it is a collection of libraries, frontends, utilities, and plugins, where the latter is sort of a mix between a library and a frontend. What you're trying to do is to pack all of these into one shared object and hope that this object will be self-contained. The BAP framework provides much more than the lifter, and it uses dependency injection to glue everything together. And it is hard to implement dependency injection while staying entirely within the shared-object constraint. There are further issues that do not fit into a shared object, which we can discuss later.

The root of the problem, as I see it, is that you don't really want to use BAP, but just the disassembler and the lifters. Moreover, you don't want to be constrained by the rules of the BAP framework, as you're trying to build your own framework. I don't see anything illegal in your desires, though this particular use case isn't really anticipated by the design. As I see it, the correct path is not to ambitiously try to create a libbap.so that will be an interface to the BAP framework, but instead to create a few small bindings to the libraries (not to the plugins), and then glue them together in the way that suits you.

The disassembler library is already a C++ library that is wrapped in a C interface, so you can actually use it out of the box, without involving OCaml at all. The lifter is basically a function of type:

mem -> Disasm_expert.Basic.full_insn -> bil Or_error.t

Here full_insn is an abstract (coinductive) type with a private constructor. Only the disassembler can create it. However, under the hood, it is the following data structure:

type ins_info = {
  code : int;
  name : string;
  asm : string;
  kinds : kind list;
  opers : Op.t array;
}

Basically, we may add a public constructor, so that you can prepare the required data directly from the C interface, then construct the full_insn value, and finally call a lifter. You should decide on your own how you will actually add new lifters to your framework, how you will dispatch between existing lifters, how you will prioritize them, and so on. (Basically, you will reimplement some of the framework code in the way you like.) If you want to use this approach, we can help you and release the lifters as libraries. However, my personal opinion is that it is better to play with BAP on BAP's terms and try to use it as it was designed. For example, I would expect that at some point you may want to get the IR (the CFG reconstruction) out of BAP, or even some analysis, and then you will find yourself reimplementing the framework. So why spend your time on that rather than use the framework, as it is already written?
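To make the proposal concrete, here is a minimal sketch of what a public constructor plus a user-defined lifter dispatch could look like. Everything in it is an assumption made for illustration: the Insn module, the kind, operand, mem, and bil types, and the lift_nop, lift_mov, and dispatch functions are hypothetical stand-ins written against the OCaml standard library only, not the real BAP API.

```ocaml
(* Hypothetical stand-ins for BAP's types; none of this is the real BAP API. *)
type kind = Call | Return | Jump | Other
type operand = Reg of string | Imm of int
type bil = string list                       (* placeholder for lifted statements *)
type mem = { addr : int; bytes : string }    (* placeholder for a memory chunk *)

module Insn : sig
  type t                                     (* abstract, like full_insn *)
  (* The proposed public constructor: builds a value from plain data. *)
  val create :
    code:int -> name:string -> asm:string ->
    kinds:kind list -> opers:operand array -> t
  val name : t -> string
  val opers : t -> operand array
end = struct
  type t = {
    code : int;
    name : string;
    asm : string;
    kinds : kind list;
    opers : operand array;
  }
  let create ~code ~name ~asm ~kinds ~opers = { code; name; asm; kinds; opers }
  let name t = t.name
  let opers t = t.opers
end

(* Same shape as the lifter type above, with the stdlib result type. *)
type lifter = mem -> Insn.t -> (bil, string) result

let lift_nop : lifter = fun _mem insn ->
  if Insn.name insn = "NOP" then Ok [] else Error "lift_nop: not a NOP"

let lift_mov : lifter = fun _mem insn ->
  match Insn.name insn, Insn.opers insn with
  | "MOV", [| Reg dst; Imm v |] -> Ok [ Printf.sprintf "%s := %d" dst v ]
  | _ -> Error "lift_mov: unsupported instruction"

(* Try lifters in priority order (lowest number first) and return the
   first successful result. *)
let dispatch (lifters : (int * lifter) list) mem insn =
  let sorted = List.sort (fun (a, _) (b, _) -> compare a b) lifters in
  let rec go = function
    | [] -> Error ("no lifter accepted " ^ Insn.name insn)
    | (_, lift) :: rest ->
      (match lift mem insn with
       | Ok bil -> Ok bil
       | Error _ -> go rest)
  in
  go sorted

(* Usage: build an instruction from plain data, then lift it. *)
let insn =
  Insn.create ~code:0x1234 ~name:"MOV" ~asm:"mov eax, 42"
    ~kinds:[ Other ] ~opers:[| Reg "EAX"; Imm 42 |]

let result =
  dispatch [ (10, lift_mov); (0, lift_nop) ]
    { addr = 0x1000; bytes = "\xb8\x2a\x00\x00\x00" } insn
(* result = Ok ["EAX := 42"] *)
```

The priority list here is the part of the framework that would have to be reimplemented on the caller's side; a real binding would also need to marshal the record fields across the C boundary.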

maurer commented 7 years ago

I suppose we didn't keep records, but when the plugin framework was first proposed, I explicitly asked if the use of BAP as a library was going to be preserved and was told yes.

There are several reasons I don't want to use the dynamic-loading based BAP, and was hoping to just link in the libraries directly:

1.) This precludes the eventual creation of a self contained binary.
2.) This massively increases the number of runtime dependencies.
3.) Installing my code suddenly requires use of the OCaml toolchain and 30 minutes of waiting for the user.

As far as dependency injection goes, I don't see any reason why the code organization mandates dynamic plugin loading rather than library linking - it certainly doesn't in other languages and projects where dependency injection is used.
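As a rough illustration of this point, the sketch below shows dependency injection through a registry that providers fill in as a side effect of module initialization. It is not BAP's actual plugin mechanism: the Registry, Arm_provider, and X86_provider modules and the lift function are hypothetical names, and the code uses only the OCaml standard library.

```ocaml
(* A registry of named services. All names here are hypothetical. *)
module Registry : sig
  val register : string -> (string -> string) -> unit
  val lookup : string -> (string -> string) option
  val names : unit -> string list
end = struct
  let table : (string, string -> string) Hashtbl.t = Hashtbl.create 8
  let register name impl = Hashtbl.replace table name impl
  let lookup name = Hashtbl.find_opt table name
  let names () =
    Hashtbl.fold (fun k _ acc -> k :: acc) table [] |> List.sort compare
end

(* Each provider registers itself when its module is initialized. When the
   module is linked statically, this runs at program start. *)
module Arm_provider = struct
  let () = Registry.register "arm" (fun bytes -> "arm-lifted:" ^ bytes)
end

module X86_provider = struct
  let () = Registry.register "x86" (fun bytes -> "x86-lifted:" ^ bytes)
end

(* The consumer only knows the registry, not which providers were linked. *)
let lift arch bytes =
  match Registry.lookup arch with
  | Some impl -> Ok (impl bytes)
  | None -> Error ("no provider registered for " ^ arch)
```

One caveat with this pattern: when providers live in a library archive, the OCaml linker only pulls in modules that are referenced, so a provider that is used solely for its registration side effect needs the -linkall flag (or an explicit reference) to be included.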

I'm trying to use BAP "on its terms", but every feature I use (linking as a library, invoking byteweight) seems to get papered over on each new release, and I'm told I shouldn't be using that for some reason.

I may eventually do as you suggest, and just grab pieces of BAP and glue them back together myself. I'm keeping this issue open because I still want my library to be self contained, even if BAP doesn't want to be contained.

ivg commented 7 years ago

I suppose we didn't keep records, but when the plugin framework was first proposed, I explicitly asked if the use of BAP as a library was going to be preserved and was told yes.

It is still yes. The self-contained property is what I never promised. You can still have a .so file that provides API entry points to BAP. The extra constraints, like removing dynamic loading of the plugins, the dependency on ocamlfind, etc., weren't specified.

There are several reasons I don't want to use the dynamic-loading based BAP, and was hoping to just link in the libraries directly:

1.) This precludes the eventual creation of a self contained binary.
2.) This massively increases the number of runtime dependencies.
3.) Installing my code suddenly requires use of the OCaml toolchain and 30 minutes of waiting for the user.

Well, there are always tradeoffs. The issues that you specify were anticipated and are not considered serious by us. If you want to speed up the process, you can use Docker or prebuild BAP in advance.

As far as dependency injection goes, I don't see any reason why the code organization mandates dynamic plugin loading rather than library linking - it certainly doesn't in other languages and projects where dependency injection is used.

Because otherwise, installation of any new feature will require the recompilation of the whole BAP universe.

I'm trying to use BAP "on its terms", but every feature I use (linking as a library, invoking byteweight) seems to get papered over on each new release, and I'm told I shouldn't be using that for some reason.

That is simply not true. First of all, the stability of 0.9.x was never guaranteed. We didn't break anything in 1.1.0. The byteweight part was last changed a year ago. The same is actually true of the plugin system: it was merged on Feb 29, 2016, and hasn't changed since then, other than bug fixes.

I may eventually do as you suggest, and just grab pieces of BAP and glue them back together myself. I'm keeping this issue open because I still want my library to be self contained, even if BAP doesn't want to be contained.

I totally respect your desire; that's why I'm suggesting another route, as BAP as a platform would never fit into a standalone .so file. So do not expect that the current state will be fixed per se. But if you just try to bind to the libraries and use them without the platform, then we will be happy to meet you in the middle and help by exposing more functionality as libraries. But you should understand that these would not be bindings to BAP, but bindings to specific libraries.