SRI-CSL / OCCAM

OCCAM: Object Culling and Concretization for Assurance Maximization
BSD 3-Clause "New" or "Revised" License

Adding support for OCCAMIZE libraries #29

Closed RafaeNoor closed 3 years ago

RafaeNoor commented 4 years ago

Initializing support for OCCAM to specialize libraries by specifying which functions a user will need from a particular library

RafaeNoor commented 4 years ago

Currently, the way to invoke the OCCAMize Library functionality is to pass the name of the library bitcode as the main bitcode in the manifest. The user then invokes slash.py with the flag --entry-point=functionName1,functionName2,functionName3. The razor.py Python wrapper stops before the linking stage if this flag is passed.

RafaeNoor commented 4 years ago

I've added some tests to test/liboccamize, with some basic description of the behavior of the pass

RafaeNoor commented 4 years ago

There are now two modes for library (bitcode) specialization in OCCAM:

  1. The --entry-point=func1,func2,... flag to the slash.py script, where the main module in the manifest is the library bitcode. In this case OCCAM stops right before the link-time stage. This is useful when users know which functions they want from a library and can specify them manually.

  2. A new 'lib_spec' key in the manifest. Here the main module is the main program that uses functions from the libraries, and the libraries to be specialized are listed under 'lib_spec'. OCCAM automatically identifies the external functions in the main module and passes them as the entry-point functions to the library bitcodes. In this case, slash.py goes on to link the modules into a final program.
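
As a rough illustration of the two modes, here is a minimal Python sketch that builds one manifest per mode. Only the --entry-point flag and the 'lib_spec' key come from this PR. The other manifest keys ("main", "binary", "modules", "name") and all file names are assumptions made for the example, and the helper entry_point_flag is hypothetical.

```python
import json


def entry_point_flag(function_names):
    """Format the --entry-point flag from a list of function names (hypothetical helper)."""
    return "--entry-point=" + ",".join(function_names)


# Mode 1: the library bitcode itself is the main module and the user names
# the functions to keep. File names and the non-lib_spec keys are assumed.
library_manifest = {
    "main": "libexample.bc",
    "binary": "libexample",
    "modules": [],
    "name": "libexample",
}
flag = entry_point_flag(["func1", "func2"])

# Mode 2: the application is the main module and the libraries to be
# specialized are listed under 'lib_spec'. No --entry-point flag is needed,
# since the external functions of the main module act as entry points.
program_manifest = {
    "main": "app.bc",
    "binary": "app",
    "modules": [],
    "lib_spec": ["libexample.bc"],
    "name": "app",
}

manifest_text = json.dumps(program_manifest, indent=2)
print(flag)
print(manifest_text)
```

In mode 1 the flag string would be passed to slash.py together with the library manifest. In mode 2 only the manifest is needed.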

RafaeNoor commented 4 years ago

To generalize the entry-point specialization of libraries, OCCAM now supports cases where multiple prototypes exist for the same function name. This is possible in C++ programs; however, the names are mangled in the bitcode to include type info as well. E.g. int add(int a, int b) becomes _Z3addii. Support has been added which constructs a map from demangled function names to a vector of mangled function names. Here, for example, there are two functions with the name add:

int add(int a, int b)
int add(int a, int b, int c)

Checking the occam.log file, you would see the following:

| Printing Function Name Map | add: _Z3addii _Z3addiii ...

Hence, specifying entry point names would now capture all the functions with the same demangled name.
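
The name map described above can be sketched in Python as follows. This is not the code of the OCCAM pass: it is a simplified illustration that only handles unnested Itanium-mangled names of the form _Z<length><identifier><parameter encoding> and treats every other symbol as its own base name. The functions base_name, build_name_map and expand_entry_points are hypothetical names introduced for the example.

```python
import re
from collections import defaultdict

# Matches the "_Z<length>" prefix of a simple (unnested) Itanium-mangled name.
_SIMPLE_MANGLED = re.compile(r"^_Z(\d+)")


def base_name(symbol):
    """Return the unqualified function name encoded in a simple mangled symbol.

    Symbols that do not match the simple form (C functions, nested names such
    as _ZN...) are returned unchanged. A real implementation would use a full
    demangler instead.
    """
    match = _SIMPLE_MANGLED.match(symbol)
    if match is None:
        return symbol
    length = int(match.group(1))
    start = match.end()
    return symbol[start:start + length]


def build_name_map(symbols):
    """Map each base name to the list of symbols that share it."""
    name_map = defaultdict(list)
    for symbol in symbols:
        name_map[base_name(symbol)].append(symbol)
    return dict(name_map)


def expand_entry_points(requested, name_map):
    """Expand user-supplied names to every symbol with that base name."""
    expanded = []
    for name in requested:
        expanded.extend(name_map.get(name, []))
    return expanded


symbols = ["_Z3addii", "_Z3addiii", "_Z3subii", "helper"]
name_map = build_name_map(symbols)
entry_points = expand_entry_points(["add"], name_map)
print(name_map["add"])
print(entry_points)
```

With this map, asking for add selects both overloads, which matches the behavior described for --entry-point.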

RafaeNoor commented 4 years ago

I have made changes according to your comments. A few things are still causing issues, which I will look into in detail:

  1. I tried using raw_fd_ostream, but for some reason LLVM was marking its constructor as implicitly deleted (something to do with how LLVM was built). I'll look into that. In the meantime I'm using the ostream object.

  2. I've replaced some uses of std::string with StringRef. However, in the DummyMainFunction.cpp pass, when I store a StringRef in a vector and try to access it again, it doesn't behave properly. I even tried inlining the function which creates this vector, in case the issue was with copy constructors being called on return from a function, but that didn't fix it. I checked the reference manual and generally they don't recommend storing StringRef.

With regards to functionality:

Dr. Ashish requested a configuration where libraries could be specialized according to sets of application programs. Nibbler has a library debloating functionality which provides this. This functionality has been incorporated into OCCAM via a new manifest key, main_spec. Just as lib_spec lists the library bitcodes to be specialized, main_spec lists the application program bitcodes which are used to identify the union of uses from the libraries in lib_spec and specialize the libraries accordingly. A test case has also been pushed to show this usage.

The resultant library bitcodes would be usable for all the application programs specified in the main_spec key.
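
The union-of-uses idea can be sketched in Python as below. This only illustrates the set computation, not how OCCAM reads bitcode: the symbol sets are hard-coded, the file names are made up, and the helpers union_of_uses and entry_points_per_library are hypothetical.

```python
def union_of_uses(app_externals):
    """Union of the external (undefined) function names across all applications."""
    needed = set()
    for externals in app_externals.values():
        needed |= set(externals)
    return needed


def entry_points_per_library(needed, library_definitions):
    """For each library, keep only the needed functions that the library defines."""
    return {
        library: sorted(needed & set(defined))
        for library, defined in library_definitions.items()
    }


# External functions referenced by each application in main_spec (assumed data).
app_externals = {
    "app1.bc": {"add", "printf"},
    "app2.bc": {"sub", "printf"},
}

# Functions defined by each library in lib_spec (assumed data).
library_definitions = {
    "libmath.bc": {"add", "sub", "mul"},
}

needed = union_of_uses(app_externals)
entry_points = entry_points_per_library(needed, library_definitions)
print(entry_points)
```

Here mul is not used by either application, so it would not be an entry point, while add and sub are both kept so the specialized library still serves app1.bc and app2.bc.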

caballa commented 4 years ago

@RafaeNoor: what is the status of this PR? You need to resolve the conflicts. They are probably due to the renaming of some of the manifest keys.

caballa commented 3 years ago

Manually merged in commit b50608f