Automatic Binary Parallelisation
Apache License 2.0

Janus: Statically Guided Dynamic Binary Modification

Janus is a same-ISA dynamic binary modification tool that is controlled through static analysis. It is developed at the University of Cambridge Computer Laboratory and available under an Apache licence. If you use Janus in your work, please cite our CGO 2019 publication.

Janus first performs an analysis of a binary executable to determine the transformations required. These transformations are then encoded into a series of rewrite rules specific to that binary. Janus' dynamic modifier is implemented as a client of DynamoRIO. It reads these rewrite rules and carries out the transformations as instructed when it encounters the relevant machine code.

Overview

What can Janus do now?

Janus is designed to perform sophisticated modification and optimisation of generic x86-64 and AArch64 ELF binaries. It augments DynamoRIO with a static binary analyser and a runtime client. With a combination of static binary analysis and dynamic binary modification, controlled by domain-specific rewrite rules, Janus is able to perform a series of tasks that other tools might not provide. For example:

What will Janus do in the future?

We are working on Janus to make it better and more useful. Our current plans for improvement include:

Disclaimer

Janus is still a prototype and although we fix any bugs we find, we cannot guarantee fault-free execution or the same parallelisation performance on binaries other than those we have tested on. However, we expect that other binaries (especially those that are small and simple) will give similar speed-ups to those we have seen. We welcome anyone to contribute to this project and help make this tool more useful. Please contact Ruoyu Zhou or Timothy Jones if you have questions.

Components

Analyze

A static binary analyser that examines input binaries and identifies opportunities for optimisation. It then encodes the transformation in a domain-specific rewrite schedule.

JPar

JPar first runs analyze on the input binary to generate a rewrite schedule, then performs automatic parallelisation on the same binary.

JVect

JVect first runs analyze on the input binary to generate a rewrite schedule, then performs automatic vectorisation on the same binary. Currently it only works for small loops and is not fully operational.

JFetch

JFetch first runs analyze on the input binary to generate a rewrite schedule, then automatically inserts memory prefetch instructions into the same binary.

JITSTM

JITSTM is a software transactional memory library that is generated at runtime. It redirects generic memory accesses to speculative read/write buffers. The design is similar to JudoSTM. Currently it is not fully operational and its performance is not optimised.

Installation

Janus uses cmake for building all its components. There is no need to install additional libraries. Simply create a new build folder and invoke cmake.

mkdir build
cd build
cmake ..
make -j

You can also build Janus with VERBOSE mode enabled.

cmake -DVERBOSE=ON ..

After building, there are several components generated:

There are a few convenient bash scripts in the janus folder.

For convenience, you can add these scripts into PATH:

export PATH=$PATH:${YOUR_PATH_TO_JANUS}/janus/
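
After updating PATH, a quick check that the scripts are reachable can save some confusion. The following is a minimal sketch; the helper name missing_tools is hypothetical, and the script names are the ones used later in this README.

```shell
#!/usr/bin/env bash
# Sketch: report which Janus scripts cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of Janus.

# missing_tools NAME...
# Prints each NAME that "command -v" cannot resolve, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Script names taken from the examples in this README.
missing_tools jpar jvect jfetch
```

An empty output means all three scripts were found.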

Run correctness tests

Once it is built, you can test Janus on your x86-64 or AArch64 machine. Please ensure that you have at least four cores in your test machine. Simply type:

make test

It runs the native binaries first and then runs the Janus paralleliser on the same binaries.
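
The four-core requirement can be checked before running the tests. This is a minimal sketch that assumes the GNU coreutils nproc command is available; the helper name has_enough_cores is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check the core count before running "make test".
# The threshold of four comes from the text above; has_enough_cores is hypothetical.

# has_enough_cores REQUIRED [AVAILABLE]
# Succeeds when AVAILABLE (default: the output of nproc) is at least REQUIRED.
has_enough_cores() {
    local required="$1"
    local available="${2:-$(nproc)}"
    [ "$available" -ge "$required" ]
}

if has_enough_cores 4; then
    echo "core check passed: run 'make test' from the build folder"
else
    echo "fewer than 4 cores detected" >&2
fi
```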

Run a single test

To test a specific executable, you can simply invoke the corresponding Janus script.

jpar <num_threads> <executable> <arguments>

For example, to parallelise an executable "2mm" with 4 threads, you can simply type:

jpar 4 2mm

To vectorise an executable "2mm":

jvect 2mm

To prefetch an executable "is":

jfetch is

Currently the joint script for exploiting all three kinds of parallelism is still under development.

Run on your own executable

Once you add Janus into PATH, you can try Janus on your own binary:

jpar_all <num_threads> <executable> <arguments>

It runs the Janus profiler, static analyser and dynamic paralleliser in one go. The profiling and analysis might take a while. At the end it should generate a rewrite schedule, which you can reuse to invoke the dynamic paralleliser directly. You may well encounter bugs, infinite loops and segfaults; we are working to make Janus more robust and useful.

Janus break-down step by step

Janus performs static binary analysis and generates a rewrite schedule to guide the binary modification. The static analyser has lots of rule generation modes:

Usage: analyze + <option> + <executable> + [profile_info]
Option:
  -a: static analysis without generating rules
  -cfg: generate CFG from the binary
  -p: generate rules for automatic parallelisation
  -lc: generate rules for loop coverage profiling
  -fc: generate rules for function coverage profiling (not yet working)
  -pr: generate rules for automatic loop profiling
  -o: generate rules for single thread optimization (not yet working)
  -f: generate rules for automatic just in time prefetch
  -v: generate rules for automatic vectorization
  -d: generate rules for testing dll (.so and dynamic loaded library) instrumentation

For example, you can run the static binary analyser:

analyze -p 2mm

A Janus Rewrite Schedule (JRS) file "2mm.jrs" is generated. This file is obfuscated but you can examine the contents using the "schedump" tool.

schedump 2mm.jrs

The rewrite schedule file is actually a list of "rewrite rules" to be interpreted by the dynamic binary modification tool.

The list of rewrite rules can be found in "shared/rule_isa.h".

It also generates the detailed report of the binary analysis in "2mm.loop.log" and "2mm.loop.alias.log".

If the rewrite schedule already exists alongside the binary, you can invoke Janus without regenerating it:

jpar 4 2mm
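
The reuse of an existing schedule can be made explicit in a small wrapper. This is a sketch under stated assumptions: analyze and jpar are on PATH, and the schedule for an executable is written next to it as <executable>.jrs (as with 2mm.jrs above). Whether jpar performs this check itself is not stated here, and run_parallel is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: regenerate the rewrite schedule only when it is missing.
# Assumption: the schedule for <exe> lives next to it as <exe>.jrs.
# run_parallel is a hypothetical helper, not part of Janus.

# run_parallel NUM_THREADS EXECUTABLE [ARGUMENTS...]
run_parallel() {
    local threads="$1"
    local exe="$2"
    shift 2
    if [ ! -f "${exe}.jrs" ]; then
        # No schedule yet: generate parallelisation rules first.
        analyze -p "$exe" || return 1
    fi
    # Run the dynamic paralleliser with the remaining arguments.
    jpar "$threads" "$exe" "$@"
}

# Example: run_parallel 4 2mm
```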

Janus static analyser options

By using the Janus static analyser, you can generate different rewrite schedules for the same binary:

${YOUR_PATH_TO_JANUS}/bin/analyze <options> <binary>

Publications

Please cite our CGO 2019 paper if you use Janus in your own work.

Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation. Ruoyu Zhou and Timothy M. Jones. International Symposium on Code Generation and Optimization (CGO), February 2019.

The Janus Triad: Exploiting Parallelism Through Dynamic Binary Modification. Ruoyu Zhou, George Wort, Márton Erdős and Timothy M. Jones. International Conference on Virtual Execution Environments (VEE), April 2019.

Development Notes

Documentation

The Janus development documentation can be built using DoxyGen. You can follow the steps described here to install DoxyGen.

Then run (in the root folder of the project):

doxygen Doxyfile

This generates the html documentation into the docs/html folder. Open docs/html/index.html to start browsing.

Source structure

Debug

The debugging build will print event information to help you debug.

cd build
cmake -DVERBOSE=ON ..
make -j

You can also run jpar under gdb:

${YOUR_PATH_TO_JANUS}/janus/jpar_debug <num_threads> <binary> <arguments>

You can rebuild with the following macros to generate a SIGTRAP at specific locations:

How to visualise loops in your executable

Janus' static analyser has an -a mode that dumps the information it retrieves from static analysis. It generates the CFG in dot format so that you can render it as a PDF, which can help you understand the binary better.

sudo apt-get install graphviz pdftk

#run static analyser
analyze -a exe
#run loop cfg generation
graph exe.loop.cfg
#run loop ssa generation
graph exe.loop.ssa
#run function cfg generation
graph exe.proc.cfg
#run function ssa generation
graph exe.proc.ssa
#view the output pdf
evince exe.loop.cfg.pdf &
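
The four graph commands above can be wrapped in a loop. This is a sketch only: it assumes analyze and the Janus graph script are on PATH, and that analyze -a writes the .loop.cfg, .loop.ssa, .proc.cfg and .proc.ssa files next to the executable as the commands above suggest. The helper name render_graphs is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the static analyser once, then render every graph it produced.
# render_graphs is a hypothetical helper, not part of Janus.

# render_graphs EXECUTABLE
render_graphs() {
    local exe="$1"
    local kind
    analyze -a "$exe" || return 1
    for kind in loop.cfg loop.ssa proc.cfg proc.ssa; do
        if [ -f "${exe}.${kind}" ]; then
            graph "${exe}.${kind}"
        else
            # Some outputs may be absent; skip them rather than fail.
            echo "skipping ${exe}.${kind}: file not found" >&2
        fi
    done
}

# Example: render_graphs exe
```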

Instrumenting Shared Object (compiled as PIC) Dynamically-Loaded libraries

You will have to generate rewrite schedules individually for the main binary and each of the shared-object library modules that you wish to instrument.

analyze -d <binary.so>

Once separate rewrite schedules (.jrs files) have been generated for each of the modules you want to instrument with Janus, you can then run the dynamic client. See dynamic/dll/dll.cpp for an example of how to load the rules from the rewrite schedules for all the modules. NB: this functionality is currently under test.
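
Generating the per-library schedules can be scripted. The sketch below assumes analyze is on PATH and handles only the shared objects; the main binary is analysed separately. The helper name analyze_libs and the library names in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run "analyze -d" on each shared object in turn.
# analyze_libs is a hypothetical helper, not part of Janus.

# analyze_libs LIB.so...
# Returns non-zero if analysis failed for any library, but still tries them all.
analyze_libs() {
    local lib
    local failed=0
    for lib in "$@"; do
        if ! analyze -d "$lib"; then
            echo "analyze -d failed for $lib" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example (hypothetical library names): analyze_libs libfoo.so libbar.so
```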