passlab / DCFG

Dynamic ControlFlow Graph and DataFlow Graph for Binary-based Optimization
18 stars 4 forks source link

Binary Analysis and Optimization

We aim to build a tool for binary analysis and optimizations and our first usage is binary instrumentation for software prefetching.

The following three examples are used as basis for our development:

1. pinplay simple loop dependency example from pinplay tutorial during PLDI 2016

The example in simple_loop_dependency-pinplay folder shows step-by-step how to generate dynamic control flow graph, loop region, and dynamic slicing using pinplay. The example currently only works on fornax due to some bugs mentioned in the README file for Ubuntu. The folder currently contains both the orginal source files and all the output files after the complete execution on fornax. check the README file for the scripts used to generate the output.

2. pintool examples for memory tracing, edge count of basic block and call tracing

The pin_examples folder contains three usefule examples, memory trace(pinatrace), edgecnt and calltrace for help understanding how to use pin to trace memory access, basic blocks and function calls.

3. dyninst example for statically generating static control flow graph

The dyninst_CFG folder contains sources for generating dot-based CFG using dyninst and you need to follow dyninst installation guild to make it work. Check the dyninst_CFG/README.md file.

=============================================================

General Steps

  1. Use pinplay (check simple_loop_dependency-pinplay folder) to generate DCFG, loop region and slicing information. Profiling (e.g. using PAPI or cycle info from pin if it can), and graph and binary analysis may be needed to identify the hotspot of performance loss (e.g. poor locality, cache/bus contention, etc).
  2. Use pintool pinatrace to generate traces of memory access (R/W and size) of instructions.
  3. Analyze the pinatrace memory traces and append the needed <R|W> info to the memory-access instructions of the dyninst-loops of a function
  4. For the memory access (particularly Load) instruction, perform analysis using SLICING to create links of the instructions who loads data and the instruction who uses that data (Load-USE relationship).
  5. Perform cycle-distance analysis of the Load-USE instructions to identify the delay of USE instruction if the the Load is NOT from the cache.
  6. For software prefetching, identify the slot for inserting the prefetching call for the Load
  7. Extends the work to parallel program

Others that may be useful

  1. Use Dyninst interfaces for retrieving function, loops and loop nest, and static CFG for a binary program. We will start with the CFG.cpp file for the rest of the development.
  2. Analyze the edgecnt traces and append # of calls of each edge to the edge in the static CFG of the dyninst-loops of a function.
  3. Use Dyninst DataFlowAPI and SLICING