This project builds on the code provided for the 5th Championship Branch Prediction competition held in 2016 (CBP-16). It provides an augmented simulator for studying modern branch prediction units (BPUs). For the uninitiated, a BPU plays an important role in increasing processor instructions per cycle (IPC) by speculatively executing instructions before a branch instruction is resolved. The two main components of this work are the programs `simnlog` and `simpython`. These facilitate a deeper understanding of BPU performance and open the door to BPU simulation from a friendlier programming language: Python. Many machine learning toolkits, and thus many ML practitioners, are based around the Python ecosystem. We provide scripts in this language for the analysis of BPU performance as well as a C++ simulator that instantiates and runs Python-written BPUs. BPUs written in a high-level framework, e.g., TensorFlow or scikit-learn, can be evaluated using the 200 training traces and 440 evaluation traces from CBP-16 (or other traces in the BT9 format).
Trace files are pulled into the `data` folder. You will need your `sudo` password for dependency installation (e.g., the Boost libraries) when setting up the original CBP-16 code and pulling in trace files:
```shell
cd scripts
./setup_cbp16.sh
cd ..
```
Next, build the simulator with Python bindings (`simpython`) and the simulator that logs prediction data to binary files (`simnlog`):
```shell
cd cbp16sim
make
# Or, if you only want one program or the other:
#make simnlog
#make simpython
make clean
cd ..
```
To run the Python scripts in the `scripts/` directory, you'll need to install the required libraries:

```shell
pip install -r requirements.txt
```
After following the installation instructions, you can run this program from the `cbp16sim` directory. This program allows for the simulation and evaluation of Python-based BPUs. By default, the program runs a dummy predictor that (excessively) logs trace inputs and always predicts taken. To run the program:
```shell
$ cd cbp16sim
$ ./simpython
usage: ./simpython <trace> [<predictor_module>]
$ # Example usage (for default dummy predictor):
$ PYTHONPATH=src/simpython/ ./simpython ../cbp2016.eval/traces/LONG_SERVER-1.bt9.trace.gz
$ # Example usage (for custom my_predictor.py with PREDICTOR class in the same directory):
$ PYTHONPATH=. ./simpython ../cbp2016.eval/traces/LONG_SERVER-1.bt9.trace.gz my_predictor.py
```
Setting the `PYTHONPATH` environment variable is important: it tells the program where your BPU module is located. You will need to set it yourself unless your module is on a Python standard library path (e.g., where packages from `pip` are installed).
To create your own module, you will need to inherit from the `BASEPREDICTOR` abstract base class located in `cbp16sim/src/simpython/predictor.py`. For example usage, see how the methods are implemented in the `dummy_predictor.py` file in the same directory. The class that inherits from `BASEPREDICTOR` must be given a special name, `PREDICTOR`; this is the name that the program looks for. Minimally, your BPU must implement the `GetPrediction(...)` method, but you may also want to implement `UpdatePredictor(...)` (which updates the predictor with the actual taken direction) and/or `TrackOtherInst(...)` to track unconditional branches.
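As an illustrative sketch (not the authoritative interface — check `predictor.py` and `dummy_predictor.py` for the real signatures), a minimal `my_predictor.py` might look like the following. The `BASEPREDICTOR` stub and all method argument names here are assumptions:

```python
# Hypothetical sketch of a my_predictor.py module. The real BASEPREDICTOR
# lives in cbp16sim/src/simpython/predictor.py; it is stubbed out here so the
# sketch is self-contained. Method signatures are assumptions.

class BASEPREDICTOR:
    """Stand-in for the abstract base class shipped with the simulator."""
    def GetPrediction(self, PC):
        raise NotImplementedError

class PREDICTOR(BASEPREDICTOR):
    """A single 2-bit saturating counter shared by all branches (toy example)."""
    def __init__(self):
        self.counter = 2  # start weakly taken

    def GetPrediction(self, PC):
        # Predict taken when the counter is in the upper half (2 or 3).
        return self.counter >= 2

    def UpdatePredictor(self, PC, opType, resolveDir, predDir, branchTarget):
        # Saturate the counter toward the resolved direction.
        if resolveDir:
            self.counter = min(self.counter + 1, 3)
        else:
            self.counter = max(self.counter - 1, 0)

    def TrackOtherInst(self, PC, opType, branchDir, branchTarget):
        # Unconditional branches carry no signal for this toy predictor.
        pass
```

A real design would, of course, index a table of counters by the PC (and possibly global history) rather than share one counter across all branches.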
After following the installation instructions, you can run this program from the `cbp16sim` directory. Note that by default the program is compiled to run the TAGE-SC-L BPU (winner of CBP-16 in all categories). To change the BPU to one of the other submissions, you'll need to replace the corresponding `predictor.cc` and `predictor.h` files in the `cbp16sim/src/simnlog` directory. Primitive but functional. Here is some example usage:
```shell
$ cd cbp16sim
$ ./simnlog
usage: ./simnlog <trace>
$ # Example usage:
$ ./simnlog ../cbp2016.eval/traces/LONG_SERVER-1.bt9.trace.gz
```
The program generates somewhat large binary files that log relevant branch data and predictions. If you want to generate these logged files in bulk, you can run something like the following (this only looks at short traces):

```shell
find ../cbp2016.eval/evaluationTraces/ -iname 'SHORT_*.gz' | xargs -n 1 ./simnlog
```

If you want to get fancy and have the CPU compute power to handle it, you can run the program in parallel via `xargs`: passing `-n 1 -P 8` tells `xargs` to run 8 instances of the program at a time, one per input produced by `find`.
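Putting that together (reusing the trace path from the example above; the degree of parallelism is just an illustration, scale it to your core count), the parallel bulk run might look like:

```shell
# Hypothetical bulk run: hand one trace at a time (-n 1) to each of up to
# 8 simultaneous simnlog instances (-P 8).
find ../cbp2016.eval/evaluationTraces/ -iname 'SHORT_*.gz' \
    | xargs -n 1 -P 8 ./simnlog
```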
Afterwards, if you would like to generate plots of the data and perform other analyses, you can run some of the scripts from the `scripts/` directory. Even before running `simnlog`, you can analyze the results files from previously generated runs using the original CBP-16 simulator, or the extracted results from the CBP website (both stored in `cbp2016.eval/results`); simply run the `analyze_cbp16_results.py` script. To look at the generated binary files, you will first need to aggregate statistics using the `process_traces.py` script. Its parameters default to looking for generated `.dat` files and storing results in a `processed_traces` directory; at the moment, you will need to edit the file to change these directories. After that, you can run the `analyze_processed_trace.py` script to generate plots (like the ones above and below) and some statistics.
If you want to play with the generated binary files yourself in Python, here is the boilerplate code you should follow.
```python
import numpy as np

# The field names of the generated binary format. Each element is a struct of
# 24 bytes containing the following data.
names = 'branchTaken', 'predDir', 'conditional', 'opType', 'branchTarget', 'PC'
# These are the numerical formats for each piece of data (first three are Booleans)
formats = 'u1', 'u1', 'u1', 'u4', 'u8', 'u8'
# The memory offsets for each element (note that there is 1 byte of padding at offset 3)
offsets = 0, 1, 2, 4, 8, 16
# Creation of the NumPy dtype
bpu_dtype = np.dtype(dict(names=names, formats=formats, offsets=offsets))
```
Afterwards, you can load in a file and even treat it as a pandas DataFrame, with the keys being the six field names in the above code.

```python
import pandas as pd

with open('filename.dat', 'rb') as f:
    data = f.read()

a = np.frombuffer(data, bpu_dtype)
df = pd.DataFrame(a)
```
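As a quick illustration of what you can do with such a DataFrame, the snippet below computes a misprediction rate over conditional branches. The records here are synthetic stand-ins for a real `.dat` file (which this sketch assumes you may not have generated yet):

```python
import numpy as np
import pandas as pd

# Rebuild the dtype from the boilerplate above.
names = 'branchTaken', 'predDir', 'conditional', 'opType', 'branchTarget', 'PC'
formats = 'u1', 'u1', 'u1', 'u4', 'u8', 'u8'
offsets = 0, 1, 2, 4, 8, 16
bpu_dtype = np.dtype(dict(names=names, formats=formats, offsets=offsets))

# Synthetic records standing in for a real .dat file: three conditional
# branches (one not taken) and one unconditional branch, all predicted taken.
a = np.zeros(4, dtype=bpu_dtype)
a['conditional'] = (1, 1, 1, 0)
a['branchTaken'] = (1, 0, 1, 1)
a['predDir'] = (1, 1, 1, 1)

df = pd.DataFrame(a)
cond = df[df['conditional'] == 1]
# Fraction of conditional branches whose predicted direction was wrong.
mispredict_rate = (cond['branchTaken'] != cond['predDir']).mean()
print(f'{mispredict_rate:.3f}')  # one of three conditional branches mispredicted
```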