If you use this code in your research, please cite our FPGA'17 paper:
@article{zhao-bnn-fpga2017,
title = "{Accelerating Binarized Convolutional Neural Networks
with Software-Programmable FPGAs}",
author = {Ritchie Zhao and Weinan Song and Wentao Zhang and Tianwei Xing and
Jeng-Hau Lin and Mani Srivastava and Rajesh Gupta and Zhiru Zhang},
journal = {Int'l Symp. on Field-Programmable Gate Arrays (FPGA)},
month = {Feb},
year = {2017},
}
bnn-fpga is an open-source implementation of a binarized neural network (BNN) accelerator for CIFAR-10 on an FPGA. The architecture and training of the BNN were proposed by Courbariaux et al., and open-source Python code is available at https://github.com/MatthieuCourbariaux/BinaryNet.
Our accelerator targets low-power embedded field-programmable SoCs and was tested on a ZedBoard. At the time of writing, the error rate on the 10000 images in the CIFAR-10 test set is 11.19%.
You will need Xilinx SDSoC on your PATH and the Vivado HLS header include files on your CPATH.
Verified SDSoC versions: 2016.4, 2017.1
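The tool requirements above can be checked before starting a build. The following is a minimal sketch, not part of the repository: it assumes the SDSoC compiler drivers are named sdscc and sds++, and it uses ap_int.h as a representative Vivado HLS header. The helper names have_tool and in_path_list are hypothetical.

```shell
#!/bin/sh
# Pre-flight check (sketch): are the SDSoC compilers on PATH, and is a
# Vivado HLS header visible through CPATH?

# Succeeds if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if file $1 exists in any directory of the colon-separated list $2.
in_path_list() {
    _file=$1
    _old_ifs=$IFS
    IFS=:
    for _dir in $2; do
        if [ -n "$_dir" ] && [ -f "$_dir/$_file" ]; then
            IFS=$_old_ifs
            return 0
        fi
    done
    IFS=$_old_ifs
    return 1
}

# Assumption: sdscc and sds++ are the SDSoC C and C++ compiler drivers.
for tool in sdscc sds++; do
    if have_tool "$tool"; then
        echo "ok: $tool found on PATH"
    else
        echo "missing: $tool not found on PATH"
    fi
done

# Assumption: ap_int.h stands in for the Vivado HLS header include files.
if in_path_list ap_int.h "${CPATH:-}"; then
    echo "ok: ap_int.h found via CPATH"
else
    echo "missing: ap_int.h not found in any CPATH directory"
fi
```

The script only reports what it finds and does not modify the environment, so it is safe to run before or after sourcing setup.sh.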
With these tools in place, run the following from the repository root:
% source setup.sh
% cd data; ./get_data.sh; cd ..
% cd params; ./get_params.sh; cd ..
This will set environment variables and download data and parameter zip files.
If the get scripts do not work, please go to this Google Drive to download the files that need to go into the data and params folders.
To build the software model:
% cd cpp
% make -j4
To build the FPGA bitstream (after the software build is complete):
% cd cpp/accel/sdsoc_build
% make -j4
Post-route timing and area information is available in sdsoc_build/_sds/reports/sds.rpt.
The master branch contains a debug build including a random testbench, a per-layer testbench, and a full BNN testbench. The optimized branch contains only the full testbench.
To run the accelerator test:
% cd mnt
% export CRAFT_BNN_ROOT=.
% ./accel_test_bnn.exe <N>
Here N is the number of images to test. Up to 10000 images from the CIFAR-10 test set are available. The program prints the prediction accuracy and accelerator runtime at the end. Note that the program performs weight binarization and reordering before invoking the accelerator, so there will be a pause at the very beginning.
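Because the test set holds 10000 images, it can help to validate N before launching the test. The following is a minimal sketch with a hypothetical helper, valid_image_count. It assumes N must be at least 1, which the text above does not state.

```shell
#!/bin/sh
# Succeeds if $1 is a plain decimal integer between 1 and 10000 inclusive.
# Assumption: the lower bound of 1 is a guess; only the upper bound of
# 10000 comes from the size of the CIFAR-10 test set.
valid_image_count() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 10000 ]
}

# Demonstration: print what would be run for a few candidate values.
for n in 100 10000 10001 abc; do
    if valid_image_count "$n"; then
        echo "would run: ./accel_test_bnn.exe $n"
    else
        echo "rejected: '$n' is not in the range 1..10000"
    fi
done
```

The loop only echoes the command, so the sketch can be tried on a host machine without the accelerator binary.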
To change the number of convolvers, edit cpp/accel/Accel.h and set CONVOLVERS to the desired number (must be a power of 2). You must then run make clean and rebuild everything from scratch.
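Since a full rebuild is slow, the power-of-2 constraint is worth checking before editing the header. The following is a minimal sketch using a hypothetical helper, is_pow2. It tests only the arithmetic property, not whether the accelerator supports every such value.

```shell
#!/bin/sh
# Succeeds if $1 is a positive integer power of two (1, 2, 4, 8, ...).
# A power of two has exactly one bit set, so n & (n - 1) is zero.
is_pow2() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Demonstration over a few candidate CONVOLVERS values.
for n in 1 2 3 4 6 8; do
    if is_pow2 "$n"; then
        echo "$n: power of 2"
    else
        echo "$n: not a power of 2"
    fi
done
```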