IntelLabs / t2sp

Productive and portable performance programming across spatial architectures (FPGAs, etc.) and vector architectures (GPUs, etc.)

DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel.
Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.
Intel no longer accepts patches to this project.
If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.

T2SP (Temporal To Spatial Programming, previously called T2S) enables software programmers to build systolic arrays for dense tensor computes with portable performance across spatial architectures (like FPGAs) and vector architectures (like GPUs) in a constructive way.

T2SP is available under a permissive license, the BSD+Patent license.

Currently, we support only Intel FPGAs and GPUs. We assume your device is local to you or within Intel DevCloud, and that the operating system is Linux. We have tried Ubuntu 18.04 and CentOS 7.9, but our system is not tied to any specific Linux distribution or version. Other platforms might also work, although they have not been tested.

Our newest paper, Lasa: Abstraction and Specialization for Productive and Performant Linear Algebra on FPGAs (to appear in FCCM 2023), is currently a separate project released at pku-liang/Lasa.

[DevCloud] Open an account (once)

Clone T2SP (once)

   git clone https://github.com/IntelLabs/t2sp 

Install tools (once)

Note:

Modify the environment setting (once)

The environment settings are in $HOME/t2sp/setenv.sh.

  GCC_PATH=...
  export LLVM_CONFIG=...
  export CLANG=...
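Once these values are filled in and the script has been sourced, a quick check that none of them is empty can catch a missed edit early. The following is a minimal sketch, assuming a hypothetical helper named `t2sp_check_vars` that is not part of T2SP; it only tests that the three variables listed above are non-empty, not that their values are correct.

```shell
# Hypothetical helper (not part of T2SP): report which of the three
# variables from setenv.sh are still empty in the current shell.
t2sp_check_vars() {
    missing=0
    for name in GCC_PATH LLVM_CONFIG CLANG; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use, after sourcing setenv.sh:
# t2sp_check_vars || echo "edit setenv.sh first"
```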

Open a terminal on a compute node

[DevCloud] from the head node, log into a compute node:

[Local] Open a bash shell

For all the steps below, we assume you are either on a compute node of DevCloud or on a local machine, unless explicitly stated otherwise.

Set up the environment (whenever a terminal is open)

cd $HOME/t2sp
source ./setenv.sh (devcloud|local) (fpga|gpu)

The options specify whether you are working on DevCloud or locally, and whether to use an FPGA or a GPU.
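Because the script is sourced into the current shell, it may help to validate the two options before sourcing it. The following is a minimal sketch, assuming a hypothetical helper named `t2sp_check_args` that is not part of T2SP; it checks only the option values listed above.

```shell
# Hypothetical helper (not part of T2SP): validate the two setenv.sh options.
t2sp_check_args() {
    case "$1" in
        devcloud|local) ;;
        *) echo "first option must be devcloud or local" >&2; return 1 ;;
    esac
    case "$2" in
        fpga|gpu) ;;
        *) echo "second option must be fpga or gpu" >&2; return 1 ;;
    esac
    return 0
}

# Example: a local machine with an FPGA.
# t2sp_check_args local fpga && cd "$HOME/t2sp" && source ./setenv.sh local fpga
```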

Build T2SP (whenever you change the source code)

cd $HOME/t2sp/Halide
make -j
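Note that GNU make's `-j` without a number places no limit on the number of parallel jobs, which can exhaust memory on smaller machines. A minimal sketch of a bounded alternative, assuming a hypothetical helper named `build_jobs` (not part of T2SP) that reports the number of available processors:

```shell
# Hypothetical helper (not part of T2SP): choose a job count for make.
# Falls back to 1 if the processor count cannot be determined.
build_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "$n"
}

# Example use:
# cd "$HOME/t2sp/Halide" && make -j"$(build_jobs)"
```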

Regression tests

Currently, the regression tests are for FPGAs only. On a machine with an FPGA, run:

cd $HOME/t2sp/t2s/tests/correctness
./test.sh

After testing, each sub-directory there will contain a success.txt and/or a failure.txt, which list the command lines for compiling and running every test. These tests are small examples one can play with.
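To get an overview of the results without opening each sub-directory, the marker files can be counted. The following is a minimal sketch, assuming a hypothetical helper named `summarize_tests` that is not part of T2SP; it relies only on the presence of the success.txt and failure.txt files described above and does not parse their contents.

```shell
# Hypothetical helper (not part of T2SP): count test sub-directories by the
# marker files that test.sh leaves behind, and name those with failures.
summarize_tests() {
    root="${1:-.}"
    passed=0
    failed=0
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        if [ -f "${dir}success.txt" ]; then
            passed=$((passed + 1))
        fi
        if [ -f "${dir}failure.txt" ]; then
            failed=$((failed + 1))
            echo "has failures: ${dir%/}"
        fi
    done
    echo "sub-directories with success.txt: $passed"
    echo "sub-directories with failure.txt: $failed"
}

# Example use:
# summarize_tests "$HOME/t2sp/t2s/tests/correctness"
```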

To remove all the temporary files generated during the regression testing:

./test.sh clean

Performance tests

The current release contains SGEMM, 2-D convolution, and Capsule convolution on Arria 10 FPGA and GEN 9.5 GPU. For every kernel, we write a single specification that gets mapped to the different kinds of hardware. This reflects our concept of "write a kernel once, and run with high performance across spatial and vector architectures".

Summary of throughput:

| Kernel | A10 | S10 | GEN 9.5 | GEN 12 |
| --- | --- | --- | --- | --- |
| SGEMM | 620 GFLOPS, 97% DSP efficiency | 1790 GFLOPS, 99% DSP efficiency | 410 GFLOPS, 90% machine peak | 2165 GFLOPS, 85% machine peak |
| 2-D convolution | 605 GFLOPS, 99% DSP efficiency | 1509 GFLOPS, 99% DSP efficiency | 421 GFLOPS, 92% machine peak | 2236 GFLOPS, 88% machine peak |
| Capsule convolution | 568 GFLOPS, 96% DSP efficiency | 885 GFLOPS, 56% DSP efficiency | 398 GFLOPS, 87% machine peak | 1850 GFLOPS, 73% machine peak |
| PairHMM | 41.8 GCups, 95% PE efficiency | 47.9 GCups, 93% PE efficiency | 4.25 GCups | 14.8 GCups |
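The percentages above can be related back to an approximate peak throughput. The following is a rough sketch under a stated assumption that this README does not confirm: that each percentage is the achieved throughput divided by the peak throughput. The helper name `implied_peak` is hypothetical, and the results are rounded estimates, not measured peaks.

```shell
# Assumption (not stated in the README): efficiency = achieved / peak,
# so implied peak = achieved * 100 / efficiency_percent.
implied_peak() {
    awk -v achieved="$1" -v pct="$2" 'BEGIN { printf "%.0f\n", achieved * 100 / pct }'
}

implied_peak 620 97     # SGEMM on A10, in GFLOPS
implied_peak 1790 99    # SGEMM on S10, in GFLOPS
```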

To reproduce the performance,

cd $HOME/t2sp/t2s/tests/performance

then

Note:

Features

The current release contains the following features:

Tutorials

A 10-minute video introduces the basic concept of T2SP. There is an initial version of a programming guide. There is also a set of tutorials at DevCloud.

Citation

If you use T2SP, please cite the following position paper:

@article{T2SP,
  author    = {Hongbo Rong},
  title     = {Programmatic Control of a Compiler for Generating High-performance Spatial Hardware},
  journal   = {CoRR},
  volume    = {abs/1711.07606},
  year      = {2017},
  url       = {http://arxiv.org/abs/1711.07606},
  archivePrefix = {arXiv},
  eprint    = {1711.07606},
  timestamp = {Mon, 13 Aug 2018 16:46:47 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1711-07606.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org},
  note      = {Open source available at https://github.com/IntelLabs/t2sp}
}

Publications

Acknowledgement