SpRegTiling / sparse-register-tiling

8 stars 2 forks source link

Warning

We intend on creating a more accessible api for integrating the SpMM kernels into other projects and recommend waiting for that (over working with this codebase) if possible.

Building and running from source

ensure your machine has avx512vl

lscpu | grep avx512vl

clone the repo

git clone https://github.com/SpRegTiling/sparse-register-tiling.git --recurse-submodules
cd sparse-register-tiling

If cloning submodules fails, e.g. due to github key setup, you may clone the the submodule manually by:

git clone https://github.com/LucasWilkinson/spmm-nano-kernels.git spmm_nano_kernels

Download DLMC

Download the dlmc dataset, this will download it to ../dlmc

sh download_dlmc.sh

Install python dependencies

pip3 install torch
pip3 install -e .
pip3 install pandas  \
     matplotlib \
     scipy \
     pyyaml
export PYTHONPATH=$(pwd):$PYTHONPATH

Install MKL 2021.4.0

wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB && \
    add-apt-repository -y "deb https://apt.repos.intel.com/oneapi all main" && \
    apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB && \
    rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB && \
    apt-get update -y && \
    apt-get install -y intel-oneapi-mkl-devel-2021.4.0

Generating executor/scheduler pairs (used in the paper)

from spmm_nano_kernels, run the following:

cd spmm_nano_kernels
python3 -m codegen.generate_ukernels
cd ..

Building the demo code

mkdir release-build
cmake -Brelease-build -DCMAKE_BUILD_TYPE=Release -DENABLE_AVX512=True .
make -j 16 -Crelease-build SPMM_demo

Benchmarking a matrix

python3 run_matrix.py -m ../dlmc/transformer/magnitude_pruning/0.8/body_decoder_layer_0_ffn_conv1_fully_connected.smtx -t 8 -b 512 -o results.csv -d

see:

python3 run_matrix.py --help

for more details

Code overview