Benchmark_SpMV_using_CSR5

Introduction

This is the source code for the paper

Weifeng Liu and Brian Vinter, "CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication". In Proceedings of the 29th ACM International Conference on Supercomputing (ICS '15), pp. 339-350, 2015.

Contact: Weifeng Liu and Brian Vinter (vinter at nbi.ku.dk).

Updates:
   (Jan 2017, avx512 and opencl): Added two versions: AVX-512 for Knights Landing Phi (KNL) and OpenCL for nVidia GPUs.
   (Jul 2016, phi): Fixed the same two issues that were fixed in the original AVX2 version (see the two avx2 entries below). Thanks Jan Philipp Ecker!
   (Jul 2016, avx2): Improved performance of y-vector update. Thanks Jan Philipp Ecker!
   (Jul 2016, avx2): Fixed a bug in processing small matrices. Thanks Jan Philipp Ecker!
   (Apr 2016, cuda): Fixed a bug in timing. Thanks Shigang Li!

CPU (AVX2) version

Execution

Set up the environment for the Intel C/C++ compilers, for example with source /opt/intel/composer_xe_2015.1.133/bin/compilervars.sh intel64,
Run make,
Run ./spmv example.mtx.
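
For a quick smoke test of the steps above, a very small input file can be generated by hand. The sketch below assumes that example.mtx is a Matrix Market coordinate file, as the .mtx extension suggests; the helper name write_tiny_mtx and the 3x3 matrix are hypothetical and are not part of this repository.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of this repository): writes a 3x3 matrix
// with 4 nonzeros in Matrix Market coordinate format.
// Returns true if the file was written successfully.
bool write_tiny_mtx(const std::string& path) {
    std::ofstream out(path);
    if (!out) return false;

    // Banner line: coordinate (sparse) layout, real values, no symmetry.
    out << "%%MatrixMarket matrix coordinate real general\n";
    // Size line: rows, columns, number of stored nonzeros.
    out << "3 3 4\n";
    // One nonzero per line: row, column, value. Indices are 1-based.
    out << "1 1 2.0\n";
    out << "2 2 3.0\n";
    out << "3 1 1.0\n";
    out << "3 3 4.0\n";

    return static_cast<bool>(out);
}
```

Calling write_tiny_mtx("tiny.mtx") and then running ./spmv tiny.mtx should exercise the same path as the example above, although the benchmark numbers for a matrix this small are unlikely to be meaningful.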

Tested environments

Intel Core i7-4770R CPU with Ubuntu 14.04 64-bit Linux installed.
Intel Xeon E5-2667 v3 dual-socket CPUs with Redhat 6.5 64-bit Linux installed.

Data type

Currently, only 64-bit double precision SpMV is supported.
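
For readers who want something to compare results against, the operation being benchmarked is y = A * x in double precision. The following is a minimal reference sketch on plain CSR, not CSR5 and not the code in this repository; the function name spmv_csr and its argument layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference y = A * x for a matrix stored in plain CSR (not CSR5).
// row_ptr has (rows + 1) entries; col_idx and val hold the nonzeros
// row by row, with 0-based column indices.
std::vector<double> spmv_csr(const std::vector<int>& row_ptr,
                             const std::vector<int>& col_idx,
                             const std::vector<double>& val,
                             const std::vector<double>& x) {
    assert(col_idx.size() == val.size());
    if (row_ptr.empty()) return std::vector<double>();

    const std::size_t rows = row_ptr.size() - 1;
    std::vector<double> y(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
        double sum = 0.0;
        // Accumulate the dot product of row i with x.
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
    return y;
}
```

A serial loop like this is only useful as a correctness baseline; floating-point results from a vectorized implementation may differ from it in the last few bits because the summation order differs.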

Intel Xeon Knights Landing Phi (KNL) AVX-512 version

Execution

Set up the environment for the Intel C/C++ compilers, for example with source /opt/intel/composer_xe_2015.1.133/bin/compilervars.sh intel64,
Run make,
Run ./spmv example.mtx.

Tested environments

Intel Xeon Knights Landing Phi (KNL) 7210 in a host with CentOS 7.2 64-bit Linux installed.

Data type

Currently, only 64-bit double precision SpMV is supported.

nVidia GPU (CUDA) version

Execution

Set CUDA path in the Makefile,
Run make,
Run ./spmv example.mtx.

Tested environments

nVidia GeForce GTX 980 GPU in a host with Ubuntu 14.04 64-bit Linux installed.
nVidia GeForce GT 650M GPU in a host with Mac OS X 10.9.2 installed.

Data type

The code supports both double precision and single precision SpMV. Use make VALUE_TYPE=double for double precision or make VALUE_TYPE=float for single precision.
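
One common way such a build-time switch is wired up is sketched below. It assumes the Makefile forwards the choice to the compiler as a preprocessor definition (for example -DVALUE_TYPE=float); that mechanism, the default, and the names value_type and value_bytes are assumptions for illustration and have not been checked against this repository's sources.

```cpp
#include <cstddef>

// Assumption: the build passes -DVALUE_TYPE=double or -DVALUE_TYPE=float.
// Hypothetical fallback to double when nothing is passed on the command line.
#ifndef VALUE_TYPE
#define VALUE_TYPE double
#endif

// All value arrays (matrix nonzeros, x and y) would use this one type.
typedef VALUE_TYPE value_type;

// Bytes needed to store nnz nonzero values at the selected precision.
inline std::size_t value_bytes(std::size_t nnz) {
    return nnz * sizeof(value_type);
}
```

With a scheme like this, switching precision requires a full rebuild, because the type is fixed at compile time rather than chosen at run time.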

AMD GPU (OpenCL) version

Execution

Set OpenCL path in the Makefile,
Run make,
Run ./spmv example.mtx.

Tested environments

AMD Radeon R9-290X GPU in a host with Ubuntu 14.04 64-bit Linux installed.

Data type

The code supports both double precision and single precision SpMV. Use make VALUE_TYPE=double for double precision or make VALUE_TYPE=float for single precision.

nVidia GPU (OpenCL) version

Execution

Set OpenCL path in the Makefile,
Run make,
Run ./spmv example.mtx.

Tested environments

nVidia Pascal GTX 1060 GPU in a host with Ubuntu 15.04 64-bit Linux installed.

Data type

The code supports both double precision and single precision SpMV. Use make VALUE_TYPE=double for double precision or make VALUE_TYPE=float for single precision.

Intel Xeon Phi (KNC) version

Execution

Set up the environment for the Intel C/C++ compilers, for example with source /opt/intel/composer_xe_2015.1.133/bin/compilervars.sh intel64,
Run make,
Run ./spmv example.mtx.

Tested environments

Intel Xeon Phi 5110p in a host with Redhat 6.5 64-bit Linux installed.

Data type

Currently, only 64-bit double precision SpMV is supported.
