PASSIONLab / CombBLAS

The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics.
Other
64 stars 22 forks source link

Copyright

Combinatorial BLAS, Copyright (c) 2020, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy) and University of California, Santa Barbara. All rights reserved.

If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office.

NOTICE. This Software was developed under funding from the U.S. Department of Energy and the U.S. Government consequently retains certain rights. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, distribute copies to the public, prepare derivative works, and perform publicly and display publicly, and to permit other to do so.

This material is based upon work supported by the National Science Foundation under Grant No. 0709385 and by the Department of Energy, Office of Science, ASCR Contract No. DE-AC02-05CH11231. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF) and the Department of Energy (DOE). This software is released under the following license.

Introduction

The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics. This repo has the code that represents version 2.0 of the software.

Download

If running on a Mac, we recommend using gcc compilers instead of clang (which has issues with OpenMP). For that, all you need to do is to replace step (4) above with

Test inputs are separately downloadable here. Extract them inside the _build directory you've just created with the command "tar -xzvf testdata_combblas1.6.1.tgz"

Requirements: You need a recent C++ compiler (gcc version 4.8+, Intel version 15.0+ and compatible), a compliant MPI implementation, and C++11 Standard library (libstdc++ that comes with g++ has them). The recommended tarball uses the CMake build system, but only to build the documentation and unit-tests, and to automate installation. The chances are that you're not going to use any of our sample applications "as-is", so you can just modify them or imitate their structure to write your own application by just using the header files. There are very few binary libraries to link to, and no configured header files. Like many high-performance C++ libraries, the Combinatorial BLAS is mostly templated.

Documentation: This is a beta implementation of the Combinatorial BLAS Library in written in C++ with MPI and OpenMP for parallelism. It is purposefully designed for distributed memory platforms though it also runs in uniprocessor and shared-memory (such as multicores) platforms. It contains efficient implementations of novel data structures/algorithms as well as reimplementations of some previously known data structures/algorithms for convenience. More details can be found in the accompanying paper [1]. One of the distinguishing features of the Combinatorial BLAS is its decoupling of parallel logic from the sequential parts of the computation, making it possible to implement new formats and plug them in without changing the rest of the library.

For I/O purposes, the implementation supports both a tuples format very similar to the Matrix Market and the Matrix Market format itself. We recommend using the Matrix Market version and associated ParallelReadMM() functions. We encourage in-memory generators for faster benchmarking.

The main data structure is a distributed sparse matrix ( SpParMat <IT,NT,DER> ) which HAS-A sequential sparse matrix ( SpMat <IT,NT> ) that can be implemented in various ways as long as it supports the interface of the base class (currently: SpTuples, SpCCols, SpDCCols).

For example, the standard way to declare a parallel sparse matrix A that uses 32-bit integers for indices, floats for numerical values (nonzeros), SpDCCols <int,float> for the underlying sequential matrix operations is:

Sparse and dense vectors are distributed along all processors. This is very space efficient and provides good load balance for SpMSV (sparse matrix-sparse vector multiplication).

New since version 1.6:

New in version 1.6:

New in version 1.5:

New in version 1.4:

The supported operations (a growing list) are:

All the binary operations can be performed on matrices with different numerical value representations. The type-traits mechanism will take care of the automatic type promotion, and automatic MPI data type determination.

Some features it uses:

Important Sequential classes:

Important Parallel classes:

Applications implemented using Combinatorial BLAS:

Performance results of the first two applications can be found in the design paper [1]; Graph 500 results are in a recent BFS paper [5]. The sparse matrix indexing, assignment, and multiplication results using 2D algorithms can be found in [6]. Performance of filtered graph algorithms (BFS and MIS) are reported in [8]. Performance of the 3D SpGEMM algorithm can be found in [10]

A subset of test programs demonstrating how to use the library (under ReleaseTests):

Citation: Please cite the current CombBLAS 2.0 paper [2] if you end up using the Combinatorial BLAS in your research in 2021 or later. If you are simply referencing the design principles and the concepts behind CombBLAS, then you can either cite [1] or [2], depending on the concept.