cp2k / dbcsr

DBCSR: Distributed Block Compressed Sparse Row matrix library
https://cp2k.github.io/dbcsr/
GNU General Public License v2.0

OpenMP/offload backend #214

Open hfp opened 5 years ago

hfp commented 5 years ago

OpenMP 4.0 introduced offloading to attached devices, and OpenMP 4.5 extended it (future versions of OpenMP will likely evolve the offload capability only by adding more advanced features on top of a mature baseline). An OpenMP-based backend that can offload to GPUs (and potentially other PCI-attached devices) could therefore be implemented. This would be applicable to Fortran and C/C++ (even for the code within the offload region). However, since the ACC interface is already C-based, an even wider set of toolchains is usable: LLVM/Clang with the NVIDIA offload plugin can be built from source (tested on x86 with an NVIDIA GPU), and the IBM XL compilers support OpenMP 4.5 and work right away for NVIDIA GPUs.
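For illustration, a minimal sketch of what an OpenMP 4.5 offload region in C can look like. The function name `multiply_add_offload` and the kernel itself are hypothetical and not taken from DBCSR or its ACC interface; the sketch only shows the `target` construct and `map` clauses that such a backend would build on. Built without OpenMP support, the pragma is ignored and the loop runs on the host.

```c
#include <stddef.h>

/* Hypothetical example kernel (not part of DBCSR): c[i] += a[i] * b[i].
 * The map clauses copy a and b to the device and copy c in both directions.
 * If no device is available, OpenMP runs the region on the host. */
void multiply_add_offload(const double *a, const double *b, double *c, int n)
{
  int i;
#pragma omp target teams distribute parallel for \
    map(to: a[0:n], b[0:n]) map(tofrom: c[0:n])
  for (i = 0; i < n; ++i) {
    c[i] += a[i] * b[i];
  }
}
```

Calling `multiply_add_offload(a, b, c, n)` with three host arrays of length `n` updates `c` in place. With Clang, offloading is typically enabled with `-fopenmp` plus an `-fopenmp-targets=...` option naming the device triple; the exact triple depends on the toolchain build.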

hfp commented 5 years ago

https://github.com/hfp/dbcsr/tree/develop/src/acc/openmp (work in progress)

hfp commented 4 years ago

PR #260 can be reviewed. Known limitations are listed in the PR's description and will preferably be addressed in subsequent PRs.