paboyle / Grid

Data parallel C++ mathematical object library
GNU General Public License v2.0
155 stars 111 forks

Compact Exponential Cloverterm on GPU #414

Closed fjosw closed 2 years ago

fjosw commented 2 years ago

This patch speeds up the construction of the exponential Clover term in the compact layout on GPU architectures. The exponentiation is now performed on the accelerator, and the inverse is obtained by computing $\exp(-Clover)$ instead of performing an explicit matrix inversion. In all test cases I looked at, the runtime of the constructor is now dominated by calls to fillCloverYZ, etc.
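To illustrate why the explicit inversion can be skipped: a matrix commutes with its own negative, so $\exp(A)\exp(-A) = 1$, and the inverse of the exponentiated term is the exponential of the negated term. The following is a minimal standalone sketch of that identity, not the code in this patch. It does not use Grid types; `Mat`, `exp_taylor`, `example_hermitian` and the other names are hypothetical, the 6x6 block size is an assumption chosen for illustration, and the truncated Taylor series is just one simple way to evaluate the exponential.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical stand-in for one dense complex block; not a Grid type.
constexpr std::size_t N = 6;
using Cplx = std::complex<double>;
using Mat = std::array<std::array<Cplx, N>, N>;

Mat identity() {
  Mat m{};  // value-initialised, i.e. all entries are zero
  for (std::size_t i = 0; i < N; ++i) m[i][i] = 1.0;
  return m;
}

Mat multiply(const Mat& a, const Mat& b) {
  Mat c{};
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t k = 0; k < N; ++k)
      for (std::size_t j = 0; j < N; ++j) c[i][j] += a[i][k] * b[k][j];
  return c;
}

Mat scale(const Mat& a, double s) {
  Mat c = a;
  for (auto& row : c)
    for (auto& x : row) x *= s;
  return c;
}

// Truncated Taylor series: exp(A) ~ sum_{k=0}^{order} A^k / k!.
// Adequate for the small-norm example below; not tuned for general input.
Mat exp_taylor(const Mat& a, int order = 30) {
  Mat result = identity();
  Mat term = identity();  // holds A^k / k!
  for (int k = 1; k <= order; ++k) {
    term = scale(multiply(term, a), 1.0 / k);
    for (std::size_t i = 0; i < N; ++i)
      for (std::size_t j = 0; j < N; ++j) result[i][j] += term[i][j];
  }
  return result;
}

// Largest absolute entry of (M - 1).
double distance_from_identity(const Mat& m) {
  double d = 0.0;
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = 0; j < N; ++j)
      d = std::max(d, std::abs(m[i][j] - (i == j ? Cplx(1.0) : Cplx(0.0))));
  return d;
}

// Small Hermitian example matrix with made-up entries.
Mat example_hermitian() {
  Mat a{};
  for (std::size_t i = 0; i < N; ++i) {
    a[i][i] = 0.1 * static_cast<double>(i + 1);
    for (std::size_t j = i + 1; j < N; ++j) {
      a[i][j] = Cplx(0.05 * static_cast<double>(i + 1),
                     0.02 * static_cast<double>(j + 1));
      a[j][i] = std::conj(a[i][j]);
    }
  }
  return a;
}
```

Calling `multiply(exp_taylor(a), exp_taylor(scale(a, -1.0)))` on `a = example_hermitian()` should reproduce the identity up to rounding. The appeal on an accelerator is that the inverse then reuses the same per-site exponentiation routine with a sign flip, rather than needing a separate inversion path.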

For the standard clover term and for the exponential clover term with non-periodic boundary conditions, the inverse is still computed explicitly on the CPU. We could consider using Eigen on the GPU for this operation, as discussed yesterday, but I have not yet looked into the compatibility of Eigen 3.4 with the various CUDA versions.

paboyle commented 2 years ago

looks clean