PrincetonUniversity / SPECFEMPP

SPECFEM++ is a complete rewrite of the SPECFEM suite of packages (SPECFEM2D, SPECFEM3D, SPECFEM3D_GLOBE) in C++
https://specfem2d-kokkos.readthedocs.io/en/latest/
GNU General Public License v3.0
16 stars 9 forks

Update elemental kernel to use ThreadVectorLoops #90

Closed Rohit-Kakodkar closed 10 months ago

Rohit-Kakodkar commented 1 year ago

Currently we use TeamThreadRange to distinguish elements inside elemental kernels, which limits the number of threads per block we can use to 32 for 5 x 5 quadrature in 2D. A better approach would be to use ThreadVectorRange to distinguish elements, which would allow an arbitrary number of threads inside a block.
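To make the proposed loop structure concrete, here is a minimal serial sketch of one plausible reading of it: the outer loop walks the quadrature points of a 5 x 5 element and the inner loop walks the elements of a chunk. In a Kokkos kernel the outer loop would presumably become a TeamThreadRange and the inner loop a ThreadVectorRange. The names `Visit` and `visit_chunk` are hypothetical and do not come from the SPECFEM++ code base, and this plain C++ stand-in only illustrates the index mapping, not the actual kernel.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one (element, quadrature point) visit.
struct Visit {
  int element;  // element index within the chunk
  int iz;       // quadrature index along z
  int ix;       // quadrature index along x
};

// Serial stand-in for a two-level team loop (assumed mapping, not the
// actual SPECFEM++ kernel).
std::vector<Visit> visit_chunk(int n_elements, int ngll) {
  std::vector<Visit> visits;
  visits.reserve(static_cast<std::size_t>(n_elements) * ngll * ngll);
  // Outer loop: one iteration per quadrature point
  // (would map to the threads of a team).
  for (int xz = 0; xz < ngll * ngll; ++xz) {
    const int iz = xz / ngll;
    const int ix = xz % ngll;
    // Inner loop: elements of the chunk
    // (would map to vector lanes, so elements are told apart here).
    for (int e = 0; e < n_elements; ++e) {
      visits.push_back({e, iz, ix});
    }
  }
  return visits;
}
```

With this layout the number of quadrature points (25 for 5 x 5) fixes the extent of the outer loop only, while the number of elements handled together is a free parameter of the inner loop.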

Rohit-Kakodkar commented 10 months ago

Final Commit for optimization

Summary of findings: