hpc4cmb / toast

Time Ordered Astrophysics Scalable Tools

Redesigning OpenMP target offload loops #676

Closed nestordemeure closed 1 year ago

nestordemeure commented 1 year ago

Precomputing the maximum interval size makes it possible to fully collapse the triple loop in most operators, which led to significantly improved computation density on the GPU.
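As a rough illustration of the idea (not the actual TOAST kernels), the sketch below assumes the triple loop runs over detectors, intervals, and samples within each interval. Because intervals have different lengths, the innermost bound depends on the middle loop and the nest cannot be collapsed directly. Using the precomputed maximum length as a uniform bound makes the nest rectangular, at the cost of a guard that skips the padding iterations. The `Interval` struct, `max_interval_size`, and `scale_in_intervals` are hypothetical names chosen for this example, and the intervals are assumed not to overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical interval type: half-open sample range [first, last).
struct Interval {
    int64_t first;
    int64_t last;
};

// Longest interval length, computed once on the host before the kernel runs.
int64_t max_interval_size(const std::vector<Interval>& intervals) {
    int64_t max_size = 0;
    for (const Interval& iv : intervals) {
        max_size = std::max(max_size, iv.last - iv.first);
    }
    return max_size;
}

// Hypothetical operator: scale every in-interval sample of every detector.
// data is a flattened [n_det][n_samp] array; intervals must not overlap.
void scale_in_intervals(std::vector<double>& data, int64_t n_det, int64_t n_samp,
                        const std::vector<Interval>& intervals, double factor) {
    assert(static_cast<int64_t>(data.size()) == n_det * n_samp);
    const int64_t n_view = static_cast<int64_t>(intervals.size());
    const int64_t max_size = max_interval_size(intervals);
    double* d = data.data();
    const Interval* iv = intervals.data();

    // All three bounds are loop-invariant, so the nest is rectangular and can
    // be collapsed into one large iteration space. When compiled without
    // OpenMP support the pragma is ignored and the loops run serially.
    #pragma omp target teams distribute parallel for collapse(3) \
        map(tofrom: d[0:n_det * n_samp]) map(to: iv[0:n_view])
    for (int64_t det = 0; det < n_det; ++det) {
        for (int64_t view = 0; view < n_view; ++view) {
            for (int64_t off = 0; off < max_size; ++off) {
                const int64_t samp = iv[view].first + off;
                // Intervals shorter than max_size leave padding iterations;
                // the guard turns those into no-ops.
                if (samp < iv[view].last) {
                    d[det * n_samp + samp] *= factor;
                }
            }
        }
    }
}
```

The trade-off in this sketch is that some threads do no work when interval lengths vary widely, so the benefit depends on how uniform the intervals are.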