gundam-organization / gundam

GUNDAM, for Generalized and Unified Neutrino Data Analysis Methods, is a suite of applications for performing statistical analyses with different purposes and setups.
GNU Lesser General Public License v2.1

LTS 1.8.x: Resolve Issue 530 -- GPU sum efficiency #533

ClarkMcGrew closed 1 month ago

ClarkMcGrew commented 1 month ago

This is a "low-hanging fruit" fix that doubles the speed of the likelihood calculation on the GPU. The old histogram summing algorithm used atomic addition because it is very simple, even though atomic addition is not recommended on a GPU. The new code uses a more standard, but more complex, algorithm based on interleaved sums. The raw summing algorithm is about 10x faster, which results in an overall doubling of the speed.
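To illustrate the difference between the two summing styles, here is a minimal CPU-side sketch in C++. It is not the GUNDAM kernel: the names `scatterSum` and `interleavedSum` are hypothetical, and the stride-halving tree shown is only one common form of what "interleaved sums" may refer to. On a GPU, each iteration of the inner loop in `interleavedSum` would be handled by a separate thread, with a barrier between strides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is an illustration, not GUNDAM's code.

// Atomic-style baseline: every entry adds straight into its bin.
// On a GPU each += would be an atomic addition, so entries that
// land in the same bin contend for the same memory address.
std::vector<double> scatterSum(const std::vector<double>& weights,
                               const std::vector<std::size_t>& bin,
                               std::size_t nBins) {
    assert(weights.size() == bin.size());
    std::vector<double> hist(nBins, 0.0);
    for (std::size_t i = 0; i < weights.size(); ++i) {
        assert(bin[i] < nBins);
        hist[bin[i]] += weights[i];
    }
    return hist;
}

// Tree-style sum of the entries belonging to one bin.
// The input is copied into a scratch buffer (taken by value) and
// combined pairwise at a stride that halves on every step.
double interleavedSum(std::vector<double> scratch) {
    if (scratch.empty()) return 0.0;

    // Pad with zeros to a power of two so every step halves cleanly.
    std::size_t n = 1;
    while (n < scratch.size()) n *= 2;
    scratch.resize(n, 0.0);

    for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
        // On a GPU: one thread per value of i, then a barrier.
        for (std::size_t i = 0; i < stride; ++i) {
            scratch[i] += scratch[i + stride];
        }
    }
    return scratch[0];
}
```

The tree needs about log2(n) dependent steps for n entries and no atomic operations, at the cost of a scratch buffer and more bookkeeping than the scatter loop.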

Note: The new code is still limited by the global memory bandwidth and doesn't make efficient use of the GPU blocks and warps. It could be optimized further at the expense of complexity. However, since the sum is no longer a bottleneck, the current version keeps the code simple.
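As a rough consistency check on the quoted numbers, Amdahl's law can be applied. This assumes that only the histogram sum changed, that the 10x and 2x figures are exact, and that $f$ (a symbol introduced here, not in the code) is the fraction of the original likelihood time spent in the sum:

```latex
\begin{align}
  \frac{1}{(1 - f) + f/10} &= 2
  \quad\Longrightarrow\quad
  1 - \tfrac{9}{10} f = \tfrac{1}{2}
  \quad\Longrightarrow\quad
  f = \tfrac{5}{9} \approx 0.56, \\
  \text{sum share afterwards} &= \frac{f/10}{(1 - f) + f/10}
  = \frac{1/18}{1/2} = \tfrac{1}{9} \approx 0.11 .
\end{align}
```

Under those assumptions the sum drops from roughly 56% of the likelihood time to roughly 11%, which is consistent with it no longer being the bottleneck.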