celeritas-project / celeritas

Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
https://celeritas-project.github.io/celeritas/user/index.html
Other

Improve physics memory utilization and performance #1292

Open sethrj opened 1 week ago

sethrj commented 1 week ago

In an earlier iteration of Celeritas, we pushed all physics "interactions" to a single vector (one per track) and then applied them all simultaneously. When we changed the code so that the InteractionApplier updates the track, we saw slightly worse performance but simpler logic.

With @esseivaju's async allocators, I think we should consider revisiting this: asynchronously allocate space for secondaries and interactions between the pre-post and post steps, have a post-post kernel update all the tracks with their interactions at once, and deallocate the buffers afterward. This would also slightly improve the logic in the PreStepExecutor, which requires launching on all threads to reset the secondary initializer count. I think it should also improve kernel occupancy (and reduce code size) for the model kernels.
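
To make the proposed flow concrete, here is a rough host-side C++ sketch of the "record, then apply all at once" pattern. It is not Celeritas code: `Track`, `Interaction`, `record_interaction`, `apply_interactions`, and `run_step` are hypothetical names, the toy "model" is invented for illustration, and a step-scoped `std::vector` stands in for the asynchronously allocated device buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins (not the Celeritas types).
struct Interaction
{
    bool changed = false;     // true if a model acted on this track
    double energy = 0;        // post-interaction energy
    int num_secondaries = 0;  // secondaries produced by the interaction
};

struct Track
{
    double energy = 0;
    bool alive = true;
    int num_secondaries = 0;
};

// "Model" phase: store the result in the per-track buffer slot
// without modifying any track state.
inline void record_interaction(std::vector<Interaction>& buffer,
                               std::size_t track_id,
                               double new_energy,
                               int num_secondaries)
{
    assert(track_id < buffer.size());
    Interaction& slot = buffer[track_id];
    slot.changed = true;
    slot.energy = new_energy;
    slot.num_secondaries = num_secondaries;
}

// "Post-post" phase: a single pass applies every recorded interaction.
// Returns the total number of secondaries produced in the step.
inline int apply_interactions(std::vector<Track>& tracks,
                              std::vector<Interaction> const& buffer)
{
    assert(tracks.size() == buffer.size());
    int total = 0;
    for (std::size_t i = 0; i < tracks.size(); ++i)
    {
        Interaction const& in = buffer[i];
        if (!in.changed)
            continue;  // no model acted on this track
        tracks[i].energy = in.energy;
        tracks[i].alive = in.energy > 0;
        tracks[i].num_secondaries = in.num_secondaries;
        total += in.num_secondaries;
    }
    return total;
}

// One step: the buffer lives only for the duration of the step, which
// stands in for allocating before and deallocating after the post step.
inline int run_step(std::vector<Track>& tracks)
{
    std::vector<Interaction> buffer(tracks.size());
    for (std::size_t i = 0; i < tracks.size(); ++i)
    {
        // Toy model (assumption): tracks with energy >= 1 lose half
        // their energy and emit one secondary.
        if (tracks[i].energy >= 1.0)
            record_interaction(buffer, i, tracks[i].energy / 2, 1);
    }
    return apply_interactions(tracks, buffer);
}
```

In this sketch, because every buffer slot starts out as "unchanged" when it is allocated, there is no separate pass over all tracks to reset per-track secondary counts. Whether that carries over to the real PreStepExecutor depends on details not shown here.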