celeritas-project / celeritas

Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
https://celeritas-project.github.io/celeritas/user/index.html

Make accel "auto flush" threshold configurable #1231

Closed amandalund closed 1 month ago

amandalund commented 1 month ago

Currently, our local transporter offloads tracks to Celeritas once num_track_slots tracks have been buffered. In some cases, however, it may be beneficial to buffer more tracks than there are available track slots before starting the stepping loop. This change lets the user specify the "auto flush" threshold, which still defaults to the number of track slots. I haven't tested many configurations, but as one example I saw a 15-25% performance improvement for the CMS Run 3, 32 ttbar event problem on an A100 by decreasing the number of track slots from 2^19 to 2^16 while increasing the buffer size to 2^20 (plot: speedup-a100).
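For readers unfamiliar with the buffering pattern, here is a minimal sketch of a buffer with a configurable flush threshold that falls back to the number of track slots. It is not the Celeritas implementation: `Track`, `TrackBuffer`, and the convention that `auto_flush = 0` means "use the default" are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffered primary; not a Celeritas type.
struct Track
{
    int id;
};

// Illustrative buffer that flushes once `auto_flush` tracks have accumulated.
class TrackBuffer
{
  public:
    // Assumed convention: auto_flush == 0 selects the default threshold,
    // which is the number of track slots.
    explicit TrackBuffer(std::size_t num_track_slots, std::size_t auto_flush = 0)
        : auto_flush_(auto_flush == 0 ? num_track_slots : auto_flush)
    {
        assert(num_track_slots > 0);
    }

    // Buffer one track; returns true if this push triggered a flush.
    bool push(Track t)
    {
        buffer_.push_back(t);
        if (buffer_.size() >= auto_flush_)
        {
            this->flush();
            return true;
        }
        return false;
    }

    // Placeholder for handing the buffered tracks to the stepping loop.
    void flush()
    {
        num_flushed_ += buffer_.size();
        buffer_.clear();
    }

    std::size_t auto_flush() const { return auto_flush_; }
    std::size_t buffered() const { return buffer_.size(); }
    std::size_t num_flushed() const { return num_flushed_; }

  private:
    std::size_t auto_flush_;
    std::vector<Track> buffer_;
    std::size_t num_flushed_{0};
};
```

With this sketch, `TrackBuffer(1 << 16, 1 << 20)` would mirror the configuration described above: 2^16 track slots, with a flush only after 2^20 tracks have been buffered.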

esseivaju commented 1 month ago

@amandalund If you buffer more primaries than we have track slots, where is that overflow handled?

amandalund commented 1 month ago

The primaries are converted to track initializers and stored in the initializer buffer while they wait to be transported, so as long as the "auto flush" threshold is less than the initializer capacity, there shouldn't be any overflow.
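The constraint described here can be expressed as a simple setup-time check. The following is only a sketch: the `BufferOptions` struct, its field names, and the `threshold_fits` helper are hypothetical and are not taken from the Celeritas API. The comparison is strict, following the "less than" wording above literally.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical option names, for illustration only.
struct BufferOptions
{
    std::size_t num_track_slots;       // tracks stepped concurrently
    std::size_t auto_flush;            // buffered tracks that trigger a flush
    std::size_t initializer_capacity;  // size of the initializer buffer
};

// Returns true if the flush threshold is strictly below the initializer
// capacity, so buffered primaries cannot overflow the initializer buffer.
inline bool threshold_fits(BufferOptions const& opts)
{
    assert(opts.num_track_slots > 0);
    return opts.auto_flush < opts.initializer_capacity;
}
```

Note that the threshold may exceed `num_track_slots` and still pass this check; only the initializer capacity bounds it.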