celeritas-project / celeritas

Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
https://celeritas-project.github.io/celeritas/user/index.html
Other

Add option to specify maximum number of substeps in field propagator #1236

Closed amandalund closed 4 months ago

amandalund commented 4 months ago

This allows the user to set the maximum number of field propagator substeps, which can have a significant impact on performance due to improved load balancing (see the plot below for the 32 ttbar CMS run3 results with three different max_substep values: 10, our default of 100, and Geant4's default of 1000).
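As a rough illustration of what the cap does (a minimal sketch, not the Celeritas implementation), the propagator can be thought of as a loop that stops after `max_substeps` substeps even if the requested step length has not been covered. The names `propagate`, `PropagationResult`, and the fixed-length substeps are hypothetical stand-ins for the real field integrator. The assumption here is that a track that hits the cap simply resumes on the next step iteration, so no single track can occupy its thread for more than `max_substeps` substeps in one iteration.

```python
from dataclasses import dataclass
from itertools import repeat


@dataclass
class PropagationResult:
    distance: float  # distance actually travelled in this call
    substeps: int    # number of substeps taken
    looping: bool    # True if the cap was hit before the step finished


def propagate(step_length, substep_lengths, max_substeps):
    """Advance along one step in substeps, giving up after max_substeps.

    `substep_lengths` is an iterable of hypothetical substep lengths that
    stands in for what a field integrator would produce.
    """
    remaining = step_length
    count = 0
    for ds in substep_lengths:
        if count >= max_substeps or remaining <= 0.0:
            break
        remaining = max(remaining - ds, 0.0)
        count += 1
    return PropagationResult(step_length - remaining, count, remaining > 0.0)


# A tightly curling track: many short substeps are needed to cover the step.
short = propagate(8.0, repeat(0.125), max_substeps=10)
full = propagate(8.0, repeat(0.125), max_substeps=100)
print(short)  # cap reached: only part of the step is covered
print(full)   # step completed well below the cap
```

With the smaller cap the call returns early and reports the track as unfinished. That bounds the work done per track in one iteration, at the cost of needing more iterations for such tracks.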

I'm not really sure how best to safely access the user-specified max_substeps in the SimParams, which needs it to calculate the looping threshold values. In particular, through accel apps other than celer-g4 I'm not sure how to validate that the max_substeps in the setup options matches what's in the along-step factory. I'm also not entirely sure if changing this parameter has any consequence on the physics (msc?).

[Figure: speedup-a100]

sethrj commented 4 months ago

Also, good work figuring out this speedup! I wonder if we should adjust the default parameter based on this finding. I want to see whether this removes the spikes in the time-per-step plot for CMS 2018+field, which we have always suspected are due to looping.

[Figure: time-per-step-cms2018+field+msc]

amandalund commented 4 months ago

I think adding propagation params is a good idea. It would definitely be interesting to see how changing the max substeps affects that plot... I also need to experiment with adjusting this parameter in some of our other test problems.

amandalund commented 4 months ago

@sethrj a different plot, but you can see how changing the max substeps affects the step times/variation in times. It definitely looks like those spikes are from looping tracks.

[Figure: field-maxsubsteps]

sethrj commented 4 months ago

Whoa why does the maximum 10 cut off at 6000 but the 100 live much longer? Is it because we're clearing track slots for the looping tracks earlier?

amandalund commented 4 months ago

It might just be a statistical fluctuation (though I was also surprised that the number of step iterations was so much higher than with 10 or 1000)... I tried running the same problem with a different seed and it only took 5900 steps.

amandalund commented 4 months ago

@sethrj a couple more observations:

[Figure: total-step-iter]

The last 20 steps of one of these tracks:

track 16419589, vol 1985, mat 406, particle 1, energy 3.9923808482957686e-02, step 6949, looping 0, step length 1.3858658842098418e-02
track 16419589, vol 2751, mat 381, particle 1, energy 3.9850266767637074e-02, step 6950, looping 0, step length 4.7244629498538651e-02
track 16419589, vol 1985, mat 406, particle 1, energy 3.9814529158514730e-02, step 6951, looping 0, step length 1.8629256656536680e-01
track 16419589, vol 2751, mat 381, particle 1, energy 3.8257980851032464e-02, step 6952, looping 0, step length 4.9934856957369343e-02
track 16419589, vol 1985, mat 406, particle 1, energy 3.7893562258098423e-02, step 6953, looping 0, step length 7.3445125203530637e-01
track 16419589, vol 2751, mat 381, particle 1, energy 3.1588001694221136e-02, step 6954, looping 0, step length 3.0593442567106464e-02
track 16419589, vol 1985, mat 406, particle 1, energy 3.1159181280340318e-02, step 6955, looping 0, step length 3.0731719856365525e-01
track 16419589, vol 2751, mat 381, particle 1, energy 2.8875578978385202e-02, step 6956, looping 0, step length 9.8504012780081442e-03
track 16419589, vol 1985, mat 406, particle 1, energy 2.8764161298133990e-02, step 6957, looping 0, step length 3.1567462632067628e-01
track 16419589, vol 2752, mat 381, particle 1, energy 2.6652691197892056e-02, step 6958, looping 0, step length 1.0000004250248819e-06
track 16419589, vol 2752, mat 381, particle 1, energy 2.6652682076712491e-02, step 6959, looping 0, step length 2.1192793848139056e-13
track 16419589, vol 1985, mat 406, particle 1, energy 2.6652682076710559e-02, step 6960, looping 0, step length 2.0257033488857515e-01
track 16419589, vol 2751, mat 381, particle 1, energy 2.3507798832861856e-02, step 6961, looping 0, step length 1.4031100172018471e-01
track 16419589, vol 2751, mat 381, particle 1, energy 2.2150049606347002e-02, step 6962, looping 0, step length 1.7791966459317970e-01
track 16419589, vol 2751, mat 381, particle 1, energy 2.0742165110699450e-02, step 6963, looping 0, step length 2.1894447090401701e-02
track 16419589, vol 2751, mat 381, particle 1, energy 1.8300777647732802e-02, step 6964, looping 0, step length 5.7666532505831737e-02
track 16419589, vol 2751, mat 381, particle 1, energy 1.6257806981944168e-02, step 6965, looping 0, step length 1.7311275991309827e-01
track 16419589, vol 2751, mat 381, particle 1, energy 1.3513392719547898e-02, step 6966, looping 0, step length 1.3732918277251066e-01
track 16419589, vol 2751, mat 381, particle 1, energy 9.6172212560103210e-03, step 6967, looping 0, step length 1.6222676351235218e-01
track 16419589, vol 2751, mat 381, particle 1, energy 3.9073150446839004e-03, step 6968, looping 0, step length 4.9231860431891476e-02

sethrj commented 4 months ago

Damn, it's looping between multiple volumes... maybe there's some heuristic with average step length/average energy loss per step that could be used...
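To make that idea concrete, here is a minimal sketch that parses log lines in the format posted above and computes the average step length and average energy loss per step over a window of steps. The function names and the thresholds in `looks_stuck` are hypothetical placeholders, not tuned values and not anything that exists in Celeritas. The sample is the first four lines of the log above.

```python
import re
from statistics import mean

# First four lines of the track log posted above.
SAMPLE = """\
track 16419589, vol 1985, mat 406, particle 1, energy 3.9923808482957686e-02, step 6949, looping 0, step length 1.3858658842098418e-02
track 16419589, vol 2751, mat 381, particle 1, energy 3.9850266767637074e-02, step 6950, looping 0, step length 4.7244629498538651e-02
track 16419589, vol 1985, mat 406, particle 1, energy 3.9814529158514730e-02, step 6951, looping 0, step length 1.8629256656536680e-01
track 16419589, vol 2751, mat 381, particle 1, energy 3.8257980851032464e-02, step 6952, looping 0, step length 4.9934856957369343e-02
"""

LINE_RE = re.compile(
    r"track (?P<track>\d+), vol (?P<vol>\d+), mat (?P<mat>\d+), "
    r"particle (?P<particle>\d+), energy (?P<energy>[-+.\deE]+), "
    r"step (?P<step>\d+), looping (?P<looping>\d+), "
    r"step length (?P<length>[-+.\deE]+)"
)


def parse_steps(text):
    """Extract volume, energy, step count, and step length from each line."""
    return [
        {
            "vol": int(m["vol"]),
            "energy": float(m["energy"]),
            "step": int(m["step"]),
            "length": float(m["length"]),
        }
        for m in LINE_RE.finditer(text)
    ]


def summarize(steps):
    """Average step length and energy loss per step over the window."""
    losses = [a["energy"] - b["energy"] for a, b in zip(steps, steps[1:])]
    return {
        "num_steps": steps[-1]["step"],
        "mean_step_length": mean(s["length"] for s in steps),
        "mean_energy_loss": mean(losses),
        "distinct_volumes": len({s["vol"] for s in steps}),
    }


def looks_stuck(summary, min_steps=5000, max_mean_length=0.5):
    """Hypothetical heuristic: many steps taken, each covering little distance.

    Both thresholds are arbitrary placeholders for illustration.
    """
    return (
        summary["num_steps"] >= min_steps
        and summary["mean_step_length"] <= max_mean_length
    )


summary = summarize(parse_steps(SAMPLE))
print(summary)
print(looks_stuck(summary))
```

Note that every line in the log has `looping 0`, so any heuristic along these lines would have to work independently of that flag.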