celeritas-project / celeritas

Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
https://celeritas-project.github.io/celeritas/user/index.html

Use geometry length scale to determine a minimum step #994

Open sethrj opened 1 year ago

sethrj commented 1 year ago

The single-precision support added in #988 fails in many cases because the step size is small enough that the floating point position does not change over the course of a step. Celeritas requires all steps to be strictly positive to avoid stuck tracks, and all positive steps need to change the physical geometry state. The problem is that the step length $s$ is small compared to the spatial position $x$: the inequality

$$\frac{s}{x} < \epsilon_\textrm{mach}$$

implies that with floating point arithmetic,

$$\tilde x + \tilde s = \tilde x$$

i.e., no change in position takes place. One way to counter this is to have a minimum step size based on the position in the local reference frame. I can think of a few ways to make a lower bound for the current step:

This minimum step size should also be accounted for in the field substeppers, since the accumulated additions in those will also be susceptible to roundoff error based on the position.
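As a minimal sketch of the roundoff condition above: the helper names `step_is_absorbed` and `min_step` are hypothetical (they are not Celeritas functions), and scaling the bound by machine epsilon is only one assumed way to pick a lower limit from a length scale.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true if adding step `s` to coordinate `x`
// leaves `x` unchanged in precision T (the step is lost to roundoff).
template<class T>
bool step_is_absorbed(T x, T s)
{
    // volatile forces the sum to be rounded to T before comparing
    volatile T moved = x + s;
    return moved == x;
}

// Hypothetical lower bound on the step: for a positive coordinate
// magnitude `length_scale`, length_scale * epsilon is at least one unit
// in the last place, so adding it always changes the coordinate.
template<class T>
T min_step(T length_scale)
{
    assert(length_scale > 0);
    return length_scale * std::numeric_limits<T>::epsilon();
}
```

For example, `step_is_absorbed<float>(1000.0f, 1.0e-5f)` is true while the same call in `double` is false, and a step of `min_step<float>(1000.0f)` is not absorbed. In practice the length scale would presumably come from the local volume extents rather than a single coordinate.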

sethrj commented 1 year ago

@whokion happy for any comments you have here 😄 @paulromano have you done any exploration of single-precision geometry?

whokion commented 1 year ago

Where are these small steps happening (i.e., are they from the physics stepping or only inside the field propagator as segments of a step)? I assume the latter (otherwise, the linear propagator should experience the same problem and/or it is a general tracking problem). If the problem is only from the field propagator, the relative tolerance (as a function of some length scale) may not be a good choice, as the field integration should be controlled by the relative error and a small step over the tolerance should not be segmented further. Should the tolerance of geometry serve as the cut for entering the field propagator (i.e., if s < tolerance, use the linear propagator)? MSC is another potential place, but again the small step limitation should serve the same role.

sethrj commented 1 year ago

@whokion The linear propagator is also seeing this (charged particles, no fields). I'm assuming it's due to the MSC step limiting or other range limits for low-energy particles. The field substepping is probably another issue on top of this 😅

paulromano commented 1 year ago

@sethrj I haven't personally tried doing geometry in single precision as it becomes a lot harder. @pshriwise may be able to comment on his experience as I believe he has tried doing some single and/or mixed precision geometry work

whokion commented 1 year ago

Okay. If MSC is the primary source of the issue, we may adjust UrbanMscParameters::limit_min_fix (currently, 1e-9 * units::centimeter) for the single precision mode and test how it goes first. For the small returning range limit from low energy particles, we should limit the step (so, locally deposit energy and kill them) based on the tracking cut (absolute length, but relative energy) and the lowest electron energy (used in the energy loss calculation), which can be easily configured and tested. Anyway, we may need to categorize where those small steps come from and investigate from there rather than introducing another ad hoc parameter for the single precision mode.

sethrj commented 11 months ago

Interestingly, when I ran CMS2018 (uniform field + msc) with assertions on, I got an assertion failure on one of the two CPU wildstyle runs:

internal assertion failed: `track.make_geo_view().pos() != orig_pos` at `PropagationApplier.hh:116`

so we're even taking too-small steps for double precision to count. We definitely need some sort of minimum based on the local volume extents.

sethrj commented 11 months ago

Another thing we should totally do in the propagators is to use the local coordinate system. Instead of operating on the global position and rotating/translating "down", which loses accuracy if the local universe is smaller than the global one, we should do propagation in the local coordinate system, since we know we won't cross boundaries or exit the cell. The global position can be updated at the end of the propagator call.

This should vastly reduce loss of accuracy in the substepper.
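A one-dimensional sketch of why this helps, under simplifying assumptions: the local frame is a pure translation by `origin` (no rotation), and the function names `propagate_global` and `propagate_local` are hypothetical, not part of the Celeritas propagators.

```cpp
#include <cassert>

// Accumulate n substeps directly on the global coordinate.
// When the global coordinate is large, each small substep can be
// rounded away entirely.
inline float propagate_global(float origin, float local_start,
                              float substep, int n)
{
    assert(n >= 0);
    float pos = origin + local_start;
    for (int i = 0; i < n; ++i)
    {
        pos += substep;
    }
    return pos;
}

// Accumulate n substeps on the local coordinate and translate to the
// global frame once, at the end of the call.
inline float propagate_local(float origin, float local_start,
                             float substep, int n)
{
    assert(n >= 0);
    float local = local_start;
    for (int i = 0; i < n; ++i)
    {
        local += substep;
    }
    return origin + local;
}
```

With `origin = 10000.0f`, `local_start = 0.5f`, and 1000 substeps of `1.0e-4f`, the global accumulation never moves (each substep is below half the float spacing near 10000), so it ends about 0.1 short of the expected 10000.6. The local accumulation is only off by the single final rounding, which is below 1e-3.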