When does CUDA kernel reset forces to zero?

openmm / openmm

OpenMM is a toolkit for molecular simulation using high performance GPU code.

1.49k stars 523 forks source link

When does CUDA kernel reset forces to zero? #2256

Closed scychon closed 3 years ago

scychon commented 5 years ago

I'm writing a custom plugin to do a modified velocity-verlet integration with a new thermostat scheme. I wanted to optimize the number of force calculation so that it only calculate forces after position updates. I've also added recalculating forces before the first velocity move (beginning of execution) only for the following cases (1) barostat changes system box, (2) atoms are reordered, (3) state change event occurs (Integrator::stateChanged). Although I'm still seeing that the forces are occasionally reset to zero after some thousands of steps. Is there any other event that I should be aware of ? (To force the CUDA context to recalculate the forces?)

peastman commented 5 years ago

Those are the main cases. More generally, whenever you call context.updateContextState() you should check the return value. If it returns true, that means the forces are no longer valid and need to be recalculated. As long as nothing is calling updateContextState(), calcForcesAndEnergy(), or reorderAtoms(), I can't think of any reason the array of forces would get modified.

scychon commented 5 years ago

I think I found a bug with drude plugin. I was not checking the return value of updateContextState(), since it always returned true for me. It turns out that drude plugin has the deprecated format of "void updateContextState(ContextImpl& context)", which should be "void updateContextState(ContextImpl& context, bool& forcesInvalid)" to override the ForceImpl::updateContextState. After fixing this, I get proper behavior of updateContextState and now can recalculate forces only when there's changes in the state.

It might be good idea to check other plugins as well since they might have the old format as well.

peastman commented 5 years ago

Thanks, I should fix that. It isn't actually a bug though. ForceImpl provides a default implementation of updateContextState() with the new signature, which calls the old version and then sets forcesInvalid to true. So if a plugin implements the old version rather than the new one, everything still works correctly. We just lose the opportunity to do some optimizations, since the invoking code will always think the forces have been invalidated.

scychon commented 5 years ago

Thanks. While fixing the above mentioned gives proper return of updateContextState(), I still have issues with forces occasionally reset to zero. I found this happens after I call sub-forcegroup's energy on the python api, something like

context.getState(getEnergy=True, groups=2**j).getPotentialEnergy()

I'm not sure how I can let the plugin know of this event has happened on the api. Any ideas?

peastman commented 5 years ago

You can call context.getLastForceGroups() to see what set of force groups the forces were last evaluated for. If that doesn't match what you want, then recalculate them.

CustomIntegrator gets even fancier. It caches the forces in its own arrays so that it can restore them as needed. This makes a big difference for multiple time step integrators where you're switching back and forth between force groups. But for your purpose, that probably isn't needed.

scychon commented 5 years ago

Thanks ! context.getLastForceGroups() did the job ! It was sufficient for me to check whether the last forcegroup was all (-1) and I might not need cache the forces since this happens only when the reporters log each energy groups. (~ 1 per 1000 steps).