Closed adw62 closed 2 years ago
Thanks for the snippet! It looks like you're not using the alchemical factory here, so I assume the NaN happens with the code in both master and alchemical-regions?
Yeah, for me the above snippet is failing in master and alchemical-regions.
I am also seeing NaN issues with test_alchemy.py:TestAbsoluteAlchemicalFactory.test_overlap using:
OpenMMtools = 0.20.3
OpenMM = 7.5.1
nvidia-driver kernel driver 460.73.01
Interestingly, it occurs only on a K40m, not on a V100, and only on the CUDA platform.
The above example code is similar to TestAbsoluteAlchemicalFactory.test_overlap, but it does not carry out a minimisation. If you add one in:
from openmmtools import testsystems, forcefactories
from openmmtools.constants import kB
from simtk import openmm, unit
import copy

reference = testsystems.WaterBox(dispersion_correction=False, switch=False, nonbondedMethod=openmm.app.CutoffPeriodic)
reference_system = reference.system
positions = reference.positions
forcefactories.replace_reaction_field(reference_system, return_copy=False)

if __name__ == '__main__':
    for name in ['CUDA', 'CPU', 'Reference']:
        platform = openmm.Platform.getPlatformByName(name)
        temperature = 300.0 * unit.kelvin
        pressure = 1.0 * unit.atmospheres
        collision_rate = 5.0 / unit.picoseconds
        timestep = 2.0 * unit.femtoseconds
        kT = kB * temperature
        # Add a barostat if possible.
        reference_system = copy.deepcopy(reference_system)
        if reference_system.usesPeriodicBoundaryConditions():
            reference_system.addForce(openmm.MonteCarloBarostat(pressure, temperature))
        # Create integrators.
        integrator = openmm.LangevinIntegrator(temperature, collision_rate, timestep)
        # Create contexts.
        reference_context = openmm.Context(reference_system, integrator, platform)
        # Collect simulation data.
        reference_context.setPositions(positions)
        openmm.LocalEnergyMinimizer.minimize(reference_context, maxIterations=100)
        try:
            integrator.step(1000)
            print('{} passed'.format(name))
        except Exception as err:
            print('{} failed with error: {}'.format(name, err))
no NaNs are seen.
Returning to the original problem, I suspect the WaterBox test system needs more of a clean-up after being created. I found that setting maxIterations=500 in TestAbsoluteAlchemicalFactory.minimize() fixes all the TestAbsoluteAlchemicalFactory.test_overlap failures on the K40m.
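For anyone who wants to confirm that a longer minimisation really removes the NaNs on their own hardware, here is a minimal sketch that minimises a context and then checks that the potential energy is finite before any dynamics are run. The helper names `energy_is_finite` and `minimize_and_check` are hypothetical and not part of openmmtools; the default of 500 iterations is simply the value reported above, and OpenMM is imported lazily using the `simtk` import path from this thread.

```python
import math


def energy_is_finite(energy_kj_per_mol):
    """Return True if the energy (plain float, kJ/mol) is neither NaN nor infinite."""
    return math.isfinite(energy_kj_per_mol)


def minimize_and_check(context, max_iterations=500):
    """Minimise `context`, then report whether its potential energy is finite.

    Hypothetical helper for illustration only. Returns (is_finite, energy),
    where energy is a plain float in kJ/mol.
    """
    # Imported here so that the pure-Python check above works without OpenMM.
    from simtk import openmm, unit

    openmm.LocalEnergyMinimizer.minimize(context, maxIterations=max_iterations)
    state = context.getState(getEnergy=True)
    energy = state.getPotentialEnergy().value_in_unit(unit.kilojoules_per_mole)
    return energy_is_finite(energy), energy
```

In the snippet above, this could be called as `ok, energy = minimize_and_check(reference_context)` in place of the direct minimiser call, printing `energy` per platform to see whether the K40m diverges before or after minimisation.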
Also, I am not sure if the units of the tolerance are being correctly converted to OpenMM's internal units within that method.
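To illustrate why the unit question matters, here is a small pure-Python sketch of what goes wrong if a tolerance expressed in kcal/mol is passed on as a bare number to code that assumes kJ/mol. The function `tolerance_to_kj_per_mol` is hypothetical, and the assumption that the minimiser wants kJ/mol-based units should be checked against the documentation for the OpenMM version in use. With `simtk.unit`, the equivalent would be calling `value_in_unit(...)` on the quantity with the expected unit.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie, exact by definition

# Conversion factors to kJ/mol for the units this sketch knows about.
_TO_KJ_PER_MOL = {
    "kJ/mol": 1.0,
    "kcal/mol": KJ_PER_KCAL,
}


def tolerance_to_kj_per_mol(value, unit_name):
    """Convert a tolerance to a bare float in kJ/mol (hypothetical helper).

    Raises ValueError for unknown units rather than silently passing the
    number through, which is the failure mode being illustrated.
    """
    try:
        factor = _TO_KJ_PER_MOL[unit_name]
    except KeyError:
        raise ValueError("unknown unit: {!r}".format(unit_name))
    return value * factor
```

A tolerance of 1.0 kcal/mol corresponds to 4.184 kJ/mol, so stripping the unit without converting would make the minimiser roughly four times stricter than intended.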
@ijpulidos: Could you tackle this?
I think this can be closed; thanks again for all the help/interaction on this. (edit; typo)
Fixed in #534
Hi,
I'm still having issues with NaNs for the TIP3P water box when running test_overlap in TestAbsoluteAlchemicalFactory. I think the NaNs are related to replacing the reaction field, as removing this step solves the issue. Also, the issue does not seem to appear on the Reference platform; I'm not sure what is going on here.
This is a minimal example:
Cheers, Alex