choderalab / openmmtools

A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
http://openmmtools.readthedocs.io
MIT License

test_overlap NaNs #435

Closed adw62 closed 2 years ago

adw62 commented 5 years ago

Hi,

I'm still getting NaNs for the TIP3P water box when running test_overlap in TestAbsoluteAlchemicalFactory. I think the NaNs are related to replacing the reaction field, as removing this step solves the issue. The issue also does not seem to appear on the Reference platform; I'm not sure what is going on here.

This is a minimal example:

from openmmtools import testsystems, forcefactories
from openmmtools.constants import kB
from simtk import openmm, unit
import copy

reference = testsystems.WaterBox(dispersion_correction=False, switch=False, nonbondedMethod=openmm.app.CutoffPeriodic)
reference_system = reference.system
positions = reference.positions
forcefactories.replace_reaction_field(reference_system, return_copy=False)

if __name__ == '__main__':
    for name in ['CUDA', 'CPU', 'Reference']:
        platform = openmm.Platform.getPlatformByName(name)
        temperature = 300.0 * unit.kelvin
        pressure = 1.0 * unit.atmospheres
        collision_rate = 5.0 / unit.picoseconds
        timestep = 2.0 * unit.femtoseconds
        kT = kB * temperature

        # Add a barostat if possible.
        reference_system = copy.deepcopy(reference_system)
        if reference_system.usesPeriodicBoundaryConditions():
            reference_system.addForce(openmm.MonteCarloBarostat(pressure, temperature))

        # Create integrators.
        integrator = openmm.LangevinIntegrator(temperature, collision_rate, timestep)

        # Create contexts.
        reference_context = openmm.Context(reference_system, integrator, platform)

        # Collect simulation data.
        reference_context.setPositions(positions)
        try:
            integrator.step(1000)
            print('{} passed'.format(name))
        except Exception as err:
            print('{} failed with error: {}'.format(name, err))

Cheers, Alex

andrrizzi commented 5 years ago

Thanks for the snippet! It looks like you're not using the alchemical factory here, so I assume the NaN happens with the code in both master and alchemical-regions?

adw62 commented 5 years ago

Yeah, for me the above snippet is failing in master and alchemical-regions.

mjw99 commented 2 years ago

I am also seeing NaN issues with test_alchemy.py:TestAbsoluteAlchemicalFactory.test_overlap using:

OpenMMtools = 0.20.3 
OpenMM = 7.5.1
nvidia-driver kernel driver 460.73.01 

Interestingly, it occurs only on a K40m, not on a V100, and only on the CUDA platform.

The above example code is similar to TestAbsoluteAlchemicalFactory.test_overlap, but it does not carry out a minimisation. If you add one in:

from openmmtools import testsystems, forcefactories
from openmmtools.constants import kB
from simtk import openmm, unit
import copy

reference = testsystems.WaterBox(dispersion_correction=False, switch=False, nonbondedMethod=openmm.app.CutoffPeriodic)
reference_system = reference.system
positions = reference.positions
forcefactories.replace_reaction_field(reference_system, return_copy=False)

if __name__ == '__main__':
    for name in ['CUDA', 'CPU', 'Reference']:
        platform = openmm.Platform.getPlatformByName(name)
        temperature = 300.0 * unit.kelvin
        pressure = 1.0 * unit.atmospheres
        collision_rate = 5.0 / unit.picoseconds
        timestep = 2.0 * unit.femtoseconds
        kT = kB * temperature

        # Add a barostat if possible.
        reference_system = copy.deepcopy(reference_system)
        if reference_system.usesPeriodicBoundaryConditions():
            reference_system.addForce(openmm.MonteCarloBarostat(pressure, temperature))

        # Create integrators.
        integrator = openmm.LangevinIntegrator(temperature, collision_rate, timestep)

        # Create contexts.
        reference_context = openmm.Context(reference_system, integrator, platform)

        # Collect simulation data.
        reference_context.setPositions(positions)
        openmm.LocalEnergyMinimizer.minimize(reference_context, maxIterations=100)
        try:
            integrator.step(1000)
            print('{} passed'.format(name))
        except Exception as err:
            print('{} failed with error: {}'.format(name, err))

no NaNs are seen.

Returning to the original problem, I suspect the WaterBox test system needs more of a clean-up after being created. I found that setting maxIterations=500 in TestAbsoluteAlchemicalFactory.minimize() fixes all the TestAbsoluteAlchemicalFactory.test_overlap failures on the K40m.

Also, I am not sure if the units of the tolerance are being correctly converted to OpenMM's internal units within that method.
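
For anyone wanting to check the unit question by hand, here is a minimal sketch of the conversion in plain Python. It assumes the minimiser's tolerance is a force expressed internally in kJ/mol/nm, which should be checked against the installed OpenMM version. The helper name and the unit table are hypothetical and are not part of OpenMM or openmmtools.

```python
# Illustrative only: a hand-rolled conversion of a force tolerance to
# kJ/mol/nm (ASSUMED here to be the minimiser's internal unit).
# Neither the table nor the helper below exists in OpenMM or openmmtools.
KJ_PER_KCAL = 4.184        # thermochemical calorie
ANGSTROM_PER_NM = 10.0

# Multiplicative factors that take a value in the named unit to kJ/mol/nm.
TO_KJ_PER_MOL_NM = {
    "kJ/mol/nm": 1.0,
    "kcal/mol/A": KJ_PER_KCAL * ANGSTROM_PER_NM,
}


def tolerance_in_internal_units(value, unit_name):
    """Hypothetical helper: convert a force tolerance to kJ/mol/nm."""
    try:
        factor = TO_KJ_PER_MOL_NM[unit_name]
    except KeyError:
        raise ValueError("unsupported unit: {}".format(unit_name))
    return value * factor


if __name__ == "__main__":
    # A tolerance written as 1 kcal/mol/A, expressed in kJ/mol/nm.
    print(tolerance_in_internal_units(1.0, "kcal/mol/A"))
```

If a tolerance given in kcal/mol/Å were passed through without conversion, it would be roughly 40 times tighter than intended, so the minimiser could hit its iteration cap before converging.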

jchodera commented 2 years ago

@ijpulidos: Could you tackle this?

mjw99 commented 2 years ago

I think this can be closed; thanks again for all the help/interaction on this. (edit; typo)

mikemhenry commented 2 years ago

Fixed in #534