choderalab / openmmtools

A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
http://openmmtools.readthedocs.io
MIT License
250 stars 81 forks source link

Problems with Exponentially distributed temperatures #733

Open amin-sagar opened 4 months ago

amin-sagar commented 4 months ago

Hello. I am trying to run some Temperature replica exchange simulations for small molecules. The relevant part of my script is as follows.

# Set up the temperature range for replicas
n_replicas = 25
min_temp = 300 * unit.kelvin
max_temp = 400 * unit.kelvin
temperatures = get_temperature_list(min_temp,max_temp,n_replicas)
#temperatures = [min_temp + (max_temp - min_temp) * (math.exp(float(i) / float(n_replicas-1)) - 1.0) / (math.e - 1.0)  for i in range(n_replicas)]
print (temperatures)
#temperatures = [min_temp + i * (max_temp - min_temp) / (n_replicas - 1) for i in range(n_replicas)]

# Define simulation parameters
collision_rate = 1.0 / unit.picoseconds
timestep = 4 * unit.femtoseconds
nsteps = 200  # Number of steps for each iteration of replica exchange
n_iterations = 10000  # Number of iterations for the replica exchange

# Create thermodynamic states and integrators for each replica
thermodynamic_states = []
samplers = []
for temp in temperatures:
    integrator = LangevinMiddleIntegrator(temp, collision_rate, timestep)
    thermodynamic_state = states.ThermodynamicState(system=system, temperature=temp)
    thermodynamic_states.append(thermodynamic_state)
    sampler_state = states.SamplerState(modeller.positions, box_vectors=modeller.topology.getPeriodicBoxVectors())
    samplers.append(mcmc.MCMCSampler(thermodynamic_state, sampler_state, integrator))

# Set up the replica exchange simulation
output_directory = "./output_7"
storage_file = f'{output_directory}/repex.nc'
reporter = MultiStateReporter(storage=storage_file, checkpoint_interval=200)
simulation = multistate.ReplicaExchangeSampler(mcmc_moves=mcmc.LangevinDynamicsMove(timestep=timestep, collision_rate=collision_rate, n_steps=nsteps),
                                               number_of_iterations=n_iterations)
simulation.create(thermodynamic_states=thermodynamic_states, sampler_states=[sampler.sampler_state for sampler in samplers], storage=reporter)
#Minimize
simulation.minimize()

# Run the replica exchange simulation
simulation.run()

The function to create temperatures is from cg_openmm repo

def get_temperature_list(min_temp, max_temp, num_replicas):
    """
        Given the parameters to define a temperature range as input, this function uses logarithmic spacing to generate a list of intermediate temperatures.

        :param min_temp: The minimum temperature in the temperature list.
        :type min_temp: `Quantity() <http://docs.openmm.org/development/api-python/generated/simtk.unit.quantity.Quantity.html>`_

        :param max_temp: The maximum temperature in the temperature list.
        :type max_temp: `Quantity() <http://docs.openmm.org/development/api-python/generated/simtk.unit.quantity.Quantity.html>`_

        :param num_replicas: The number of temperatures in the list.
        :type num_replicas: int

        :returns:
           - temperature_list ( 1D numpy array ( float * simtk.unit.temperature ) ) - List of temperatures

    """

    T_unit = min_temp.unit

    temperature_list = np.logspace(
        np.log10(min_temp.value_in_unit(T_unit)),
        np.log10(max_temp.value_in_unit(T_unit)),
        num=num_replicas
        )

    # Reassign units:
    temperature_list *= T_unit

    return temperature_list

On running the simulations, I get many lines like the following

Failed to reach a solution to within tolerance with hybr: trying next method
WARNING: Did not converge to within specified tolerance.
max_delta = 0.000000e+00, tol = 1.000000e-12, maximum_iterations = 10000, iterations completed = 9999
Failed to reach a solution to within tolerance with adaptive: trying next method
No solution found to within tolerance.
The solution with the smallest gradient 2.400500e+02 norm is hybr
Please exercise caution with this solution and consider alternative methods or a different tolerance.
Warning: The openmmtools.multistate API is experimental and may change in future releases
Failed to reach a solution to within tolerance with hybr: trying next method
WARNING: Did not converge to within specified tolerance.
max_delta = 0.000000e+00, tol = 1.000000e-12, maximum_iterations = 10000, iterations completed = 9999
Failed to reach a solution to within tolerance with adaptive: trying next method
No solution found to within tolerance.

However, if I use a linear temperature ladder

temperatures = [min_temp + i * (max_temp - min_temp) / (n_replicas - 1) for i in range(n_replicas)]

I don't get any such errors and the simulation finishes as expected. Also, with linear scaling I get no errors with 10 temperature points in the same temperature range while with exponentially distributed temperatures, I see these errors with even 25 temperature points.

I am quite sure I am doing something wrong but I can't figure it out. I will be really grateful for any suggestions.

Best, Amin.

ijpulidos commented 4 months ago

Thanks for the detailed report. I can tell that the warning message is coming from pymbar. What version are you using of pymbar in your environment? We have seen some instabilities with pymbar 4 and the jax backend to do the estimates (see for example in https://github.com/choderalab/pymbar/issues/419#issuecomment-1955191199 ). Maybe using the robust solver or downgrading to pymbar 3 could do the trick, if that makes sense.

amin-sagar commented 4 months ago

Thanks @ijpulidos I am using

pymbar                    4.0.3                hff52083_1    conda-forge
pymbar-core               4.0.3           py310h1f7b6fc_1    conda-forge

I was looking at the tutorial and code and I am sorry I couldn't find how to specify the robust kwargs in the simulations script. Do I need to modif the [openmmtools/openmmtools/multistate file?

I have noticed that this seems to happen if I have too many replicas, for example n_replicas = 25 min_Temp = 300 max_Temp = 400

doesn't work but

n_replicas = 22 min_Temp = 300 max_Temp = 600

works.

I would be really grateful if you can help me in including the robust analysis protocol in the simulation script.