choderalab / openmmtools

A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
http://openmmtools.readthedocs.io
MIT License
244 stars 76 forks source link

Difference between ParallelTempering and ReplicaExchangeSampler? #734

Open amin-sagar opened 2 months ago

amin-sagar commented 2 months ago

Hello Everyone. I tried setting up Temperature REMD simulations using two scripts. The first one is using ReplicaExchangeSampler

# Set up the temperature range for replicas
n_replicas = 20
min_temp = 300 * unit.kelvin
max_temp = 600 * unit.kelvin
temperatures = get_temperature_list(min_temp,max_temp,n_replicas)
print (temperatures)

# Define simulation parameters
collision_rate = 1.0 / unit.picoseconds
timestep = 4 * unit.femtoseconds
nsteps = 100  # Number of steps for each iteration of replica exchange
n_iterations = 5000  # Number of iterations for the replica exchange

# Create thermodynamic states and integrators for each replica
thermodynamic_states = []
samplers = []
for temp in temperatures:
    integrator = LangevinMiddleIntegrator(temp, collision_rate, timestep)
    thermodynamic_state = states.ThermodynamicState(system=system, temperature=temp)
    thermodynamic_states.append(thermodynamic_state)
    sampler_state = states.SamplerState(modeller.positions, box_vectors=modeller.topology.getPeriodicBoxVectors())
    samplers.append(mcmc.MCMCSampler(thermodynamic_state, sampler_state, integrator))

# Set up the replica exchange simulation
output_directory = "./output_test_2"
storage_file = f'{output_directory}/repex.nc'

reporter = MultiStateReporter(storage=storage_file, checkpoint_interval=200)
simulation = multistate.ReplicaExchangeSampler(mcmc_moves=mcmc.LangevinDynamicsMove(timestep=timestep, collision_rate=collision_rate, n_steps=nsteps),number_of_iterations=n_iterations)
simulation.create(thermodynamic_states=thermodynamic_states, sampler_states=[sampler.sampler_state for sampler in samplers], storage=reporter)
#Minimize
simulation.minimize()

# Run the replica exchange simulation
simulation.run()

The second one is using Parallel Tempering

reference_state = states.ThermodynamicState(system=system, temperature=min_temp)

move = mcmc.GHMCMove(timestep=4.0*unit.femtoseconds, n_steps=100)
simulation = ParallelTemperingSampler(mcmc_moves=move, number_of_iterations=5000)

output_directory = "./output_test_2_PT"
storage_path = f'{output_directory}/repex.nc'
reporter = MultiStateReporter(storage_path, checkpoint_interval=50)
simulation.create(reference_state,
                  states.SamplerState(modeller.positions,box_vectors=modeller.topology.getPeriodicBoxVectors()),
                  reporter, min_temperature=min_temp,
                  max_temperature=max_temp, n_temperatures=n_replicas)

simulation.minimize()
simulation.run()

The speeds from the two approaches are very different. With the first one (ReplicaExchangeSampler), I get

- iteration: 200
  percent_complete: 4.0
  mbar_analysis:
    free_energy_in_kT: 14027.979550649832
    standard_error_in_kT: 2.7343672189224604
    number_of_uncorrelated_samples: 38.440345764160156
    n_equilibrium_iterations: 113
    statistical_inefficiency: 2.3152756690979004
  timing_data:
    iteration_seconds: 0.6556503772735596
    average_seconds_per_iteration: 0.6574437845891444
    estimated_time_remaining: '0:52:36.387610'
    estimated_localtime_finish_date: 2024-Jun-27-12:10:54
    estimated_total_time: '0:54:47.218923'
    ns_per_day: 1051.344641476763

While with parallel tempering, I get

- iteration: 200
  percent_complete: 4.0
  mbar_analysis:
    free_energy_in_kT: 17299.512525427737
    standard_error_in_kT: 15.82197375264733
    number_of_uncorrelated_samples: 199.0
    n_equilibrium_iterations: 3
    statistical_inefficiency: 1.0
  timing_data:
    iteration_seconds: 1.767251968383789
    average_seconds_per_iteration: 1.7719121003270748
    estimated_time_remaining: '2:21:46.949994'
    estimated_localtime_finish_date: 2024-Jun-27-18:07:35
    estimated_total_time: '2:27:39.560502'
    ns_per_day: 390.08707027420405

Is this expected because of the difference of the kind of MC moves? Or am I doing something wrong which makes the results not comparable?

Best, Amin.

amin-sagar commented 3 weeks ago

Hello Everyone. Can someone help me out here please? Am I doing something wrong here or is this difference expected? Best, Amin.

ijpulidos commented 3 weeks ago

@amin-sagar you were right with your initial suspicion. The MC moves are different and they use different integrators, this probably explains the difference in performance. I'd suggest trying to use the same MC moves and integrators.