choderalab / openmmtools

A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
http://openmmtools.readthedocs.io
MIT License
244 stars 76 forks source link

Ensemble validation issue for temperature replica exchange #480

Open cwalker7 opened 4 years ago

cwalker7 commented 4 years ago

I am running temperature replica exchange simulations on small coarse-grained oligomer systems (containing a single chain), and have been using an ensemble validation check on the energies for thermodynamic state pairs. This is from the physical-validation python package (https://physical-validation.readthedocs.io/en/latest/userguide.html#ensemble-validation). I am finding that for a system with 12 replicas spaced logarithmically from 200K to 300K, using LangevinDynamicsMove with replica_mixing_scheme='swap-all', n_steps=100, collision frequency=1/ps, and time_step=10fs, the ratio ln(P(U)_1/P(U)_2) is consistently about 4-5 standard deviations off from the analytical value. The physical validation code uses the pymbar DetectEquilibration function to trim the energies for each state.

I don't think it's an issue of the time step being too large, as reducing to 1fs (with n_steps set to 1000) still leads to 4-5 standard deviations for the above system.

I've tried a number of different replica exchange parameter variations: Upon increasing n_steps to 1000, increasing the friction to 10/ps, or setting the mixing scheme to 'swap-neighbor', there is much better agreement (~1 standard deviation or less from analytical value). Those are comparable to the results for replica_mixing_scheme=None (no exchanges), and for independent simulations run in openmm.

My question is - is this a matter of simply not equilibrating each replica long enough after an exchange, or could there be an underlying issue somewhere with the exchanges?

These plots summarize the results, where quantiles is standard deviation from the analytical ratio: ensemble_check_langevinmove_coll_rate1_vs_time_step ensemble_check_langevinmove_step10fs_collision1_vs_exch_freq ensemble_check_langevinmove_step10fs_exch_freq1k_vs_coll_rate ensemble_check_langevinmove_step10fs_exch_freq100_coll_rate1_vs_mixing_scheme ensemble_check_langevinmove_step10fs_exch_freq100_vs_coll_rate

cwalker7 commented 4 years ago

My code for running the simulations is something like this:

`import numpy as np
from simtk import unit 
from openmmtools.multistate import MultiStateReporter, MultiStateSampler, ReplicaExchangeSampler
from openmmtools.multistate import ReplicaExchangeAnalyzer

total_simulation_time=20.0*unit.nanosecond
simulation_time_step=10*unit.femtosecond
exchange_frequency=100

simulation_steps = int(np.floor(total_simulation_time / simulation_time_step))
exchange_attempts = int(np.floor(simulation_steps / exchange_frequency))

num_replicas = len(temperature_list)
sampler_states = list()
thermodynamic_states = list()    

for temperature in temperature_list:
    thermodynamic_state = openmmtools.states.ThermodynamicState(
        system=system, temperature=temperature
    )
    thermodynamic_states.append(thermodynamic_state)
    sampler_states.append(
        openmmtools.states.SamplerState(positions)
    )

move = openmmtools.mcmc.LangevinDynamicsMove(
    timestep=simulation_time_step,
    collision_rate=1.0/unit.picosecond,
    n_steps=exchange_frequency,
    reassign_velocities=False,
)

simulation = ReplicaExchangeSampler(mcmc_moves=move, number_of_iterations=exchange_attempts)

reporter = MultiStateReporter("output.nc", checkpoint_interval=1)
simulation.create(thermodynamic_states, sampler_states, reporter)

simulation.minimize()
simulation.run()

analyzer = ReplicaExchangeAnalyzer(reporter)

(
    replica_energies,
    unsampled_state_energies,
    neighborhoods,
    replica_state_indices,
) = analyzer.read_energies()

# Convert replica_energies to state_energies
n_particles = np.shape(reporter.read_sampler_states(iteration=0)[0].positions)[0]
T_unit = temperature_list[0].unit
temps = np.array([temp.value_in_unit(T_unit) for temp in temperature_list])
beta_k = 1 / (kB.value_in_unit(unit.kilojoule_per_mole/T_unit) * temps)
n_replicas = len(temperature_list)
for k in range(n_replicas):
    replica_energies[:, k, :] *= beta_k[k] ** (-1)

total_steps = len(replica_energies[0][0])
state_energies = np.zeros([n_replicas, total_steps])

for step in range(total_steps):
    for state in range(n_replicas):
        state_energies[state, step] = replica_energies[
            np.where(replica_state_indices[:, step] == state)[0], 0, step
        ]

# Physical validation is run on 'state_energies' for adjacent temperature pairs`
mrshirts commented 4 years ago

One comment: in theory, if I recall correctly, it should not be necessary to equilibrate after a replica switch- it should be equilibrated automatically if the swap is accepted.

mrshirts commented 4 years ago

@jchodera @andrrizzi @hannahbrucemacdonald any thoughts on this?

jchodera commented 4 years ago

Apologies for the long delay in responding to these excellent questions---we're totally bandwidth-saturated in supporting the COVID Moonshot. While we may not be able to look into this ourselves in the short term, we are about to put out a job ad to start recruiting some more software scientists to help maintain our infrastructure and field issues like this, so things will improve in the next couple of months.

It's important to note that LangevinDynamicsMove uses the openmm.LangevinIntegrator, a leapfrog form of VVVR. By contrast LangevinSplittingDynamicsMove uses BAOAB in the velocity Verlet form (positions and velocities are synchronized). We very much recommend LangevinSplittingDynamicsMove if you want to ensure accurate statistics, or even better, one of the Metropolized variants.

It's been a while since I've looked at the checkensemble stuff, but if it is either (1) using velocities or instantaneous temperature, or (2) sensitive to very small-scale configuration space density errors (such as harmonic bond length distributions), then it is likely the leapfrog VVVR LangevinDynamicsMove will perform poorly. As a quick test, I'd suggest trying LangevinSplittingDynamicsMove and its Metropolized form, GHMCMove, to see if that reports improvements.

For the record, I don't believe checkensemble is an appropriate way of measuring deviation from the desired configuration space density, so we do not intend to investigate this in detail based on that measure alone. The unit tests use batteries of tests based on deviations from expectations and free energies for harmonic oscillators, and we have developed more sensitive ways to measure deviation of integrators from the desired configuration or phase space distributions.

mrshirts commented 4 years ago

I totally understand the issues with support here. We will try LangevinSplittingDynamicsMove (or @cwalker7 , have you already tried it? I think you tried GHMCMove?

For the record, I don't believe checkensemble is an appropriate way of measuring deviation from the desired configuration space density

Can you elaborate? I mean, it literally checks whether the ratio of the sampled distributions at different temperatures are consistent with the Boltzmann distribution. If it doesn't pass this check, then the statistical mechanics are, by definition, wrong. And it works with all systems, with any size (the paper you linked seemed to have issues with a n=500 water box). and just requires lists of energies/enthalpies. Harmonic oscillator tests by definition omit anharmonic contributions, which could be problematic when working with integrators with quadratic errors. And checkensemble works with NPT as well as NVT. I definitely appreciate that the other checks are more sensitive for configuration space distributions, and can therefore pick up other more subtle errors (i.e. ergodicity), but not obeying the Boltzmann distribution is not a great feature.

using velocities or instantaneous temperature

It is not using instantaneous temperature, that is a concept that is not really applicable to statistical mechanics! The only real temperature is the specified thermostat temperature of the bath, which is what is used. @cwalker7, can you double-check and see if these tests are using the total energy or the potential energy? Some of the integrators push error into the velocities, so it would be better to use quantities that only test the configurations. I mean, both should be satisfied, but configuration is more important.

(2) sensitive to very small-scale configuration space density errors

It's a good question exactly which configuration space density errors might be responsible; we don't have a great sense. We have relatively soft harmonic bonds (I think 1000 kJ/nm^2 for the coarse-grained models systems tested here - @cwalker7 can you confirm?). However, the effect is independent of time step over a large range (10 fs to 1 fs), so I doubt it has to do with harmonic bond length distributions, as those errors should be quite highly dependent on the time step.

Note that the errors seem to occur primarily with short exchange intervals, and that just running MD without exchanges works pretty much fine, so it does appear to have something to do with the exchanges (and any associated re-randomization?) itself, not the integrator per se.

@cwalker7, maybe you can do the "no exchange" simulations with the integrators discussed above - I suspect they will all pass the tests (maybe some slightly better than others), but the exchange simulations will all have the same problems. Would be good to see!

jchodera commented 4 years ago

Note that the errors seem to occur primarily with short exchange intervals, and that just running MD without exchanges works pretty much fine,

Interesting. Have you compared MultiStateSampler (same infrastructure, no exchanges) with ReplicaExchangeSampler? I'd also compare replica_mixing_scheme="swap-all" with "swap-neighbors" with None in ReplicaExchangeSampler.

Finally, are you running replica exchange serially or in parallel via MPI? We know there's a weird bug in exchanges that seems to be due to some weird mpi4py issue, but we never have been able to track it down: https://github.com/choderalab/openmmtools/issues/449

Please do make sure you're using potential energies (not total energies) and try the BAOAB version.

Can you elaborate? I mean, it literally checks whether the ratio of the sampled distributions at different temperatures are consistent with the Boltzmann distribution. If it doesn't pass this check, then the statistical mechanics are, by definition, wrong.

Sure, but there's no meaningful way to choose the threshold for this "check". It's not a binary thing where numbers below the threshold are "right" and numbers above the threshold are "wrong". Unless I missed a statistically meaningful approach to deciding this threshold somewhere?

mrshirts commented 4 years ago

Interesting. Have you compared MultiStateSampler (same infrastructure, no exchanges) with ReplicaExchangeSampler? I'd also compare replica_mixing_scheme="swap-all" with "swap-neighbors" with None in ReplicaExchangeSampler. Finally, are you running replica exchange serially or in parallel via MPI? We know there's a weird bug in exchanges that seems to be due to some weird mpi4py issue, but we never have been able to track it down:

449

@cwalker7 has done some of these things, but not all - @cwalker7 can you see if you can pull together the information on this?

Sure, but there's no meaningful way to choose the threshold for this "check". It's not a binary thing where numbers below the threshold are "right" and numbers above the threshold are "wrong". Unless I missed a statistically meaningful approach to deciding this threshold somewhere?

Ah, I see. So, right now our approach to assume normal error in correct simulations (which seems to be true - we should quantify this more) and look at the chance that deviations are due to random error, as there always will be some statistical deviation for any finite simulation. We then look for things that are off by too many standard deviations. Being consistently 5 standard deviations off is certainly not good.

To do this better, we could certainly do repeated samples and get better statistics. It DOES make it hard to automate for a single yes/no, since a single "off by 2.5 standard deviations" once is probably just a fluctuation, off by 2.5 standard deviations every single time indicates a subtle bug. So not so great for a unit or regression test, since if you set the cutoff too high, then you miss a lot of problems, and if you set it too low, it gets triggered far too frequently. Would be interested to hear thoughts about modifications to this.

cwalker7 commented 4 years ago

Ok thank you very much for the comments! I will work on these additional tests today - so far I have tried the GHMCMove, and there is only marginal improvement.

Interesting. Have you compared MultiStateSampler (same infrastructure, no exchanges) with ReplicaExchangeSampler? I'd also compare replica_mixing_scheme="swap-all" with "swap-neighbors" with None in ReplicaExchangeSampler.

I haven't tried MultiStateSampler yet. One of the above plots shows 'swap-all' vs. 'swap-neighbors' vs None for ReplicaExchangeSampler, and it seems that only 'swap-all' has the issue. I can repeat this test for MultiStateSampler.

Finally, are you running replica exchange serially or in parallel via MPI? We know there's a weird bug in exchanges that seems to be due to some weird mpi4py issue, but we never have been able to track it down:

For these tests I've been running on a single GPU with openmm platform set to CUDA. If I recall, we had the same issue for multiple CPU with MPI, but I haven't tried on a single CPU.

Please do make sure you're using potential energies (not total energies) and try the BAOAB version.

I can confirm that we are using the potential energies in the physical validation (from ReplicaExchangeAnalyzer.read_energies)

We have relatively soft harmonic bonds (I think 1000 kJ/nm^2 for the coarse-grained models systems tested here - @cwalker7 can you confirm?)

Yep - that is correct.

jchodera commented 4 years ago

Ok, this suggests something is going on with the compiled swap-all code. Let me take a look

mrshirts commented 3 years ago

and it seems that only 'swap-all' has the issue.

This is great to know. If the temperature spacing is relatively broad, then "swap-neighbors" is not that much worse than "swap-all". So at minimum we can use this approach (though "swap-neighbors" would be good to use).

jchodera commented 3 years ago

@mrshirts can you also take a look at the swap-all code and see if you spot any logic errors? I presume it must either be a messed up array representation or weird compilation issue

mrshirts commented 3 years ago

Basically looking over the code in _mix_replicas in multistate/replicaechange.py (and what gets called there), correct? Will look over it. Since both use _attempt_swap, my GUESS is that it's something to do with repeated swaps and the state of the system remaining partially in the pre-swap state - it LOOKS correct on first glance, but if it didn't look correct on first glance it probably would have been caught already. I'll keep digging. If we can localize the problem just here, that would be very good.

I wonder it has to do with the fact that the reduced energies may change for a state if they are now "owned" by a different temperature - the energy remains the same, but the reduced energy changes (since it is supposed to have been generated by a different temperature). I have to think through this more (and am being called away now), but that's the hypothesis I'll work through when I get a chance later today.

jchodera commented 3 years ago

We should be storing the replicas x states reduced potentials, which do not change after a swap. Only the assignment of state indices to replicas changes.

jchodera commented 3 years ago

@mrshirts Any chance you've been able to take a look at _mix_replicas()?

@jaimergp : Would you be able to help us sort out what is going on here? Or if there's a simple way we can replace the C extension with an alternative accelerator scheme that may not suffer from this issue?

mrshirts commented 3 years ago

So, we have three states, T0, T1, and T2, with three energies, E0, E1, and E2, where the index on the energy says where they were originally sampled from.

Then the matrix self._energy_thermodynamic_states would be, assuming row/column labels, with [replica, thermodynamic_state]

[ E0/T0 E0/T1 E0/T2 ] [ E1/T0 E1/T1 E1/T2 ] [ E2/T0 E2/T1 E2/T2 ]

So that the E labels correspond to replicas labels, and the T labels correspond to thermodynamic state labels.

If we start with self._replica_thermodynamic_states = [0 1 2]

Then: self._energy_thermodynamic_states[1, self._replica_thermodynamic_states[1]] = E1/T1 self._energy_thermodynamic_states[2, self._replica_thermodynamic_states[1]] = E2/T1 self._energy_thermodynamic_states[1, self._replica_thermodynamic_states[2]] = E1/T2

Let's say we swap states 0 and 1 on the first pass. This is implemented by a swap in self.replica_thermodynamic_states, which results in self._replica_thermodynamic_states = [1 0 2]

The next time an exchange is attempted without moving, then the energies stay the same (mod labeling). Let's say the exchange is between 1 and 2 this time. The probability calculation performed is:

log_p_accept = - (energy_12 + energy_21) + energy_11 + energy_22

Where the energies are reduced energies calculated by:

energy_12 = self._energy_thermodynamic_states[replica_i=1, self._replica_thermodynamic_states[replica_i=2]]. energy_21 = self._energy_thermodynamic_states[replica_i=2, self._replica_thermodynamic_states[replica_i=1]]. energy_11 = self._energy_thermodynamic_states[replica_i=1, self._replica_thermodynamic_states[replica_i=1]]. energy_22 = self._energy_thermodynamic_states[replica_i=2, self._replica_thermodynamic_states[replica_i=2]].

Without a swap, log_p_accept = -(E1/T2 + E2/T1) + E1/T2 + E2/T2, which is correct; the difference in total log probability if we swapped the states.

Since self._replica_thermodynamic_states[1]=0 self._replica_thermodynamic_states[0]=1 and after the swap, this seems to be:

energy_12 = E1/T2 energy_21 = E2/T0 energy_11 = E1/T0 energy_22 = E2/T2

So log_p_accept = -(E1/T1 + E2/T0) + E1/T0 + E2/T2

But that's doesn't seem like the comparison that should be done. It seems like it should be:

log_p_accept = -(E0/T2 + E2/T1) + E0/T1 + E2/T2, where the denominators are the same before the swap, but where we swap E1 in place of E2.

So we should be swapping energies, but it's swapping temperatures. So it seems it should be reassigning replica labels after each swap, not thermodynamic state labels.

Does this make sense? Am I missing something? I'm a little tired, so I'm not quite sure in the context of the objects what the solution would be.

jchodera commented 3 years ago

This logic doesn't sound right to me.

Fundamentally, we're dealing with the unnormalized probability density

q(S; X) = \exp{- \sum_{k=1}^K u_{s_k}(x_k)] }

where S = {s_k} is the permutation of state indices, s_k denotes the state index of replica k, u_s(x) is the reduced potential for snapshot x, and X = {x_k} is the set of snapshots, where x_k denotes the snapshot of replica k.

When we accept or reject a proposal from S -> S', we just need to compute

P_accept = \max\{ 1, q(S' | X) / q(S | X) \}

The expressions we use in the code just implicitly cancel out the terms that do not change in the swap.

I still feel like something weird must be going on in the cython code. Have we been able to try it without that, with the pure python implementation? You can just delete the Cython compiled versions from the Python installation, I think, or change these lines to always use the pure python version.

I've just tested the pure Python implementation with a 3-state test, and it seems to be totally correct empirically, as well as mathematically.

jchodera commented 3 years ago

I've made some progress in creating a numba jit alternative that is just as readable as the original Python version. On 100 states, the timing is

Seems to be worth switching, since it also gets rid of the cython problems we've been having. I'll open a PR.

EDIT: Here's a notebook that illustrates what I've done.

jaimergp commented 3 years ago

Wow, that's insane!

jchodera commented 3 years ago

@cwalker7 : Thanks so much for your patience---it took a while for me to get to this!

Can you try out the fix-mixing branch to see if it resolves this? Note that you'll have to conda install numba first.

jchodera commented 3 years ago

@cwalker7 : I've merged #485 in, so you should give master a try instead.

cwalker7 commented 3 years ago

@jchodera Ok I should have the results later this afternoon - thanks for working on this!

jchodera commented 3 years ago

@cwalker7: Just wondering if this was able to fix things for you! The harmonic oscillator tests seem to be working fine.

cwalker7 commented 3 years ago

This is the result using the numba swap-all. Unfortunately it is about the same deviation from the analytical values as before. Updating the plot above, we see: ensemble_check_langevinmove_step10fs_exch_freq100_coll_rate1_vs_mixing_scheme_10_23_20

I haven't yet tried the pure python version.

jchodera commented 3 years ago

Thanks for the update! I'll continue investigating this.

mrshirts commented 3 years ago

Chris, can you test the pure python version? I can then try a fix (I think I know how to do it - just have to remember how I did it in GROMACS 7-8 years ago . . . )

jchodera commented 3 years ago

@mrshirts : I still don't understand what you think is wrong. I'm happy to implement it if you have can explain it in the mathematical language we used in the paper (see https://github.com/choderalab/openmmtools/issues/480#issuecomment-713301802)

mrshirts commented 3 years ago

Well, it's clearly wrong since it gives the wrong distributions. I believe it's a problem of switches not actually carrying along the correct temperature since reduced potentials are being exchanged instead of potentials - it would only occur when the temperatures are different (should be fine with Hamiltonian exchange all at the same T). I'll code up what I think is wrong since I'm not being clear in my explanations (after Chris verifies the pure python, in case that version actually is right).

jchodera commented 3 years ago

Well, it's clearly wrong since it gives the wrong distributions.

Agreed that this seems to be an issue, though I'd love to have an independent quick test that reproduces the problem.

I believe it's a problem of switches not actually carrying along the correct temperature since reduced potentials are being exchanged instead of potentials - it would only occur when the temperatures are different (should be fine with Hamiltonian exchange all at the same T).

That makes absolutely no sense to me because we don't store or operate on potentials---we store and operate on reduced potentials, which are U/(kT).

jchodera commented 3 years ago

I can implement a parallel tempering variant of our tests and see if I can reproduce the issue that way.

mrshirts commented 3 years ago

I can implement a parallel tempering variant of our tests and see if I can reproduce the issue that way.

What are the tests that you currently run? If you only have a Hamiltonian exchange test, then you definitely want a parallel tempering version as well.

mrshirts commented 3 years ago

That makes absolutely no sense to me because we don't store or operate on potentials---we store and operate on reduced potentials, which are U/(kT).

I'm claiming operating on reduced potentials doesn't actually swap temperatures when you do multiple swaps. I realize I explained it poorly, I just need to carve out the time to replicate the GROMACS code. But spending too much time recording lectures for general chemistry.

jchodera commented 3 years ago

What are the tests that you currently run? If you only have a Hamiltonian exchange test, then you definitely want a parallel tempering version as well.

The multistate sampling tests are all here! https://github.com/choderalab/openmmtools/blob/master/openmmtools/tests/test_sampling.py

Check out the TestParallelTempering.

I'm claiming operating on reduced potentials doesn't actually swap temperatures when you do multiple swaps. I realize I explained it poorly, I just need to carve out the time to replicate the GROMACS code. But spending too much time recording lectures for general chemistry.

Swapping the elements of the state indices permutation vector S associated with each replica does have the effect of swapping temperatures when you do multiple swaps.

If you start with S = [0, 1, 2, 3] and then swap to get S = [1, 0, 2, 3] and swap again to get S = [1, 3, 2, 0], the appropriate "temperatures" are still used when extracting from the reduced potential u_kl matrix, where k is the replica and l is the state index at which the reduced potential is evaluated, since u_kl = \beta_l U(x_k).

cwalker7 commented 3 years ago

Pure python version of 'swap-all' gives pretty much the same result as the numba and cython: ensemble_check_langevinmove_step10fs_exch_freq100_coll_rate1_vs_mixing_scheme_10_26_20

mrshirts commented 3 years ago

Pure python version of 'swap-all' gives pretty much the same result as the numba and cython:

Perfect, confirms the issue is algorithmic and not a coding bug. I will work on this in pure python version when I can carve out time over the next couple of days.

Lnaden commented 3 years ago

Quick question since its algorithmic. If I recall correctly the swap-all algorithm is based on the REPEX Independence Sampling algorithm presented in this paper which states (Section III B 2)

For each step of the MCMC sampler, we pick a pair of state indices (i, j), with i != j, uniformly from the set {1, . . . , K}. The state pair associated with the configurations xi and xj are swapped with the same replica exchange Metropolis-like criteria...

One of the large parts of that i != j. In all of the _mix_all_replicas code, the random i and j are chosen by uniform distributions over the n_replicas range, which would permit i = j swap attempts. Am I stating the correct algorithm and would that cause the this large of population deviations?

jchodera commented 3 years ago

One of the large parts of that i != j. In all of the _mix_all_replicas code, the random i and j are chosen by uniform distributions over the n_replicas range, which would permit i = j swap attempts. Am I stating the correct algorithm and would that cause the this large of population deviations?

If you pick i = j to swap, the replica permutation matrix S will be unchanged, so the MCMC operation leaves the probability density p(S | X) unaffected. This should not cause any population deviations---as long as you have MCMC operations that also sample from the true equilibrium p(S | X) mixed in as well.

I'm very puzzled by this, since we just need to sample from p(S | X), and the _attempt_swap() should do this...

jchodera commented 3 years ago

@cwalker7 : Are you using ReplicaExchangeSampler or ParallelTemperingSampler here? I'm wondering if our special-purpose code in ParallelTemperingSampler that more efficiently computes the reduced potentials here may be to blame.

mrshirts commented 3 years ago

I'm pretty sure it's still ReplicaExchangeSampler. Though maybe Chris should try ParallelTemperingSampler just to check - maybe that fixes things :)

cwalker7 commented 3 years ago

Yeah I have been using ReplicaExchangeSampler. I agree it's worth checking the ParallelTemperingSampler.

cwalker7 commented 3 years ago

@jchodera Seems like ParallelTemperingSampler works! This should definitely narrow things down then: ensemble_check_langevinmove_step10fs_exch_freq100_coll_rate1_vs_mixing_scheme_10_27_20 It is also about 13% faster than ReplicaExchangeSampler this case.

jchodera commented 3 years ago

That's odd. Ok, @cwalker7, can you point me to the code that sets up, runs, and analyzes these simulations?

cwalker7 commented 3 years ago

Actually I realized that the temperatures generated from setting the min_temperature, max_temperature in ParallelTemperingSampler do not match the range I provided manually for ReplicaExchangeSampler.

For the ReplicaExchangeSampler I did something like this, where temperature list goes from 200 to 300K over 12 replicas:

for temperature in temperature_list:
    thermodynamic_state = openmmtools.states.ThermodynamicState(
        system=system, temperature=temperature
    )
    thermodynamic_states.append(thermodynamic_state)
    sampler_states.append(
        openmmtools.states.SamplerState(positions)
    )

move = openmmtools.mcmc.LangevinDynamicsMove(
    timestep=simulation_time_step,
    collision_rate=1.0/unit.picosecond,
    n_steps=exchange_frequency,
    reassign_velocities=False,
)

simulation = ReplicaExchangeSampler(mcmc_moves=move, number_of_iterations=exchange_attempts,replica_mixing_scheme='swap-all')

reporter = MultiStateReporter("output.nc", checkpoint_interval=1)
simulation.create(thermodynamic_states, sampler_states, reporter)

For ParallelTemperSampler I am doing this:

for temperature in temperature_list:
    thermodynamic_state = openmmtools.states.ThermodynamicState(
        system=system, temperature=temperature
    )
    thermodynamic_states.append(thermodynamic_state)
    sampler_states.append(
        openmmtools.states.SamplerState(positions)
    )

move = openmmtools.mcmc.LangevinDynamicsMove(
    timestep=simulation_time_step,
    collision_rate=friction,
    n_steps=exchange_frequency,
    reassign_velocities=False,
)

simulation = ParallelTemperingSampler(
    mcmc_moves=move,
    number_of_iterations=exchange_attempts,
    replica_mixing_scheme='swap-all'
    )

reporter = MultiStateReporter(output_data, checkpoint_interval=1)
simulation.create(
    thermodynamic_states[0],
    openmmtools.states.SamplerState(positions),
    reporter,
    min_temperature=temperature_list[0],
    max_temperature=temperature_list[-1],
    n_temperatures=num_replicas,
    )

Somehow, the temperatures in the ParallelTemperingSampler come out to be [163.2, 165.1, 167.1, 169.3, 171.7, 174.3, 177.1, 180.2, 183.5, 187.1, 191.1, 195.3], instead of the expected [200, 207.5, 215.3, 223.4, 231.8, 240.5, 249.5, 258.9, 268.6, 278.7, 289.1, 300].

Can you see what I am doing wrong?

cwalker7 commented 3 years ago

Going into the parallel tempering code, it actually does return 163.2 to 195.3 given a min of 200 and max of 300. https://github.com/choderalab/openmmtools/blob/master/openmmtools/multistate/paralleltempering.py#L157-L159.

I though it may be this:

temperatures = [min_temperature + (max_temperature - min_temperature) *
                            (math.exp(i / (n_temperatures-1)) - 1.0) / (math.e - 1.0) for i in range(n_temperatures)]

But it does not agree with the result from the np.logspace() function.

jchodera commented 3 years ago

Can you share your complete code, @cwalker7?

cwalker7 commented 3 years ago

Ok @jchodera , so I've added the files used for the above tests to the cg_openmm repo, a python package we are working on for modeling coarse-grained foldamers: https://github.com/shirtsgroup/cg_openmm.

The replica exchange simulation setup is located here: https://github.com/shirtsgroup/cg_openmm/blob/master/cg_openmm/simulation/rep_exch.py#L434. This currently uses the 'swap-neighbor' setting.

To run the replica exchange physical validation example, first you will need to install some dependencies listed here: https://github.com/shirtsgroup/cg_openmm/blob/master/devtools/conda-envs/test_env.yaml (sorry we don't have better documentation for this part yet, but this should work with 'conda env create -f test_env.yml')

Then install cg_openmm with 'python setup.py install'.

The physical validation example is located in https://github.com/shirtsgroup/cg_openmm/tree/master/examples/physical_validation. There is a bash script which runs the replica exchange simulation, processes the data in the .nc files, and does the physical validation ensemble check.

cwalker7 commented 3 years ago

@mrshirts @jchodera Here's a separate branch where I've re-tested the ParallelTemperingSampler with 'swap-all'. https://github.com/shirtsgroup/cg_openmm/blob/parallel_temper/cg_openmm/simulation/rep_exch.py#L434 This time, I've specified the temperature list explicitly so we can compare directly with the results for ReplicaExchangeSampler. Seems like there is once again no difference. I can post a separate issue for the automatic temperature spacing. ensemble_check_langevinmove_step10fs_exch_freq100_coll_rate1_vs_mixing_scheme_10_27_20

jchodera commented 3 years ago

I've fixed the temperature initialization for ParallelTemperingSampler in #148 Still looking into the other issues.

asilveirastx commented 3 years ago

Could it not be the case that MD would require an "equilibration" stage if the changes in temperature are too large? Maybe you could try specifying longer MD sim times between swaps and discard a fraction of steps when performing the check ensemble test.

jchodera commented 3 years ago

@asilveirastx:

Could it not be the case that MD would require an "equilibration" stage if the changes in temperature are too large? Maybe you could try specifying longer MD sim times between swaps and discard a fraction of steps when performing the check ensemble test.

The only equilibration that would be necessary is at the very initial burn-in phase. Once the replicas reach equilibrium, it doesn't matter if we take 1 step or 10^6 steps in between the exchanges---the exchange attempts should preserve the equilibrium ensembles.

asilveirastx commented 3 years ago

The only equilibration that would be necessary is at the very initial burn-in phase. Once the replicas reach equilibrium, it doesn't matter if we take 1 step or 10^6 steps in between the exchanges---the exchange attempts should preserve the equilibrium ensembles.

I think this is 100% true when you use MCMC, but there is no formal proof that this is the case for molecular dynamics. Openmmtools has an implementation of Hybrid Monte Carlo so using HMC instead of MD could also help in discarding any issue with MD.