RATTLE Implementation - Githubissues

ejmeitz commented 1 year ago

Implemented iterative scheme for RATTLE....completely untested but I have a many clarifying questions/thoughts.

Its unclear to me if these implementations work with all integrators. The derivations all use VV. I don't think other MD codes limit you to this but I don't really see why these methods are generally applicable (they definitely shouldn't work for NVT & NPT out of the box)
- Related to this I'm not sure if RATTLE should be called after half-step updates or before or on both. I more or less get the math but not how these constraints should be integrated into a real MD code.
I think the whole apply_constraints! interface will need to change as RATTLE and SHAKE have to be applied at different points in the integration process. That said not sure how to fix that at the moment in a pretty way.
What does your SHAKE implementation follow? I used this for RATTLE which has the SHAKE steps in it also. I've never seen that quadratic formula to find g in any paper. Granted every paper I read does something totally different.
LAMMPS docs have me a little confused. They say they solve RATTLE analytically (and the source matches that statement) but they also said they follow the Andersen paper which solves it iteratively. I think analytically would be better but not really clear how that is done.

codecov[bot] commented 1 year ago

Codecov Report

Attention: Patch coverage is 92.45283% with 16 lines in your changes are missing coverage. Please review.

Project coverage is 72.44%. Comparing base (744f11b) to head (f15637d). Report is 2 commits behind head on master.

Files	Patch %	Lines
src/types.jl	47.82%	12 Missing :warning:
src/constraints/constraints.jl	97.18%	2 Missing :warning:
src/constraints/shake.jl	97.46%	2 Missing :warning:

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #132 +/- ## ========================================== + Coverage 72.18% 72.44% +0.26% ========================================== Files 35 36 +1 Lines 5278 5444 +166 ========================================== + Hits 3810 3944 +134 - Misses 1468 1500 +32 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

jgreener64 commented 1 year ago

I more or less get the math but not how these constraints should be integrated into a real MD code.

I agree this is not clear and much of implementing this correctly will be calling the constraint functions at the correct point. See for instance OpenMM, which has examples of where to call the constraints for Verlet and Langevin dynamics. There is also a Julia RATTLE implementation here but I don't know how good it is.

I think the whole apply_constraints! interface will need to change as RATTLE and SHAKE have to be applied at different points in the integration process.

Yes I think we might want to switch to constrain_coords! and constrain_velocities! and call them at the appropriate points. Whether this transfers to other constraint algorithms remains to be seen, but it should be enough for SHAKE/RATTLE.

What does your SHAKE implementation follow?

I think it follows LAMMPS, you might get more from the original PRs (#82 and #84).

LAMMPS docs have me a little confused.

Look in the source code if possible?

ejmeitz commented 1 year ago

Just some notes from LAMMPS SHAKE source:

Unconstrainted position update "assumes NVE update, seems to be accurate enough for NVT,NPT,NPH as well" Line 1685
Shake bond constraint line 1774 (our existing code is basically a copy of this)
2 & 3 bond clusters lines 1877 & 2051
2 bonds 1 angle line 2303

ejmeitz commented 1 year ago

Useful threads from LAMMPS

Things to remember:

Only works when positions and velocities use the VV update step. So something like NoseHoover is still safe, with NPT the order matters a lot
Dont use during minimization
No 180 deg molecules

ejmeitz commented 1 year ago

The replica system constructor will not parse the constraints correctly as it does not actually call the System constructor as a function rather it specifies all of the types explicitly.

ejmeitz commented 1 year ago

Anyone have any clue why when I simulate hydrogen gas the rotational DoF suddenly freeze? Basically the first 200 timesteps look normal then rotations completely stop and the molecules just move around the box constrained interacting as if they cannot rotate. When I simulate the same thing in LAMMPS the H2 molecules spin like crazy.

I'm not really convinced this is an artifact of my SHAKE implementation but I don't know what else could cause it.

jgreener64 commented 1 year ago

Could you give code to reproduce the issue?

It might also be worth comparing to the same simulation without constraints, i.e. with a strong harmonic bond and a small time step. If that simulation doesn't show the issue then it is probably something to do with the constraints.

ejmeitz commented 1 year ago

This is the input I am using. I uploaded the xyz file for this with 1 molecule (where you can also see what I am talking about). In the 1 molecule case the two atoms do not interact with each other so this rotation is just from the initial velocity. It might look like the molecule does not translate from the initial velocity but I think they do and its just small. The weirder thing to me is that the particles stop rotating and moving. After looking at all the output I noticed that the velocity vectors eventually end up pointing towards the COM of the molecule effectively canceling each other out. Why this is the case I have no idea. I tested the exact same system in LAMMPS with SHAKE and this does not happen. I plan to test the original implementation of SHAKE in Molly (I have only changed how it is initialized not the algorithm itself) to see if this phenomena occurs as well and also the harmonic bonds.

Edit: Tried with just Verlet algorithm as well and the same issue pops up. Edit: Tried with version of SHAKE in Molly atm and the same thing happens. In the code from the tests the particles are much closer so they interact and keep rotating, but if I change it to have the parameters below then all rotationg stops because the gas is so diffuse. I'm not convinced this is the correct behavior.

pos.zip


r_cut = 8.5u"Å"
temp = 300.0u"K"

#Initialize atoms (0.0252259036145 molecules / nm^3 typical for H2 gas at STP)
n_atoms_half = 200
atom_mass = 1.0u"u"
atoms = [Atom(index = i, mass=atom_mass, σ=2.928u"Å", ϵ=0.074u"kcal* mol^-1") for i in 1:n_atoms_half]
max_coord = 200.0u"Å" # nm
coords = [max_coord .* rand(SVector{3}) for i in 1:n_atoms_half]

#Add bonded atoms
bond_length = 0.74u"Å" #hydrogen bond length
constraints = []
for j in range(1, n_atoms_half)
    push!(atoms, Atom(index = j + n_atoms_half, mass = atom_mass, σ=2.928u"Å", ϵ=0.074u"kcal* mol^-1"))
    push!(coords, coords[j] .+ SVector(bond_length,0.0u"Å",0.0u"Å"))
    push!(constraints, DistanceConstraint(SVector(j, j+n_atoms_half), bond_length))
end

constraint_algorithm = SHAKE(similar(coords))

velocities = [random_velocity(atom_mass, temp) for i in 1:length(atoms)]
# velocities = [[0.0u"nm/ps", 0.0u"nm/ps", 0.0u"nm/ps"] for i in 1:length(atoms)]
boundary = CubicBoundary(max_coord)

neighbor_finder = DistanceNeighborFinder(eligible = trues(length(atoms),length(atoms)), dist_cutoff = 1.5*r_cut)

sys = System(
        atoms = atoms,
        coords = coords,
        boundary = boundary,
        velocities=velocities,
        pairwise_inters=(
            LennardJones(
                cutoff = ShiftedPotentialCutoff(r_cut),
                use_neighbors = true,
                energy_units = u"kcal * mol^-1",
                force_units = u"kcal * mol^-1 * Å^-1"
                ),
            ),
        neighbor_finder = neighbor_finder,
        constraints = constraints,
        constraint_algorithm = constraint_algorithm,
        loggers=( 
            tot_eng = TotalEnergyLogger(1),
            coords_out = CoordinateLogger(1),
            vels_out = VelocityLogger(1)
            ),
        energy_units = u"kcal * mol^-1",
        force_units = u"kcal * mol^-1 * Å^-1"
        )

simulator = VelocityVerlet(dt = 0.002u"ps")
simulate!(sys, simulator, 100_000, n_threads = 1)

ejmeitz commented 1 year ago

I've found the issue, our original implementation of SHAKE was only modifying the positions when it needed to be modifying the forces and it was doing so in the incorrect place anyways.

There is still the units issue I messaged you on Slack Joe, but for documentation I will explain here as well. In the working version I modify the accelerations which have some bizarre units kcal Å^-1 mol^-1 u^-1 but the acceleration I calculate inside of SHAKE has units of Å/ps^2. Converting from one to the other just requires a factor of 418.6 and to realize that 1 u is 1 g/mol. I did this conversion manually to check that everything works (it appears to) but Unitful is unable to handle this conversion. Likely because the original acceleration could not cancel the molar part.

jgreener64 commented 10 months ago

I am having a look at this but it is a large and complex change and I am not so familiar with the algorithms. There are a couple of thoughts I have so far:

The approach of modifying the accelerations rather than the coordinates/velocities may be limiting. I guess this is how LAMMPS does it but it probably means more maths to get it working for other simulators. Does this make it harder to support things like half-step velocities and user-defined simulators? We would want to support Langevin with this PR, for example. An approach that applies constraints to coordinates or velocities with two functions would allow those functions to be called at the end of the integration step, or at least in a more understandable way. It might be worth seeing what other software does, for example the OpenMM custom integrator interface which is compatible with constraints.
The docs describe compatibility with SHAKE and RATTLE per simulator but I'm not sure this is the way it should be presented. As I understand it SHAKE is sufficient when velocities are not directly used, and if velocities are used then both SHAKE and RATTLE are required. So the user doesn't need to know which simulator works with which, but rather whether that simulator supports correct constraints or not. From my understanding all do here all apart from Langevin and LangevinSplitting (MetropolisMonteCarlo and SteepestDescentMinimizer wouldn't tend to use constraints).
It would be nice to see the performance of constraints. A test of the implementation and of performance would be to constrain all bonds to hydrogen in a simulation of a protein similar to that in the docs.

Let me know what you think. There's a lot of good work in here but I want to make sure it's a flexible and accurate approach to constraints. We shouldn't require too much caution on behalf of the user when using constraints, which basically all biomolecular simulations will do.

ejmeitz commented 10 months ago

I believe you have to modify the forces so that the forces used in the velocity update also satisfy the constrained equations of motion. If you only update the positions the forces used to update velocities will be different from the forces used to update the positions (for F(t) and F(t+dt)). I will check the other codes to see what order they do things in, but based on the derivation I did awhile ago the Lagrange multiplier you derive modifies the force which indirectly modifies the positions. I'm not sure why this would limit other algorithms from having constraints. As long as the scheme integrates position SHAKE will work and as long as the scheme integrates velocitie RATTLE will work. So from what I understand Langevin should support SHAKE (pretty sure I coded that) but not RATTLE as the velocities in Langevin are not actually integrated rather interpolated/extrapolated.
This is related to the last few sentences of the last answer. RATTLE only works on schemes that have a velocity-verlet like velocity integration step. At least that is my understanding. This was a question I never really found an exact answer to though. Like LAMMPS lets you use RATTLE/SHAKE on integrators the original papers would suggest should not work. If you know anyone with expertise here that could be helpful.
Completely agree, I plan on doing a bunch of tests and finishing this after my qualifying exam next week :)

What kind of interface are you looking for with constraints? From the user POV they just choose SHAKE/RATTLE and define the distance constraints and the dispatch handles the rest.

Also I have derivations of SHAKE and RATTLE fully in LaTeX that do the steps skipped in most papers if you would like to see those.

jgreener64 commented 10 months ago

Okay I get it a bit more now. I think I am getting confused about why the choice of SHAKE or RATTLE is given to the user. Each integrator should either be correct with SHAKE (if the velocities are derived) or RATTLE (if they are integrated). Is that correct? So perhaps we could call the constraint method SHAKERATTLE or similar, and just apply it appropriately as you are in the integrators.

By interface I guess I mean docs about how to add constraints to new simulators. From your implementation I see the following two rules, does this sound correct?

Apply constraints to the accelerations after they are computed each step.
If the velocity update uses the stored accelerations from the last step, you also need to apply the velocity constraints after the velocity update.

That makes me think of one question, which is related to my mentioning of Langevin before: do the Langevin velocities need to be corrected to satisfy constraints after the noise is added? At the minute the coordinate update with noised velocities will lead to the coordinates violating constraints. This might be worth comparing to other software.

It would be useful to upload the derivations here if you have them, e.g. as PDF, it will be a useful reference.

I plan on doing a bunch of tests and finishing this after my qualifying exam next week

Great, good luck.

ejmeitz commented 9 months ago

I think it is totally reasonable to combine SHAKE & RATTLE into one user-facing class, but we have to be sure to say that if the integrator does not support rattle then velocity constraints will not be done.

Yep this all sounds correct.

Apply constraints to the accelerations after they are computed each step. *If the velocity update uses the stored accelerations from the last step, you also need to apply the velocity constraints after the velocity update.

There is one extra caveat that I don't know how to handle yet. When you apply constraints they have to be the last thing that modifies the accelerations in that step. So as it stands constraints will not play nice with the recalculate_forces function. I was not super sure how to fix this since the recalculate_forces is currently at the end of the update.

Langevin works with SHAKE as is and never will with RATTLE. The randomness is applied to the velocities at t + dt. These velocities affect the position/force update of the next iteration and then the constraint forces are calculated from that. So I think this should work as the randomness is added to the forces before the constraints are calcualted. I ran a simulation of a bunch of H2 molecules w/ Langevin and the constraints were satisfied. The temperature history is below which doesnt seem like the best thermostating but it does get the mean right and I'm not sure what else to compare to here.

7 and 8 in this document have the SHAKE & RATTLE derivations my_derivations.pdf

jgreener64 commented 9 months ago

So as it stands constraints will not play nice with the recalculate_forces function.

I think if you add a apply_position_constraints! line as follows

        if recompute_forces
            accels_t = accelerations(sys, neighbors; n_threads=n_threads)
            # Add apply_position_constraints!
        else
            accels_t = accels_t_dt
        end

then this is okay. The forces already calculated this step have been constrained, used and discarded, and these recomputed forces are constrained ready for the next step.

These velocities affect the position/force update of the next iteration and then the constraint forces are calculated from that.

But the noised velocities are used immediately after in the same step to update the coordinates. So those coordinates will not necessarily have constraints satisfied. I don't know how much of a problem this is.

Thanks for the derivations.

ejmeitz commented 9 months ago

So those coordinates will not necessarily have constraints satisfied. I don't know how much of a problem this is.

Hm yeah I'll have to look into the consequences of this. To be honest I'm not sure why the constraints are still satisfied the way its coded at the moment.

In LAMMPS Langevin is implemented as only a modification to the forces:

It only modifies forces to effect thermostatting. Thus you must use a separate time integration fix, like fix nve to actually update the velocities and positions of atoms using the modified forces.

which would work fine with the constraint framework as you just calculate the constraint forces after the Langevin forces. I might re-write Langevin so that this is the case.

I can re-write Langevin to do something like the scheme in section 3 here. Eqn 10 forces a Verlet update but you could add Langevin dynamics to any integrator with Eqn 9 in that document. Basically f(t) comes from the chosen integrator and the random forces + damping are added on top. This is what LAMMPS does and would work in the constraint framework. The two options I like best are:

Re-write Langevin to force a Verlet update
Separate Langevin as an add-on that can be used with any integrator (not sure how this should look)

ejmeitz commented 7 months ago

Todo list:

[x] Get constraints to play nice with recompute_forces as outlined above
[x] Rotation issue
[x] Fix Langevin constraints so they match OpenMM docs
[ ] Thorough external testing, comparisons to LAMMPS and OpenMM
[ ] Thorough internal test coverage
[x] Docs/Doc strings
[x] Change interface to combine SHAKE and RATTLE into one (still not sold on this)
[ ] Constraint logger (maybe do in later PR)

Things left for a future PR(s):

Multi-atomic distance constraints (framework is already there for SHAKE/RATTLE)
Angle constraints
- SETTLE
Other constraint algorithms (LINCS, CCMA)

jgreener64 commented 4 months ago

A few general comments:

In VelocityVerlet/Langevin/NoseHoover it might be possible to keep the simulators clean by combining the three steps reset_vel_correction!, apply_position_constraints! and addto_vel_correction!. Since these always occur together they could be one function, e.g. a version of apply_position_constraints! with a flag to indicate that it should do the relevant storage.
I think we should just keep the constraints tuple in System, rather than having a separate constraint_algorithm field. For instance constraints could be (SHAKE_RATTLE(...),) where the atom information is given to SHAKE_RATTLE. This means it is possible to use different constraint algorithms in the same simulation if desired. The presence of constraints can be checked by checking for the empty tuple or nothing.
What is the status of the tests passing and the commented out tests?

ejmeitz commented 4 months ago

I'll look into implementing the first two points, shouldn't be too hard.
I forgot to uncomment the tests, I manually ran all the simulators locally for hydrogen and they work well. The SHAKE code is identical to before so I'd be really surprised if that breaks. I think the only extra test that should be written is an addition to the poly-atomic test to also check RATTLE constraints.
And of course NoseHoover is broken now that we constrain positions instead of forces. I've tried a couple permutations of how the constraints are applied. The OpenMM integrator is hard to learn much from since it is the more advanced form and also integrates at half steps whereas I originally coded NoseHoover to operate like VelocityVerlet.
There is a protein test but I just have it constraint hydrogen bonds right now, I do not think it is super relevant to testing the code and will probably remove it until we have more complicated constraints.

ejmeitz commented 4 months ago

For the first point I think I'd have to dispatch on which simulator is being used, so ultimately will be more code generated but the simulate! function would stay cleaner
Working on this. Is there a reason to not use a class to represent no constraints vs. nothing? Also every constraint algo uses different internal storage, long term if we think having multiple algorithms active is useful (probably is) we should make this storage shared.
Thoughts on not supporting NoseHoover for this first pass? That integrator should realistically be updated to NoseHooverChain at some point anyway.

jgreener64 commented 4 months ago

Is there a reason to not use a class to represent no constraints vs. nothing

I guess it could be a type, but I see it a bit like the interactions where the default is () and whatever tuple is given is iterated over. So maybe () is a better default than nothing. If it is a type then we would have to allow a type or tuple of types for multiple sets of constraints, which might get confusing.

Thoughts on not supporting NoseHoover for this first pass?

Fine by me.

ejmeitz commented 4 months ago

Tests passed locally, just to summarize this PR.

New Features:

Rattle implemented for VelocityVerlet
Velocity corrections added to all integrators that use SHAKE
Unified RATTLE/SHAKE interface into one class
Setup constraint framework so that it will be easy to add more complicated algorithms
A function to calculate the # of DoF lost by using SHAKE, this correction is applied when calculating tempearture

Things to double check before merging:

Energy conservation with constraints, maybe do a parity check with OpenMM for Langevin
Check DoF function, I am not sure it is generic

What I was unable to achieve:

Constraints for NoseHoover, Overdamped Langevin, or LangevinSplitting. I think each of these should be possible to implement they just are not now.
Angle constraints (should be a relatively easy new PR)
Faster version of SHAKE that can be parallelized (also left to another PR)

jgreener64 commented 4 months ago

Nice. I'll try and review this properly over the next few days.

From an initial pass, could you remove all commented-out code and anything that is just setting up for later.

Also the field in System can remain as constraints, and I don't think the eligible matrix should be updated automatically for constraints (we leave it to the user to setup elsewhere for clarity).

ejmeitz commented 4 months ago

Why not do the eligible matrix automatically? It is something that HAS to be done when you use constraints. I don't really see a scenario where having that be automatic would break things?

jgreener64 commented 4 months ago

For the standard molecular case pairwise interactions are excluded for constrained atoms, but there are more general simulation cases where people may want other behaviour.

ejmeitz commented 4 months ago

Can I add a flag for this? I think by default we should do this, it is what I would expect the code to do and also a far more common use case imo. (then again Ive never simulated macromolecules).

I guess I dont see how even with many body interactions you wouldnt want this behavior. Arent intra molecular forces usually off?

ejmeitz commented 4 months ago

I removed the automatic update to the eligible list, but I've left the function to disable the interactions given a list of constraints. I also updated the docs code to reflect that this should be done with the code I put there.

ejmeitz commented 4 months ago

Note to self, add test for Overdamped Langevin

jgreener64 commented 3 months ago

Nice one, thanks :tada:

JuliaMolSim / Molly.jl

RATTLE Implementation #132

Codecov Report

Todo list:

Things left for a future PR(s):