Numerical instability at 4 fs with the LangevinMiddleIntegrator

msuruzhon commented 2 years ago

Hello,

I have been testing the LangevinMiddleIntegrator functionality in Sire by trying to run 4 fs simulations with HMR but I have been running into numerical instabilities both on CPU and CUDA (no such instabilities occur at 2 fs). @jmichel80 @annamherz is this something you are observing during your benchmarks as well? Interestingly, I only see the error when there is a protein, so I am not sure what is going wrong. I tried running a simulation with trajectory output at each step but nothing is obviously wrong at the last timestep before the crash (files attached here).

I also tried it by not perturbing constrained bonds (as far as I know Sire removes the constraints together if that's the case), but I still observed these errors. This is a strange problem, since many of the OpenMM benchmarks are done at this timestep (even 5 fs), leading me to think that this could be related to the alchemical part of the code?

P.S. I just realised that coulomb power is set to 0 by default. This is unlikely to be related to the problem because I also observe the instabilities at lambda=0 and lambda=1. In any case, do you have any recommendations for soft-core parameters that can be used with a unified protocol in SOMD?

Many thanks.

lohedges commented 2 years ago

Are you using your own HMR routine, or the one built in to BioSimSpace? (It looks like you aren't setting HMR in the SOMD config file, which is the correct thing to do if you are adjusting the masses in the topology prior to setup.) Could you share the original inputs (pre-HMR) so that we can check that the masses are being adjusted correctly. The BioSimSpace HMR code should account for perturbable systems, i.e. both end states have their masses repartitioned.

msuruzhon commented 2 years ago

Thanks @lohedges I am using the HMR routine from ParmEd but the implementation should be the same between both packages. I just tested the BioSimSpace function and I get the same masses as ParmEd if I use a factor of 3.00015774639 (accounting for the difference in the hydrogen mass precision between ParmEd and BioSimSpace). I have attached my input files here, where the system was first prepared with bss_prep.py and afterwards run with bss_run.py. So it seems that the problem is not in the masses?

lohedges commented 2 years ago

I wouldn't have thought so. I'll check at my end tomorrow. I'll also take a look at the SOMD code to see if it does anything else if HMR is enabled.

Cheers.

lohedges commented 2 years ago

I've just taken a quick look and, as you say, everything looks fine with the masses. I also checked the HMR routine in OpenMMMD.py and nothing extra is being set once the masses are adjusted.

jmichel80 commented 2 years ago

It would be useful to ask @halx if he has particular suggestions to ensure stable SOMD simulations with 4 fs+HMR. As a sanity check, could you repeat tests at 4 fs with constraint = allbonds ?

msuruzhon commented 2 years ago

Thanks @jmichel80, strangely constraint = allbonds crashes now even at 2 fs, and I am not sure why.

jmichel80 commented 2 years ago

can you share more details about the pre-SOMD equilibration protocol you are using ?

msuruzhon commented 2 years ago

I am using a defensive restrained protocol with OpenMM that runs fine at 4 fs with HMR. However, before running into these issues with SOMD, I pre-minimise the structure in SOMD as well, so there should be no instabilities arising from this either way.

annamherz commented 2 years ago

Hello! Sorry for taking so long to reply, I've also been having issues with SOMD crashing w all runs (2 and 4 fs) and am still trying to figure those out before testing the integrator. I don't know if they're the same issues you are experiencing @msuruzhon ?

lohedges commented 2 years ago

I'm beginning to wonder if this has anything to do with this, or perhaps the OpenMM atom re-ordering issue?

What do the trajectory files look like?

annamherz commented 2 years ago

I think it is to a certain extent - if I use the recently generated input files with the 2021 sire, the crash is related to the input files having the ligands at the end rather than at the start:

Exception 'SireError::invalid_key' thrown by the thread 'master:main'.
There is no perturbations template for the molecule with name THR available templates are [ LIG ].
Thrown from FILE: /home/sireuser/Sire/corelib/src/libs/SireIO/perturbationslibrary.cpp, LINE: 1171, FUNCTION: SireMol::Molecule SireIO::PerturbationsLibrary::applyTemplate(const SireMol::Molecule&) const

However, if I use the older input files (that have the ligand at the first position) and try to run them with the 2022 version of Sire, I get the following (which I think is unrelated to the atom reordering?):

  File "/home/anna/sire.app/share/Sire/scripts/somd-freenrg.py", line 146, in <module>
    OpenMMMD.runFreeNrg(params)
  File "/home/anna/sire.app/lib/python3.8/site-packages/Sire/Tools/__init__.py", line 176, in inner
    retval = func()
  File "/home/anna/sire.app/lib/python3.8/site-packages/Sire/Tools/OpenMMMD.py", line 1662, in runFreeNrg
    system = integrator.minimiseEnergy(system, minimise_tol.val, minimise_max_iter.val)
RuntimeError: Error loading CUDA module: CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)

lohedges commented 2 years ago

For the first exception you list, is this just from downgrading Sire, or BioSimSpace too? It seems related to this issue, which was added in the 2022.1.0 release of BioSimSpace.

For the second, this might be a result of compiling (or recompiling) Sire using an old or new version of OpenMM without changing the CUDA toolkit version. (The new conda-forge version pulls in cudatoolkit). It doesn't seem related to the input files, rather the CUDA configuration / drivers.

msuruzhon commented 2 years ago

Thanks @annamherz and @lohedges I think we might have different issues because with normal hydrogen constraints I get stable simulations at 2 fs and only get instabilities at 4 fs in the bound legs for some reason.

annamherz commented 2 years ago

For the first exception you list, is this just from downgrading Sire, or BioSimSpace too? It seems related to this issue, which was added in the 2022.1.0 release of BioSimSpace.

It was from just downgrading to older sire and using the current amber bss branch to setup the inputs. I got the first exception when using the new input files with the most recent sire also. Not sure yet if it is maybe a result of how I prepare the system for FEP, or the amber branch itself, that leads to the perturbed molecule being at the end of the input files, but I've done some tweaks in the amber branch to incl the perturbed residue number in the somd.cfg file and that seems to be working for now!

Thanks @annamherz and @lohedges I think we might have different issues because with normal hydrogen constraints I get stable simulations at 2 fs and only get instabilities at 4 fs in the bound legs for some reason.

With all the other stuff working now I'll try running some at 4fs and let you know if I also get instabilities.

annamherz commented 2 years ago

Hello!

I have attached my input files here

I was unsuccessful with running the transformation with these input files, but when using other input files have been able to run transformations using the langevinmiddle integrator without issue. My cfg files looked something like this . The pre equilibrated input files I used are here and I ran ejm55 to 54, 54 to 42 and 42 to 55. Using the inputs from the open forcefield benchmark set also worked, although the protein pdb required significant manual editing in order to be able to run in BSS. These input files are here (ligands only are not equilibrated as I only tested the bound leg to see if it crashes).

msuruzhon commented 2 years ago

Many thanks for that @annamherz . When you say "manual editing" of the PDB file, what does this actually entail? Because I have had PDB file issues with the other systems from the test set, so this might be related to that.

annamherz commented 2 years ago

When you say "manual editing" of the PDB file, what does this actually entail?

So initially when I loaded the protein into BSS, preparing it with the forcefield failed. Instead I loaded the protein into tleap with ff14SB and tried saving it as prm7 and rst7 files. This also failed as expected, but I went through the error messages and deleted all of the atoms that did not have a type, which were mainly H (HG1,HB1,HE1,HD1,HA1,HD2,HD3 and some H in ACE). I then renamed all CD in ILE to CD1 as CD does not have a type in ILE either. This was then able to be paramaterised in tleap without issues, and I then loaded the prm7 and rst7 files from that into BSS and proceeded with the ligand prep and fep as usual.

mark-mackey-cresset commented 2 years ago

Just a comment from our POV - we get generally-stable free energy simulations using LangevinMiddle with a hydrogen mass reweighting of 1.5 at 4fs - might be worth your giving that a try?

jmichel80 commented 2 years ago

hi @mark-cresset as far as I can tell this is the setting that is being tested (HMR x1.5 LangevinMiddle 4 fs). We have yet to ge to the bottom of this, but it appears the protein preparation step has an impact on the stability of the downstream FEP simulations.

michellab / Sire

Numerical instability at 4 fs with the LangevinMiddleIntegrator #370