Closed kexul closed 3 years ago
Hi @kexul. The equilibrate
option for somd-freenrgy
doesn't perform a standard molecular dynamics equilibration, rather performs an equlibrate-to-lambda stage where the perturbation is slowly applied from zero to the chosen lambda value prior to running the main part of the simulation. I think @jmichel80 would have a better idea, but my understanding is that this was found to be unreliable in practice and it was easier to just discard the initial portion of the simulation output when post-processing the data to reconstruct free energies, hence why it is set to False
. It also doesn't support constraints, etc. which would be expected for a regular MD equilibration.
You are indeed correct that it is good practice to perform equilibration protocols prior to your main simulation. We make no assumptions that the system passed to SOMD is equlibrated and it is up to the user to make sure that it is. The way that BioSimSpace has been written means that you can use all protocols with a perturbable system. Under the hood we extract the lambda=0 state of the system, run the simulation protocol, then copy updated coordinates and box information back into the system when you call .getSystem()
on a Process
object. (Since we now support read/writing perturbable systems you could also setup the simulation within BioSiimSpace, run it externally, then reconstruct the merge system with the new coordnates.) If needed, you can also choose to use the lambda=1 system instead. Ideally we would also support intermediate values of lambda but the way in which the perturbation is applied isn't consistent between the engines.
The tutorial execution model that @jmichel80 pointed you to in an other issue thread does things slightly differently. Here it minimises and equilibrates the subcomponents of the merged system separately, e.g. the protein plus one ligand in solvent, then extracts the components and re-merges to form the final system. This was needed since we didn't support read/write of perturbable systems at the time of writing and makes life simpler when running sub-stages of the equlibration outside of BioSimSpace.
Thanks for the detailed information and example @lohedges , things are much clear now.
Hi @lohedges , I find the tutorial is quite complicated and I'm here looking for help again.
I would like to clarify the status quo with some code snippets, here is the code I've used earlier:
import BioSimSpace as _BSS
lig1 = ['CHEMBL92812.prmtop', 'CHEMBL92812.inpcrd']
lig2 = ['CHEMBL93461.prmtop', 'CHEMBL93461.inpcrd']
lig1 = _BSS.IO.readMolecules(lig1)[0]
lig2 = _BSS.IO.readMolecules(lig2)[0]
pert = _BSS.Align.merge(lig1, lig2)
pro = _BSS.IO.readMolecules('pro.*')
sys = pert + pro
solvated_sys = _BSS.Solvent.tip3p(sys, box=3*[8*_BSS.Units.Length.nanometer], work_dir='solvation')
free_energy_protocol = _BSS.Protocol.FreeEnergy(num_lam=3)
_BSS.FreeEnergy.Binding(solvated_sys, free_energy_protocol, work_dir='free_energy', engine='SOMD', setup_only=True, ignore_warnings=True)
# Then we could collect the files generated in lambda_* and distribution them to HPC, run simulation with somd-freenrg
We've agreed it's better to equilibrate the system before main simulation,
The way that BioSimSpace has been written means that you can use all protocols with a perturbable system. Under the hood we extract the lambda=0 state of the system, run the simulation protocol, then copy updated coordinates and box information back into the system when you call .getSystem() on a Process object.
according to your words, I wrote the following code:
import BioSimSpace as _BSS
lig1 = ['CHEMBL92812.prmtop', 'CHEMBL92812.inpcrd']
lig2 = ['CHEMBL93461.prmtop', 'CHEMBL93461.inpcrd']
lig1 = _BSS.IO.readMolecules(lig1)[0]
lig2 = _BSS.IO.readMolecules(lig2)[0]
pert = _BSS.Align.merge(lig1, lig2)
pro = _BSS.IO.readMolecules('pro.*')
sys = pert + pro
solvated_sys = _BSS.Solvent.tip3p(sys, box=3*[8*_BSS.Units.Length.nanometer], work_dir='solvation')
minimise_protocol = _BSS.Protocol.Minimisation()
minimised_proc = _BSS.Process.Amber(solvated_sys, minimise_protocol, work_dir='minimization')
minimised_proc.start()
minimised_proc.wait()
minimised_sys = minimised_proc.getSystem()
equilibration_protocol = _BSS.Protocol.Equilibration()
equilibrated_proc = _BSS.Process.Amber(minimised_sys , equilibration_protocol, work_dir='equilibration')
equilibrated_proc.start()
equilibrated_proc.wait()
equilibrated_sys = equilibrated_proc.getSystem()
free_energy_protocol = _BSS.Protocol.FreeEnergy(num_lam=3)
sys = _BSS.FreeEnergy.Binding(equilibrated_sys, free_energy_protocol, work_dir='free_energy', engine='SOMD', setup_only=True, ignore_warnings=True)
# Then we could collect the files generated in lambda_* and distribution them to HPC, run simulation with somd-freenrg
Would these work as you expected?
Here is the input files I've used and the code stopped in minimization step, sander said: Error in numextra_test
.
pert.zip
Hi there. I think there must be a bug in the Sire.IO.AmberPrm
parser that isn't correctly determining the number of virtual sites. This was a previous issue for TIP4/5P water topologies that I fixed, but I imagine that it's still not correctly account for non-water dummy atoms. When I get a chance I'll see if I can figure out what's going wrong.
In the mean time, could you try using a different MD engine, e.g. GROMACS. If you are just using sander then the equilibration would be painfully slow anyway.
(Unrelated to your numextra
issue). Just to note that in your script above you are using solvated_sys
twice, rather than passed minimised_sys
to the equlibration, so it would likely fail anyway.
This issue is because we're using du
as the AMBER atom type for dummy atoms, which isn't defined anywhere. Replacing this with a known name, such as EP
(which is used for TIP4P
water) works. However, this is probably undesirable.
Perhaps @jmichel80 might know the best solution here? This isn't a zero-mass dummy atom so it shouldn't cause problems with the AMBER integrator.
Wow, ultrafast support as always! Thanks so much!
(Unrelated to your numextra issue). Just to note that in your script above you are using solvated_sys twice, rather than passed minimised_sys to the equlibration, so it would likely fail anyway.
Thanks for pointing that out, I've corrected the typo.
In the mean time, could you try using a different MD engine, e.g. GROMACS. If you are just using sander then the equilibration would be painfully slow anyway.
Yes! Gromacs worked. Thank BioSimSpace so that we could switch between different engine painlessly. 🎉
I've tested other MD engine, OpenMM
and SOMD
did not work either.
Output of OpenMM:
/root/group_ceph/FEP/prepare/BioSimSpace/Process/_process.py:606: UserWarning: The system contains a perturbable molecule .We will assume that you intend to simulate the lambda = 0 state. If you want to simulate the lambda = 1 state, then pass {'is_lambda1' : True} in the 'property_map' argument.
_warnings.warn("The system contains a perturbable molecule ."
Traceback (most recent call last):
File "/root/group_ceph/FEP/prepare/test_case/git/run.py", line 16, in <module>
minimised_proc = _BSS.Process.OpenMM(solvated_sys, minimise_protocol, work_dir='minimization')
File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_openmm.py", line 172, in __init__
self._setup()
File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_openmm.py", line 223, in _setup
raise IOError(msg) from None
OSError: Failed to write system to 'PRM7' format.
Output of SOMD in somd.out
in minimization step:
Traceback (most recent call last):
File "/data/miniconda3/envs/uii/share/Sire/scripts/somd.py", line 125, in <module>
OpenMMMD.run(params)
File "/data/miniconda3/envs/uii/lib/python3.7/site-packages/Sire/Tools/__init__.py", line 176, in inner
retval = func()
File "/data/miniconda3/envs/uii/lib/python3.7/site-packages/Sire/Tools/OpenMMMD.py", line 1504, in run
system = integrator.minimiseEnergy(system, minimise_tol.val, minimise_max_iter.val)
RuntimeError: Particle coordinate is nan
Thanks, I'll take a look. The OpenMM error is strange given that it writing the same AMBER files as Process.Amber
and Process.Somd
would.
Okay, fixed the OpenMM issue. Now it reports the same nan
error as SOMD, which is what I'd expect, since SOMD uses OpenMM behind the scenes. I've found that OpenMM can be a lot more sensitive to bad contacts, or poorly minimised / equilibrated structures. I imagine that you might be able to minimise with something else, e.g. GROMACS, then equilibrate with OpenMM afterwards.
Hi @kexul would you be happy to email me directly ? I think it could be useful to have a separate discussion about your ongoing work with BioSimSpace.
I imagine that you might be able to minimise with something else, e.g. GROMACS, then equilibrate with OpenMM afterwards.
Thanks for the quick fix @lohedges . 🎉
Hi @kexul would you be happy to email me directly ? I think it could be useful to have a separate discussion about your ongoing work with BioSimSpace.
Of course @jmichel80, I'll email you directly then.
Hi, I found that the
equilibrate = False
insomd.cfg
generated by default free energy protocol, and there is no parameter inProtocol.FreeEnergy
to alter it.I'm new to
somd
and I was usingGromacs
earlier, I used to believe that we need to doNVT
andNPT
equilibration beforeProduction
.Is the system generated by BioSimSpace (i.e. somd.prm7 and somd.rst7) already equilibrated or we do not need to equilibrate the system when using
somd
?