Closed DanielWicz closed 3 years ago
Hi, I can think of two issues.
One is the simulation time. 80ps is under what we tend to consider even the sampling rate for ACEMD3 (which is 100ps). It's true that the system is very fast but we have never evaluated adaptive on systems with such short simulation times. Case in point, if you look at our original adaptive sampling paper https://pubs.acs.org/doi/10.1021/ct400919u Figure 1 you can see that decreasing trajectory length has a negative effect on the performance even if you increase the amount of trajectories.
The other would be TICA. I would not use TICA on this system because a) you know already more or less the best reaction coordinate, the dihedral angles b) TICA also performs better on much longer trajectories where the lag-time can be larger.
There are more problems now that I look at it more. You are minimizing at the start of each trajectory which is not a good idea. Additionally another problem with 80ps simulations is that adaptive doesn't store velocities. This means that (at least if you use ACEMD3) velocities are randomized at the start of each trajectory. So every 80ps you get random velocities which is very bad. In your case I'm not sure what OpenMM does with the velocities but both zeroing and randomizing them is going to contribute negatively.
Thanks for quick reply Stefan
Hi, I can think of two issues.
One is the simulation time. 80ps is under what we tend to consider even the sampling rate for ACEMD3 (which is 100ps). It's true that the system is very fast but we have never evaluated adaptive on systems with such short simulation times.
What do you mean by the sampling rate of ACEMD3 ? It saves trajectory to the file by every 100 ps ? How it looks if you use OpenMM (or other software) like me ? It is hard-coded in ACEMD3 or HTMD ?
Case in point, if you look at our original adaptive sampling paper https://pubs.acs.org/doi/10.1021/ct400919u Figure 1 you can see that decreasing trajectory length has a negative effect on the performance even if you increase the amount of trajectories.
I agree, Especially that TICA does struggle with convergence when simulation is super-short, and the required length grows about the square of the system size/input features.
The other would be TICA. I would not use TICA on this system because a) you know already more or less the best reaction coordinate, the dihedral angles b) TICA also performs better on much longer trajectories where the lag-time can be larger.
How do you set the AdaptiveBandit/Sampling's field projection
to None
? But it is by default None
, so maybe something else ?
There are more problems now that I look at it more. You are minimizing at the start of each trajectory which is not a good idea. Additionally another problem with 80ps simulations is that adaptive doesn't store velocities.
I see! Yea, that's makes things terrible if the system is minimized each epoch and makes adaptive sampling screwed. Thanks for this important notice!
This means that (at least if you use ACEMD3) velocities are randomized at the start of each trajectory. So every 80ps you get random velocities which is very bad. In your case I'm not sure what OpenMM does with the velocities but both zeroing and randomizing them is going to contribute negatively.
What do you mean with the bad influence of the velocities ? In general if you won't reinitialize your thermostat (velocities) after epoch i, then at epoch i+1, the simulation will fallow the same path, as before. Thus it should decrease sampling efficiency of the adaptive sampling.
In your case the saving_frequency
is 100 which I assume is in steps, so 100 * 0.004 = 4ps sampling rate. Yes ACEMD3 saves by default every 100ps but you can change it.
How do you set the AdaptiveBandit/Sampling's field projection to None ? But it is by default None, so maybe something else ?
I think you meant how to disable TICA? You can't disable the projection. Set ab.ticadim = 0
to disable TICA (should be mentioned in the docs I hope).
What do you mean with the bad influence of the velocities ? In general if you won't reinitialize your thermostat (velocities) after epoch i, then at epoch i+1, the simulation will fallow the same path, as before. Thus it should decrease sampling efficiency of the adaptive sampling.
No, adaptive does not spawn all simulations from the same conformation. After the first epoch of simulations it looks at all their frames and tries to find the most interesting conformations from which to start new simulations from. It rarely if ever starts two simulations from the same conformation after the first epoch is done.
But beyond that, no I didn't mean to not reinitialize the thermostat. You need to reinitialize the thermostat because we lose the velocities of the simulations. But if you do it too often you are ruining the simulations. Imagine the most extreme case where your simulation is 1 step long. Then at every step you would be randomizing the velocities with adaptive and as you can imagine it's not MD anymore. It's just some randomized atom movement. 80ps is of course much more than 1 step but in the larger image you are still resetting the velocities too often I believe.
I think I've found another problem - shouldn't the generator accept input from the previous simulation ? Like it should read e.g. structure.pdb
of the AdaptiveBandit/Sampling from the current epoch and start simulation from it. Then how HTMD saves the current epoch structure if I don't use ACEMD ? I should just load structure.pdb
in the generator (as in ACEMD in the tutorial), then HTMD will overwrite it in every epoch ?
Not the structure.pdb
but instead you should always read the input.coor
file for starting coordinates. It's the file which adaptive sampling writes for each new job. This is all automatically done if you use HTMD+ACEMD but you need to take care since you are doing it in OpenMM.
Not the structure.pdb but instead you should always read the
input.coor
file for starting coordinates. It's the file which adaptive sampling writes for each new job. This is all automatically done if you use HTMD+ACEMD but you need to take care since you are doing it, in OpenMM.
There is possibility to change the format from coor to something else ? Because I don't see a way of loading the coor file easily (neither OpenMM nor MDTraj support it).
Have you tried the tutorials on AdaptiveBandit first? Best is to start from there
On Wed, 15 Sept 2021 at 09:58, Stefan Doerr @.***> wrote:
Not the structure.pdb but instead you should always read the input.coor file for starting coordinates. It's the file which adaptive sampling writes for each new job. This is all automatically done if you use HTMD+ACEMD but you need to take care since you are doing it, in OpenMM.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Acellera/htmd/issues/1010#issuecomment-919789366, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3KUOTWPLIFSEYT4HW5YVDUCBG3FANCNFSM5D7UL4DQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
<https://twitter.com/acellera>
https://www.youtube.com/user/acelleracom https://www.linkedin.com/company/2133167?trk=tyah&trkInfo=clickedVertical%3Acompany%2CclickedEntityId%3A2133167%2Cidx%3A2-1-2%2CtarId%3A1448018583204%2Ctas%3Aacellera https://www.acellera.com/md-simulation-blog-news/ http://is.gd/1eXkbS
@DanielWicz https://software.acellera.com/docs/latest/htmd/htmd.adaptive.adaptivebandit.html argument coorname
. Just set it to input.pdb
if you don't like coor
files.
But you can also use HTMD to read the coor file into a numpy array.
from htmd.ui import Molecule
mol = Molecule("input.coor")
print(mol.coords)
@giadefa
Have you tried the tutorials on AdaptiveBandit first? Best is to start from there …
Those below + docs https://software.acellera.com/docs/latest/htmd/tutorials/adaptive-bandit.html https://software.acellera.com/docs/latest/htmd/tutorials/adaptive-sampling.html
@stefdoerr thanks! so it recognizes file-type by the extension, I will check in the moment
Yes Molecule.write()
writes the correct format by extension
I have fallowing problem. The Adaptive Bandit (and Adaptive Sampling with PCCA too) performs worse at sampling than classic Molecular Dynamics. In 2000 epochs (80 ps per epoch, so 160 ns in total) it can't reach right side of the (alpha_L and C7) of the Ramachandran plot.
In short the simulation parameters are fallowing: 1 epoch sim. length: 80 ps TICA dim: 1 TICA lagtime: 50 (20 ps) epsilon: 0.01 The rest is default (e.g. MSM parameters).
The question is: what is wrong ?
And the generator is 80 ps OpenMM simulation with saving frequency of 100 (every 0.4 ps). The script is: