Teohoho opened this issue 5 years ago (status: Open)
We've noticed this as well: there seems to be a problem with the unbiasing of restraints in the "Computing restraint energies..." step. @andrrizzi and I have been working on a fix, but for now, we suggest you start with the restraint off (lambda_restraints starts at 0 in the fully interacting states, when the other lambda values are 1) and turn the restraint on (to 1) as the other lambdas are turned to 0. This means that unbiasing will not be used.
You can also use the --skipunbiasing command-line argument during analysis, but this may not correctly account for the effects of the restraint.
Hi @Teohoho. That step is done on the CPU. You can switch it to the GPU by changing this line to use "CUDA" or "OpenCL" instead, but you may end up with an even slower calculation in this case. The reason is that we recompute the restraint energies/distances using a "slice" of the system consisting of only the restrained atoms and the restraint force. The GPU shouldn't add much advantage in computing the energy (which is just a single harmonic term), but it should add some overhead when sending and receiving data. It's still worth a shot if you want to benchmark it. If you do, it would be very helpful if you let us know how it went!
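To illustrate why the energy itself is cheap to evaluate, here is a minimal sketch of a single harmonic term between two groups of restrained atoms. It assumes the restraint acts on the distance between the mass-weighted centroids of the two groups with energy (K/2) r^2. The function names are hypothetical, this is not YANK's code, and it ignores periodic imaging and physical units.

```python
import math


def centroid(positions, masses):
    """Mass-weighted centroid of a group of atoms (positions are 3-tuples)."""
    total_mass = sum(masses)
    return tuple(
        sum(m * p[axis] for p, m in zip(positions, masses)) / total_mass
        for axis in range(3)
    )


def harmonic_restraint_energy(group1, masses1, group2, masses2, spring_constant):
    """Energy (K/2) * r^2, with r the distance between the two centroids.

    Assumption: no periodic boundary conditions, consistent but unspecified units.
    """
    c1 = centroid(group1, masses1)
    c2 = centroid(group2, masses2)
    r_squared = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return 0.5 * spring_constant * r_squared


# Hypothetical coordinates: two receptor atoms and one ligand atom.
receptor = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ligand = [(1.0, 3.0, 0.0)]
energy = harmonic_restraint_energy(receptor, [1.0, 1.0], ligand, [12.0], spring_constant=2.0)
distance = math.sqrt(2.0 * energy / 2.0)  # invert (K/2) r^2 with K = 2
print(energy, distance)
```

The work per frame is a handful of multiplications, which is consistent with the point above that transferring data to and from a GPU could cost more than the computation it saves.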
As a workaround, as suggested above, you can start your simulation using a protocol with lambda_restraints starting at 0. Or you can use a flat-bottom restraint and use --skipunbiasing, as the bias introduced by an active flat-bottom restraint in the bound state is usually negligible if the radius is large enough.
Thank you @jchodera. I was thinking of using "--skipunbiasing", but I'm afraid that it will give an incorrect value for the Absolute Binding Energy. Also, when I run a simulation with "Online Analysis" turned on, I noticed YANK calculates the Absolute Binding Energy using MBAR. Is there a way I could get it to print out this value, so as to skip "analysis" entirely? Also, since I've already started this issue/thread, I'd like to tell you about some of the issues I've had while running YANK over the past few months. None of these are urgent, but I think they are important for the 1.0 release:
The examples you provide in the "yank-examples" package cannot be run as they are, as the "number_of_iterations" flag is no longer supported by YANK (at least in the version I'm using).
The "yaml" files YANK generates cannot be used to run the same simulation again. This might not be an issue, as this may not be the intended use of these files. Disregard if this is the case.
@andrrizzi I will run an analysis on a 100 iteration simulation of my system and will communicate the results. Do you think maybe lowering the number of restrained atoms (in the receptor part) will speed up the analysis?
EDIT: I have done the comparison benchmark. The results are:
It is a 15% increase in speed when using CUDA. For this particular analysis it might not be that noticeable, but I wonder for an analysis that took 4 days using CPU, how large a difference will the CUDA Analysis make.
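As a rough answer to that question, the sketch below projects the reported gain onto a 4-day CPU analysis. It assumes the gain measured on the 100-iteration benchmark carries over unchanged to the longer run, which the thread does not confirm. Because "15% increase in speed" can be read in two ways, both readings are computed. The function name is hypothetical.

```python
def projected_runtime_hours(cpu_hours, gain=0.15):
    """Project a CPU runtime onto the GPU under two readings of a 15% gain.

    Returns (hours if throughput is 1.15x, hours if wall time drops by 15%).
    """
    if_throughput_gain = cpu_hours / (1.0 + gain)
    if_time_reduction = cpu_hours * (1.0 - gain)
    return if_throughput_gain, if_time_reduction


cpu_hours = 4 * 24  # the 4-day analysis mentioned above
faster, shorter = projected_runtime_hours(cpu_hours)
print(f"throughput reading: {faster:.1f} h, time-reduction reading: {shorter:.1f} h")
print(f"saved: {cpu_hours - faster:.1f} to {cpu_hours - shorter:.1f} h")
```

Under either reading the saving is roughly 12 to 14 hours out of 96, so the analysis would still take about three and a half days.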
I think it might be an issue with the speed of MDTraj's implementation of image.
Thanks for the feedback! Those are known problems and we're planning to fix them eventually.
Is there a way I could get it to print out this value, so as to skip "analysis" entirely?
You can use Python to instantiate a Reporter and read the online analysis using this method: https://github.com/choderalab/yank/blob/master/Yank/multistate/multistatereporter.py#L1194 . Keep in mind that those values do not use the unbiasing, so it'll be equivalent to using --skipunbiasing. You'll also have to sum the complex and solvent free energies yourself and add the appropriate standard state correction to recover the value given by the analysis.
Do you think maybe lowering the number of restrained atoms (in the receptor part) will speed up the analysis?
As @404random said, this might help with the imaging of the trajectory, which is the slowest part of the analysis. It may also slightly speed up the computation of the centroids used to determine the harmonic restraint radius.
@andrrizzi I see. I will run a simulation with the same system but will reduce the number of restrained atoms to a minimum (3-4 atoms, as opposed to the 30-something I have now) and will get back to you on the difference in analysis time.
Sounds good. Consider also following the suggestion from @jchodera and starting your simulations with lambda_restraints: [0.0, ...]. That removes the need for unbiasing completely and the analysis should be much faster. If your ligand is not a weak binder, and you're not worried that it may dissociate during the simulated timescales, the calculation will be correct.
Ok, so should the "lambda_restraints" parameter start at 0, then go up to 1 before the other "lambda" values become less than 1, or should I start to increase the restraint parameter only when one of the other "lambda" values is equal to 0?
Also, it is important for my experiment that the ligand stay between two alpha-helices, so it drifting away might be an issue.
We don't have a clear answer to that question yet. Somewhat speculating, as long as lambda_restraints is 1.0 when you start turning off lambda_sterics from 1.0 to 0.0, it should be fine.
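One way to read that suggestion is sketched below: build a schedule in which the restraint goes from 0 to 1 while the ligand is still fully interacting, and only then turn off the other lambdas. This is an illustration, not a validated protocol. The function names, the number of states, and the linear spacing are arbitrary assumptions, and lambda_electrostatics is assumed as the name of the third lambda since it is not mentioned in this thread.

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop (n >= 2)."""
    step = (stop - start) / (n - 1)
    return [round(start + i * step, 6) for i in range(n)]


def restraint_first_schedule(n_restraint=3, n_electrostatics=3, n_sterics=5):
    """Restraint on first, then electrostatics off, then sterics off."""
    # Stage 1: switch the restraint on while the ligand is fully interacting.
    restraints = linspace(0.0, 1.0, n_restraint)
    electrostatics = [1.0] * n_restraint
    sterics = [1.0] * n_restraint
    # Stage 2: electrostatics 1 -> 0 (skip the first point, which is already present).
    stage = linspace(1.0, 0.0, n_electrostatics)[1:]
    restraints += [1.0] * len(stage)
    electrostatics += stage
    sterics += [1.0] * len(stage)
    # Stage 3: sterics 1 -> 0 with the restraint held at 1.
    stage = linspace(1.0, 0.0, n_sterics)[1:]
    restraints += [1.0] * len(stage)
    electrostatics += [0.0] * len(stage)
    sterics += stage
    return {
        "lambda_restraints": restraints,
        "lambda_electrostatics": electrostatics,
        "lambda_sterics": sterics,
    }


def restraint_on_before_sterics_off(schedule):
    """True if lambda_restraints is 1.0 in every state where lambda_sterics < 1.0."""
    return all(
        r == 1.0
        for r, s in zip(schedule["lambda_restraints"], schedule["lambda_sterics"])
        if s < 1.0
    )


schedule = restraint_first_schedule()
for name, values in schedule.items():
    print(f"{name}: {values}")
print(restraint_on_before_sterics_off(schedule))
```

The check function encodes only the condition stated above (restraint at 1.0 once lambda_sterics starts to decrease). Whether the restraint should instead be ramped up alongside the other lambdas is the open question in this thread.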
Also, it is important for my experiment that the ligand stay between two alpha-helices, so it drifting away might be an issue.
I see. You could run a short calculation, check the trajectory of the first state to see whether there's too much movement, and extend the simulation if the movement is acceptable.
Thank you @andrrizzi, you have been most helpful!
Hello again! I am having an issue with running the analysis on one of my simulations. It is tangentially related to this issue, but if you consider it not to be, I will gladly open another issue.
When I try to analyze the output of one of my simulations, this is my output:
"2019-02-15 10:54:44,677: Reading energies...
2019-02-15 10:54:44,760: Done.
2019-02-15 10:54:44,764: Assembling effective timeseries...
2019-02-15 10:54:44,795: Done.
2019-02-15 10:54:44,804: Reading energies...
2019-02-15 10:54:44,888: Done.
2019-02-15 10:54:44,895: Assembling effective timeseries...
2019-02-15 10:54:44,927: Done.
2019-02-15 10:54:44,958: Reading energies...
2019-02-15 10:54:45,009: Done.
2019-02-15 10:54:45,012: Assembling effective timeseries...
2019-02-15 10:54:45,046: Done.
2019-02-15 10:54:45,050: Reading energies...
2019-02-15 10:54:45,108: Done.
2019-02-15 10:54:45,110: Assembling effective timeseries...
2019-02-15 10:54:45,133: Done.
2019-02-15 10:54:47,355: Reading energies...
2019-02-15 10:54:47,511: Done.
2019-02-15 10:54:47,515: Assembling effective timeseries...
2019-02-15 10:54:47,540: Done.
2019-02-15 10:54:47,566: Checking if we need to unbias the restraint...
2019-02-15 10:54:47,566: Trying to get radially symmetric restraint data...
2019-02-15 10:54:47,566: Retrieving end thermodynamic states...
2019-02-15 10:54:52,185: Isolating restraint force...
2019-02-15 10:54:53,357: Deep copying restraint force...
2019-02-15 10:54:53,358: Retrieving particle masses...
2019-02-15 10:54:53,358: Done.
2019-02-15 10:54:53,362: Found HarmonicRestraintForce restraint. The restraint will be unbiased.
2019-02-15 10:54:53,362: Computing restraint energies...
HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 140098480371456:
major: Dataset
minor: Read failed
major: Dataset
minor: Read failed
major: Low-level I/O
minor: Read failed
major: Data filters
minor: Filter operation failed
major: Data filters
minor: Read failed
major: Data filters
minor: Unable to initialize object
Traceback (most recent call last):
File "/home/teo/miniconda3/envs/env_python3.6/lib/python3.6/site-packages/yank/multistate/multistateanalyzer.py", line 404, in get
value = instance._cache[self.name]
KeyError: 'mbar'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/teo/miniconda3/envs/env_python3.6/lib/python3.6/site-packages/yank/multistate/multistateanalyzer.py", line 404, in get
value = instance._cache[self.name]
KeyError: 'unbiased_decorrelated_u_ln'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/teo/miniconda3/envs/env_python3.6/bin/yank", line 11, in
I must mention that this simulation has been stopped and restarted around 4 times, and the yaml script was modified (I wanted fewer iterations than initially planned). What can I do to solve this issue and run my analysis on the simulation? Thanks!
UPDATE: If I run the same command with the "--skipunbiasing" flag, it runs successfully!
I've been using YANK (0.23.4) to simulate a protein-ligand system (6069 atoms and water box and around 100 replicas for both the complex and solvent phase), on a system with two GTX 1080Ti GPUs, and so far it's going great! However, when I want to analyze the simulation, I notice it takes much, much more time than it takes to simulate the system itself! For example, for 1200 iterations, the YANK analyze log is as follows:
"2019-02-06 10:37:43,040: Reading energies...
2019-02-06 10:37:43,179: Done.
2019-02-06 10:37:43,183: Assembling effective timeseries...
2019-02-06 10:37:43,207: Done.
2019-02-06 10:37:43,211: Reading energies...
2019-02-06 10:37:43,352: Done.
2019-02-06 10:37:43,356: Assembling effective timeseries...
2019-02-06 10:37:43,380: Done.
2019-02-06 10:37:43,396: Reading energies...
2019-02-06 10:37:43,467: Done.
2019-02-06 10:37:43,470: Assembling effective timeseries...
2019-02-06 10:37:43,490: Done.
2019-02-06 10:37:43,491: Reading energies...
2019-02-06 10:37:43,559: Done.
2019-02-06 10:37:43,561: Assembling effective timeseries...
2019-02-06 10:37:43,581: Done.
2019-02-06 10:37:46,057: Reading energies...
2019-02-06 10:37:46,214: Done.
2019-02-06 10:37:46,218: Assembling effective timeseries...
2019-02-06 10:37:46,260: Done.
2019-02-06 10:37:46,288: Checking if we need to unbias the restraint...
2019-02-06 10:37:46,288: Trying to get radially symmetric restraint data...
2019-02-06 10:37:46,288: Retrieving end thermodynamic states...
2019-02-06 10:37:54,979: Isolating restraint force...
2019-02-06 10:37:56,414: Deep copying restraint force...
2019-02-06 10:37:56,416: Retrieving particle masses...
2019-02-06 10:37:56,416: Done.
2019-02-06 10:37:56,420: Found HarmonicRestraintForce restraint. The restraint will be unbiased.
2019-02-06 10:37:56,420: Computing restraint energies...
2019-02-10 12:55:54,293: Restraint energy mean: 1.0667663520787305 kT; std: 0.7953572883898831 kT
2019-02-10 12:55:54,294: Reading energies...
2019-02-10 12:55:54,563: Done.
2019-02-10 12:55:54,569: Assembling effective timeseries...
2019-02-10 12:55:54,623: Done.
2019-02-10 12:55:54,652: Assembling uncorrelated energies...
2019-02-10 12:55:54,955: Found expanded cutoff states in the energies!
2019-02-10 12:55:54,955: Free energies will be reported relative to them instead!
2019-02-10 12:55:54,956: Done.
2019-02-10 12:55:57,074: Done.
2019-02-10 12:55:57,080: Computing free energy differences...
2019-02-10 12:56:13,299: Done.
2019-02-10 12:56:13,300: Computing covariance matrix...
(Followed by the Deltaf_ij matrices for the solvent and complex phases)"
As you can see, computing the restraint energies took around 4 days. I noticed that the analysis is done on one CPU. Can I somehow run this analysis on GPU? Thanks!
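The "around 4 days" figure can be checked against the two timestamps that bracket the "Computing restraint energies..." step in the log quoted above. A small sketch using only the standard library (the helper name is hypothetical):

```python
from datetime import datetime

# Timestamp layout used in the log: date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_between(start_stamp, end_stamp):
    """Return the timedelta between two log timestamps."""
    start = datetime.strptime(start_stamp, LOG_TIME_FORMAT)
    end = datetime.strptime(end_stamp, LOG_TIME_FORMAT)
    return end - start


# "Computing restraint energies..." and the next log entry after it.
step = elapsed_between("2019-02-06 10:37:56,420", "2019-02-10 12:55:54,293")
print(step, f"({step.total_seconds() / 3600:.1f} hours)")
```

This gives a little over 4 days and 2 hours for that single step, while every other step in the log finishes in seconds.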