Closed marionatf closed 8 years ago
Yes, remove the first command %pylab inline
. That command tells notebooks to not open the plots and instead plot them "inline".
It's already removed
You might need to restart the python session if you called it before.
How are you running the tutorial?
I'm running it as the following python script:
from htmd import *
sets = glob('datasets/*/')
sims = []
for s in sets:
fsims = simlist(glob(s + '/filtered/*/'), 'datasets/1/filtered/filtered.pdb')
sims = simmerge(sims, fsims)
metr = Metric(sims)
metr.projection(MetricDistance('protein and name CA', 'resname MOL and noh', metric='contacts'))
data = metr.project()
data.fstep = 0.1
data.plotTrajSizes()
data.dropTraj()
tica = TICA(data, 20)
dataTica = tica.project(3)
dataBoot = dataTica.bootstrap(0.8)
dataBoot.cluster(MiniBatchKMeans(n_clusters=1000), mergesmall=5)
model = Model(dataBoot)
model.plotTimescales()
model.markovModel(50, 5)
model.viewStates(ligand='resname MOL and noh')
kin = Kinetics(model, temperature=298, concentration=0.0037)
r = kin.getRates()
print(r)
kin.plotRates()
kin.plotFluxPathways()
You should be getting the plots. Does matplotlib
work fine on your computer, @marionatf?
If this script works, so should plotRates()
and plotFluxPathways
from matplotlib import pylab as plt
plt.figure()
plt.plot([1,2,3,4])
plt.show()
We still have problems to visualize plots from a python script as @marionatf reported. We have tried with the following changes in the script:
from htmd import *
import time
from IPython import get_ipython
get_ipython().magic('pylab auto')
htmd.config(viewer='vmd')
sims = simlist(glob('input/*/'), glob('input/*/structure.pdb'), glob('input/*/'))
fsims = simfilter(sims, './filtered/', filtersel='not water')
sims = simmerge(sims, fsims)
# Calculating metrics
metr = Metric(sims)
metr.projection(MetricDistance('protein and name CA', 'resname MOL and noh', metric='contacts'))
data = metr.project()
data.fstep = 0.1
# Removing trajectories
data.plotTrajSizes() #not working
# TICA
tica = TICA(data, 20)
dataTica = tica.project(3)
# Bootstrapping
dataBoot = dataTica.bootstrap(0.8)
# Clustering conformaitons
dataBoot.cluster(MiniBatchKMeans(n_clusters=1000), mergesmall=5)
# Building the Markov model
model = Model(dataBoot)
model.markovModel(50, 5)
model.plotTimescales() #not working
# Visualizing the states
model.viewStates(ligand='resname MOL and noh') # This gives problems
# Calculating the kinetics
kin = Kinetics(model, temperature=298, concentration=0.0037)
r = kin.getRates()
print(r)
kin.plotRates() #not working
time.sleep(6000)
However, plots are still imposible to visualize. A part from pylab 'auto' we also have tried with' tk' and did not work either.
Then, we have tried to run the script as a jupyter notebook from the GPU machines provided. Plots only work from jupyter notebook if a macbook opens the notebook in the GPU machines . Non of the other PC can open the jupyter from there. But at least in this case we could visualize the plots. However, the plot looked awful. So it seems that we also have a problem regarding our data (the trajectories that we have generated). Here I post a screenshot of the plot of the timescales and the error reported just after it.
1- Data Issue: Any idea of what is happening with the timescale plot?
2- Visualization Issue: The error reported below the plot is the same we have obtained when we were executing the program as a python script. At the beginning of the execution track seems that matplotlib might have something to do with our visualization issues but we have tried the commands from @j3mdamas latest post and they are working fine.
Error reported when executing as a python script:
Using matplotlib backend: Qt4Agg
Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['random']
`%matplotlib` prevents importing * from pylab and numpy
2016-06-22 13:52:20,286 - htmd.simlist - INFO - Starting listing of simulations.
Creating simlist: 100% (8/8) [###################################################################################################################################################] eta 00:01 -
2016-06-22 13:52:20,288 - htmd.simlist - INFO - Finished listing of simulations.
2016-06-22 13:52:21,097 - htmd.simlist - INFO - Starting filtering of simulations.
[Parallel(n_jobs=-2)]: Done 1 tasks | elapsed: 0.0s
[Parallel(n_jobs=-2)]: Batch computation too fast (0.0037s.) Setting batch_size=108.
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 9 out of 8 | elapsed: 0.0s remaining: -0.0s
[Parallel(n_jobs=-2)]: Done 8 out of 8 | elapsed: 0.0s finished
2016-06-22 13:52:21,295 - htmd.simlist - INFO - Finished filtering of simulations
2016-06-22 13:52:21,295 - htmd.projections.metric - INFO - Metric: Starting projection of trajectories.
[Parallel(n_jobs=-2)]: Done 1 tasks | elapsed: 1.8s
[Parallel(n_jobs=-2)]: Done 16 out of 16 | elapsed: 26.5s finished
2016-06-22 13:52:47,933 - htmd.projections.metric - INFO - Finished projecting the trajectories.
2016-06-22 13:52:47,934 - htmd.projections.metric - INFO - Frame step 0.0ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
getting output of TICA: 100% (16/16) [###########################################################################################################################################] eta 00:00 |/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1279: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, init_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1300: RuntimeWarning: init_size=300 should be larger than k=1000. Setting it to 3*k
init_size=init_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:630: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, init_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:630: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, init_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:630: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, init_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
/home/student1/miniconda3/lib/python3.5/site-packages/sklearn/cluster/k_means_.py:1328: DeprecationWarning: This function is deprecated. Please call randint(0, 5999 + 1) instead
0, n_samples - 1, self.batch_size)
2016-06-22 13:52:58,292 - htmd.metricdata - INFO - Mergesmall removed 42 clusters. Original ncluster 49, new ncluster 7.
2016-06-22 13:52:58,851 - htmd.model - INFO - 100.0% of the data was used
2016-06-22 13:52:58,855 - htmd.model - INFO - Number of trajectories that visited each macrostate:
2016-06-22 13:52:58,855 - htmd.model - INFO - [ 4 5 7 5 12]
22-06-16 13:52:59 pyemma.msm.estimators.maximum_likelihood_msm.MaximumLikelihoodMSM[1] WARNING Ignored error during estimation: Active set is empty. Cannot estimate MSM.
2016-06-22 13:52:59,078 - pyemma.msm.estimators.maximum_likelihood_msm.MaximumLikelihoodMSM[1] - WARNING - Ignored error during estimation: Active set is empty. Cannot estimate MSM.
22-06-16 13:52:59 pyemma.msm.estimators.maximum_likelihood_msm.MaximumLikelihoodMSM[2] WARNING Ignored error during estimation: Active set is empty. Cannot estimate MSM.
2016-06-22 13:52:59,096 - pyemma.msm.estimators.maximum_likelihood_msm.MaximumLikelihoodMSM[2] - WARNING - Ignored error during estimation: Active set is empty. Cannot estimate MSM.
estimating MaximumLikelihoodMSM: 100% (26/26) [##################################################################################################################################] eta --:-- /22-06-16 13:52:59 pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] WARNING Estimation has failed at lagtimes: [458 499]. Run single-lag estimation at these lags to track down the error.
2016-06-22 13:52:59,204 - pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] - WARNING - Estimation has failed at lagtimes: [458 499]. Run single-lag estimation at these lags to track down the error.
22-06-16 13:52:59 pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] WARNING Changed user setting nits to the number of available timescales nits=6
2016-06-22 13:52:59,239 - pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] - WARNING - Changed user setting nits to the number of available timescales nits=6
22-06-16 13:52:59 pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] WARNING Some timescales could not be computed. Timescales array is smaller than expected or contains NaNs
2016-06-22 13:52:59,239 - pyemma.msm.estimators.implied_timescales.ImpliedTimescales[1] - WARNING - Some timescales could not be computed. Timescales array is smaller than expected or contains NaNs
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/home/student1/isentis/analysis2.py in <module>()
35
36 # Visualizing the states
---> 37 model.viewStates(ligand='resname MOL and noh') # This gives problems
38
39 # Calculating the kinetics
/home/student1/miniconda3/lib/python3.5/site-packages/htmd/model.py in viewStates(self, states, statetype, protein, ligand, viewer, mols, numsamples, wrapsel, alignsel)
402 states = [states]
403 if mols is None:
--> 404 mols = self.getStates(states, statetype, numsamples=numsamples, wrapsel=wrapsel, alignsel=alignsel)
405 colors = [0, 1, 3, 4, 5, 6, 7, 9]
406 for i, s in enumerate(states):
/home/student1/miniconda3/lib/python3.5/site-packages/htmd/model.py in getStates(self, states, statetype, wrapsel, alignsel, alignmol, samplemode, numsamples)
329 refmol = None
330 if not single:
--> 331 raise NameError('Visualizer does not support yet visualization of systems with different number of atoms')
332 if alignmol is None:
333 alignmol = molfile
NameError: Visualizer does not support yet visualization of systems with different number of atoms
(sorry for the long post)
So beyond the plotting issue which I have no clue, you have two problems with the timescales and the visualizer error.
simfilter
. That should give you fixed number of atoms for all of your systems and fix the error.actually, for trypsin you should be able to produce decent results in two days of runs
On 22 June 2016 at 14:52, Stefan notifications@github.com wrote:
So beyond the plotting issue which I have no clue, you have two problems with the timescales and the visualizer error.
1.
Fixing the visualizer error. Your systems have different number of atoms, therefore the visualizer cannot load them into VMD. What you should do is filter the waters out of the systems with simfilter https://www.htmd.org/docs/htmd.simlist.html#htmd.simlist.simfilter . That should fix it. 2.
Horrible timescales. Well, you have 8 trajectories 😄 Markov models need looooooots of data to work well for complex ligand binding systems (hundreds or thousands of trajectories of multiple nanoseconds). Don't expect to get good timescales with the resources you were given, otherwise we would have cured cancer by now 😛 . Practically I doubt you can even build a Markov model with that data. As the PyEMMA logger says, there were no states left in the active set, meaning that your states at the given lag-times were disconnected.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Acellera/htmd/issues/67#issuecomment-227733268, or mute the thread https://github.com/notifications/unsubscribe/AHaqOs6qTf9ne_7JvgqjnNYHkzDoqnOwks5qOTAIgaJpZM4I4beX .
<https://twitter.com/acellera>
https://www.youtube.com/user/acelleracom https://www.linkedin.com/company/2133167?trk=tyah&trkInfo=clickedVertical%3Acompany%2CclickedEntityId%3A2133167%2Cidx%3A2-1-2%2CtarId%3A1448018583204%2Ctas%3Aacellera https://www.acellera.com/md-simulation-blog-news/ http://is.gd/1eXkbS
We are using:
md = AdaptiveRun()
md.nmin=6
md.nmax=8
md.nepochs = 12
with 5 different systems initialized with docking (5 folders of generators). Is this ok? Is there any better/faster way? Should we increase the number of epochs?
These settings depend on how many GPUs you have dedicated to you. 5 generators is okay I guess. I doubt you have reached 12 epochs yet.
We have 4 GPUs. And, by the way, to stop the run...should we do Control+C ? Because we have understood that it will run forever until it is stopped. Should we killed it when the 12 epoch folders are present?
Thank you all for your answers, we know some of them are really basic.
At 12 epochs it should stop by itself. To kill it you can Ctrl+C but this will only kill the adaptive, not the currently running simulations which you might have to kill from the process manager or alternatively by killing the python session (if you are using AcemdLocal).
About the visualization error saying we have different number of atoms, we think we are already filtering the waters out, but still we get the error. Maybe we are doing it wrong, we are using:
sims = simlist(glob('input/*/'), glob('input/*/structure.pdb'), glob('input/*/'))
fsims = simfilter(sims, './filtered/', filtersel='not water')
sims = simmerge(sims, fsims)
Yes, you should not merge the filtered with the unfiltered simulations. That beats the point of using the filtered simulations. So just use the fsims.
Actually the plotting issue was caused by some code in HTMD.
For the moment because we won't make a new release soon find please go to the folder of your conda installation and find the file
lib/python3.5/site-packages/htmd/parameterize/__init__.py
And remove the lines
import matplotlib as mpl
mpl.use('Agg')
Hello, We are trying to follow the tutorial "ligand binding analysis" as a python script, but we can't use the methods that create plots (plotTrajSizes, plotTimescales, getRates): when we call them, an empty window appears, without any plot in it. Is there a way to obtain these plots without using python notebook? Thanks, Mariona