msmbuilder / msmbuilder-legacy

Legacy release of MSMBuilder
http://msmbuilder.org
GNU General Public License v2.0

CalculateImpliedTimeScale failed with "k must be between 1 and ndim(A)-1" #420

Open Serilin opened 10 years ago

Serilin commented 10 years ago

Hi everyone,

I'm working on my system with msmbuilder 2.8.2. Clustering succeeded with the command "msmb Cluster rmsd hybrid -d 0.045 -l 50", but the command "CalculateImpliedTimescales -l 1,25 -i 1 -o Data/ImpliedTimescales.dat" failed with the following output:

{'assignments': 'Data/Assignments.h5', 'eigvals': 10, 'interval': 1, 'lagtime': '1,25', 'notrim': False, 'output': 'Data/ImpliedTimescales.dat', 'procs': 1, 'quiet': False, 'symmetrize': 'MLE'}
09:04:17 - Getting 10 eigenvalues (timescales) for each lagtime...
09:04:17 - Building MSMs at the following lag times: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
09:04:17 - Calculating implied timescales at lagtime 1
09:04:17 - Selected component 58 with population 0.080616
09:04:18 - BFGS likelihood maximization terminated after 17 function calls. Initial and final log likelihoods: -79.490250, -78.928102.
09:04:18 - You cannot calculate 11 eigenvectors from a 5 x 5 matrix.
09:04:18 - Instead, calculating 5 eigenvectors.
09:04:18 - Calculating implied timescales at lagtime 8
09:04:18 - Selected component 72 with population 0.056490
Traceback (most recent call last):
  File "/home/tim/anaconda/bin/msmb", line 9, in <module>
    load_entry_point('msmbuilder==2.8.2', 'console_scripts', 'msmb')()
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/msmb.py", line 51, in entry_point
    getattr(scripts, args.subparser_name).entry_point()
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/CalculateImpliedTimescales.py", line 85, in entry_point
    (not args.notrim), args.symmetrize, args.procs)
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/CalculateImpliedTimescales.py", line 67, in run
    trimming=trimming, symmetrize=symmetrize, n_procs=nProc)
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/msm_analysis.py", line 237, in get_implied_timescales
    lags = result.get(999999)
  File "/home/tim/anaconda/lib/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
ValueError: k must be between 1 and ndim(A)-1
09:04:18 - BFGS likelihood maximization terminated after 13 function calls. Initial and final log likelihoods: -82.038017, -75.611886.
09:04:18 - You cannot calculate 11 eigenvectors from a 3 x 3 matrix.
09:04:18 - Instead, calculating 3 eigenvectors.

I have also tried clustering with "-d 0.2" and "-d 0.1", and both failed. I don't know why.

I noticed that when I ran the alanine dipeptide tutorial of msmbuilder 2.8, "msmb CalculateImpliedTimescales -l 1,25 -i 1 -o Data/ImpliedTimescales.dat" gave the following output:

07:17:18 - Calculating implied timescales at lagtime 24
07:17:18 - Selected component 0 with population 1.000000
07:17:19 - BFGS likelihood maximization terminated after 157 function calls. Initial and final log likelihoods: -163578.487091, -163577.602213.
07:17:19 - Calculating implied timescales at lagtime 25
07:17:19 - Selected component 0 with population 1.000000
07:17:20 - BFGS likelihood maximization terminated after 185 function calls. Initial and final log likelihoods: -163322.252035, -163321.354107.
07:17:20 - Abnormal termination of BFGS likelihood maximization. Error code 2
07:17:20 - Saved output to Data/ImpliedTimescales.dat

The tutorial runs successfully on my computer, so what is wrong with my system? Can anyone give me some suggestions?

P.S. My trajectory has no solvent and was saved every 5 ps for a total of 200 ns. How should I choose the minimum and maximum lag times? Is it always from 1 to 25?
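For reference, the numbers in the postscript can be converted with a few lines of arithmetic. This is only a sketch: it assumes the lag times passed with -l are counted in saved frames and that no stride was applied when the assignments were made. The variable and function names are made up for illustration.

```python
# Values taken from the post: one frame every 5 ps, 200 ns in total.
frame_spacing_ps = 5.0
total_length_ns = 200.0

# Number of saved frames in the trajectory (1 ns = 1000 ps).
n_frames = int(total_length_ns * 1000 / frame_spacing_ps)


def lag_in_ps(lag_frames, spacing_ps=frame_spacing_ps):
    """Convert a lag time in frames to picoseconds (assumes lag is in frames)."""
    return lag_frames * spacing_ps


print("frames:", n_frames)
print("lag range in ps:", lag_in_ps(1), "to", lag_in_ps(25))
```

Under these assumptions, "-l 1,25" scans lag times from 5 ps to 125 ps, which is very short compared with the 200 ns of data.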

mpharrigan commented 10 years ago

The distance cutoff in the clustering step is likely too strict. This means it is probably making way too many clusters and a really disconnected transition matrix.

09:04:18 - Selected component 72 with population 0.056490
Traceback (most recent call last):

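The "Selected component" line means that roughly 95% of your data is being thrown away: only the selected connected component, with a population of about 0.056, is kept. You're then left with too few states, and the sparse eigensolver can't get the eigenvalues.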

Instead of using the -d option, try specifying the number of clusters directly with the -k flag.
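To make the "population" figure concrete, here is a minimal sketch of how one could measure the fraction of transition counts that survives when only the largest strongly connected component is kept. It is not MSMBuilder's own code: the function name and the toy count matrix are hypothetical, and it assumes the population is measured as the share of counts leaving the kept states.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def largest_component_population(counts):
    """Return (kept state indices, fraction of counts kept).

    Hypothetical helper: keeps the strongly connected component that
    holds the largest share of the transition counts.
    """
    counts = csr_matrix(counts)
    n_components, labels = connected_components(
        counts, directed=True, connection="strong"
    )
    # Counts leaving each state, then summed per component.
    state_pop = np.asarray(counts.sum(axis=1)).ravel()
    comp_pop = np.bincount(labels, weights=state_pop, minlength=n_components)
    best = int(np.argmax(comp_pop))
    kept = np.flatnonzero(labels == best)
    return kept, comp_pop[best] / state_pop.sum()


# Toy 4-state count matrix: states {0, 1} and {2, 3} never exchange counts.
toy_counts = np.array([
    [8, 2, 0, 0],
    [2, 8, 0, 0],
    [0, 0, 40, 10],
    [0, 0, 10, 20],
])
kept_states, fraction_kept = largest_component_population(toy_counts)
print("kept states:", kept_states, "fraction kept:", fraction_kept)
```

A kept fraction near 1.0, as in the tutorial log, means almost nothing is trimmed; a value like 0.056 means most states and most of the data are discarded before the eigenvalues are computed.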

Serilin commented 10 years ago

Mpharrigan, thanks for your answer. Following your suggestion, we tried producing 200, 100 and 50 clusters with the -k flag, and clustering ran without problems in each case. However, the same error still appears when running "CalculateImpliedTimescales -l 1,25 -i 1 -o Data/ImpliedTimescales.dat".

{'assignments': 'Data/Assignments.h5', 'eigvals': 10, 'interval': 1, 'lagtime': '1,25', 'notrim': False, 'output': 'Data/ImpliedTimescales.dat', 'procs': 1, 'quiet': False, 'symmetrize': 'MLE'}
01:24:02 - Getting 10 eigenvalues (timescales) for each lagtime...
01:24:02 - Building MSMs at the following lag times: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
01:24:02 - Calculating implied timescales at lagtime 1
01:24:02 - Selected component 0 with population 0.700740
01:24:02 - BFGS likelihood maximization terminated after 24 function calls. Initial and final log likelihoods: -347.627745, -347.247757.
01:24:02 - You cannot calculate 11 eigenvectors from a 6 x 6 matrix.
01:24:02 - Instead, calculating 6 eigenvectors.
01:24:02 - Calculating implied timescales at lagtime 8
01:24:02 - Selected component 0 with population 0.701723
Traceback (most recent call last):
  File "/home/tim/anaconda/bin/msmb", line 9, in <module>
    load_entry_point('msmbuilder==2.8.2', 'console_scripts', 'msmb')()
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/msmb.py", line 51, in entry_point
    getattr(scripts, args.subparser_name).entry_point()
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/CalculateImpliedTimescales.py", line 85, in entry_point
    (not args.notrim), args.symmetrize, args.procs)
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/scripts/CalculateImpliedTimescales.py", line 67, in run
    trimming=trimming, symmetrize=symmetrize, n_procs=nProc)
  File "/home/tim/anaconda/lib/python2.7/site-packages/msmbuilder-2.8.2-py2.7.egg/msmbuilder/msm_analysis.py", line 237, in get_implied_timescales
    lags = result.get(999999)
  File "/home/tim/anaconda/lib/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
ValueError: k must be between 1 and ndim(A)-1

It always reports "k must be between 1 and ndim(A)-1", and I'm confused.

Seriline

mpharrigan commented 10 years ago

Hi Seriline, it still looks like you don't end up with enough states:

01:24:02 - Calculating implied timescales at lagtime 1
01:24:02 - Selected component 0 with population 0.700740
01:24:02 - You cannot calculate 11 eigenvectors from a 6 x 6 matrix.
01:24:02 - Instead, calculating 6 eigenvectors.

At lagtime=1, you're trimming 30% of your data (which is not great, but doesn't necessarily lead to a problem), but after the trimming step you are left with only 6 states. It crashes at lagtime=8, which means that at that lag time you are probably left with about 3 states or fewer.
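One way to check this before running the full calculation is to count how many states survive trimming at each lag time. The sketch below is an illustration under stated assumptions, not MSMBuilder's implementation: it uses a sliding-window count matrix from a single 1-D array of state assignments, keeps the strongly connected component with the most counts, and the function names and toy trajectory are hypothetical.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components


def count_matrix(assignments, lag, n_states):
    """Sliding-window transition counts at the given lag (in frames)."""
    start = assignments[:-lag]
    end = assignments[lag:]
    data = np.ones(len(start))
    # Duplicate (start, end) pairs are summed when converting to CSR.
    return coo_matrix((data, (start, end)), shape=(n_states, n_states)).tocsr()


def states_after_trimming(assignments, lag, n_states):
    """Number of states in the most populated strongly connected component."""
    counts = count_matrix(assignments, lag, n_states)
    n_components, labels = connected_components(
        counts, directed=True, connection="strong"
    )
    state_pop = np.asarray(counts.sum(axis=1)).ravel()
    comp_pop = np.bincount(labels, weights=state_pop, minlength=n_components)
    return int(np.sum(labels == np.argmax(comp_pop)))


# Toy single trajectory that drifts from states {0, 1} to {2, 3}
# and never returns, so the two groups are not mutually connected.
toy_assignments = np.array([0, 1, 0, 1, 2, 3, 2, 3, 2, 3])
requested = 11  # the log above asks for 11 eigenvectors

for lag in (1, 2, 3):
    left = states_after_trimming(toy_assignments, lag, n_states=4)
    note = "fewer than requested" if left < requested else "ok"
    print("lag", lag, "states left:", left, "(%s)" % note)
```

If the number of states left drops to a handful as the lag time grows, the eigenvalue step has too small a matrix to work with, which matches the behaviour in the logs.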

This could be because all of your sampling is concentrated in a small region of conformational space. Perhaps @rmcgibbo or @schwancr could give their opinion, but it looks like you just need more sampling.

Matt

Serilin commented 10 years ago

Matt, I appreciate your quick reply. Actually, the 200 ns of data is a single long trajectory; that is, it all starts from one original conformation. Does this mean that the trajectory isn't suitable for MSM analysis? If it isn't, is there any way to increase the sampling?

Seriline