Open MamtaMohan opened 7 years ago
This has been on our lists, but we hadn't had the time to finish it yet. @fabian-paul has implemented rate matrix estimator using the Kalbfleisch-Lawless method in a pull request here: https://github.com/markovmodel/msmtools/pull/57 But I'm not sure what the status is, Fabian can you comment? Maybe having the estimator in msmtools would already help you, Mamta? My feeling is that this would be easy to do.
I think it would also be relatively straightforward to wire it into PyEMMA - we'd need to implement a Model class (MasterEquationModel or ContinuousTimeMSM) that is very similar to the current MSM class but has a few changes with respect to how the eigenvalues are handled. I think we could re-use all MSM tests. The estimator would be similar to MaximumLikelihoodMSM and would just call the msmtools estimator. Probably much of the code in MSM and MaximumLikelihoodMSM can be generalized to be used in both cases. My guess is that this is a few days of coding for a good PyEMMA developer. Fabian, have you already done any of that or did you exclusively work with the low-level method?
Am 01/10/16 um 20:45 schrieb MamtaMohan:
Hi,
Is it possible to generate continuous time MSM with pyEMMA.
Mamta
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/markovmodel/PyEMMA/issues/951, or mute the thread https://github.com/notifications/unsubscribe-auth/AGMeQiQzYIDN25SJtosp4Urq4kIe3AOGks5qvwyigaJpZM4KL6L3.
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
Thank you.
Appreciate your response.
I will give a try.
Mamta
Hi Mamta, hi Frank,
the continuous time estimator that is implemented in the pull request markovmodel/msmtools#57 is functional. To use the estimator as quick as possible, you could replace your version of msmtools with the version from my branch: https://github.com/fabian-paul/msmtools/tree/ratematrix This version is quite up-to-date with the development branch.
The rate-matrix estimator is available after importing msmtools as
msmtools.estimation.estimate_rate_matrix
It is not integrated with Pyemma yet, estimate_rate_matrix takes a (numpy) count matrix as input and returns a (numpy) rate-matrix.
Integrating this in Pyemma will require a bit of work, as we would have to provide the implied time-scale plot, the Chapman-Kolmogorov test and Transition Path Theory expressed with the rate matrix as well. I'm would be pleased if we could add these features. However I'm quite busy right now so I would prefer to implement only the functionality that is needed right now.
I vigorously recommend that you carry out the Chapman-Kolmogorov-test with the estimated rate matrix because is it not guaranteed a priori that a continuous time MSM even exists for your data.
So I think it would be a sensible goal to implement everything in Pyemma that is needed for a continuous time MSM CK-test.
Mamta, to you need any specific functionality for the continuous time MSM?
Best, Fabian
most of these additional features (Implied Timescales and CKTest) would come for free if an Estimator and Model class are implemented, so I think it's less work than you suggest, but of course I see you can't do it right now.
Can we merge the PR in msmtools to have the basic functionality and discuss the rest in Berlin?
Am 03/10/16 um 04:19 schrieb fabian-paul:
Hi Mamta, hi Frank,
the continuous time estimator that is implemented in the pull request markovmodel/msmtools#57 https://github.com/markovmodel/msmtools/pull/57 is functional. To use the estimator as quick as possible, you could replace your version of msmtools with the version from my branch: https://github.com/fabian-paul/msmtools/tree/ratematrix This version is quite up-to-date with the development branch.
The rate-matrix estimator is available after importing msmtools as |msmtools.estimation.estimate_rate_matrix| It is not integrated with Pyemma yet, estimate_rate_matrix takes a (numpy) count matrix as input and returns a (numpy) rate-matrix. Integrating this in Pyemma will require a bit of work, as we would have to provide the implied time-scale plot, the Chapman-Kolmogorov test and Transition Path Theory expressed with the rate matrix as well. I'm would be pleased if we could add these features. However I'm quite busy right now so I would prefer to implement only the functionality that is needed right now. I vigorously recommend that you carry out the Chapman-Kolmogov-test with the estimated rate matrix because is it not guaranteed a priori that a continuous time MSM even exists for your data. So I think it would be a sensible goal to implement everything in Pyemma that is needed for a continuous time MSM CK-test. Mamta, to you need any specific functionality for the continuous time MSM?
Best, Fabian
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/markovmodel/PyEMMA/issues/951#issuecomment-251061756, or mute the thread https://github.com/notifications/unsubscribe-auth/AGMeQhLmSuhZ98cV9AhjmCAEYo2f7QwAks5qwMi7gaJpZM4KL6L3.
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
I see. If we implement it as a Model with .timescales() and .propagate() we get much for free.
Yes the PR is ready to merge. It was on halt for two reasons:
I agree. Let's go ahead, we can always swap out code if useful, but for now it's important to have the functionality.
Am 03/10/16 um 14:26 schrieb fabian-paul:
I see. If we implement it as a Model with .timescales() and .propagate() we get much for free.
Yes the PR is ready to merge. It was on halt for two reasons:
- We were waiting for my PR on scipy to make it into the next release. That has happened and I have removed all duplicate code that now is in scipy from the PR.
- We wanted to clarify whether my or Robert's approach works better. We didn't sort this out yet. But I don't think this should stop us from merging. If Robert's method turns out the be better, we can change that later.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/markovmodel/PyEMMA/issues/951#issuecomment-251201373, or mute the thread https://github.com/notifications/unsubscribe-auth/AGMeQvShWgiDGHWoj_G7EC9zf7x8wA2Pks5qwVbNgaJpZM4KL6L3.
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
Matma, the msmtools code is merged. You can find the low-level functionality in msmtools.estimation.rate_matrix
if you install from github. If you want to use the released version, wait a few days until the new msmtools release is cut.
@MamtaMohan you can now use msmtools-1.2 for rate matrix estimation
Thank you Frank.
I will wait few days.
Mamta
From: Frank Noe notifications@github.com Sent: Friday, October 21, 2016 1:24:29 PM To: markovmodel/PyEMMA Cc: Mohan, Mamta; Author Subject: Re: [markovmodel/PyEMMA] Continuous time MSM (#951)
Matma, the msmtools code is merged. You can find the low-level functionality in msmtools.estimation.rate_matrix if you install from github. If you want to use the released version, wait a few days until the new msmtools release is cut.
You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/markovmodel/PyEMMA/issues/951#issuecomment-255429878, or mute the threadhttps://github.com/notifications/unsubscribe-auth/APcc_Yc_ztnFxcOdhhqRSFaXdHwGnTCPks5q2PVNgaJpZM4KL6L3.
the binaries the released version are also available ;)
just do a conda update msmtools
Thank you Martin.
I was waiting for conda release.
Appreciate it.
Mamta
From: Martin K. Scherer notifications@github.com Sent: Tuesday, October 25, 2016 3:01:19 PM To: markovmodel/PyEMMA Cc: Mohan, Mamta; Mention Subject: Re: [markovmodel/PyEMMA] Continuous time MSM (#951)
the binaries the released version are also available ;)
just do a conda update msmtools
You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/markovmodel/PyEMMA/issues/951#issuecomment-256142554, or mute the threadhttps://github.com/notifications/unsubscribe-auth/APcc_VyrADVjBXMD7eEYHR1lg5rxNMpuks5q3lH_gaJpZM4KL6L3.
Dear Fabian,
I was caught up with things so writing to after a delay.
I am new to lot of things so please be patient with me.
I just want basic Continuous Markov Model, functionality for it there now.
Please correct me where ever you think I am off.
To make Continuous Markov Model, Using PentaPeptide test files and tutorial and following script I build the model up to the point to calculate Implied timescales Bayesian Markov state model estimator, which give estimation of msm-lag time. Then discrete Markov model is build by call: M = msm.estimate_markov_model(dtrajs, msm_lag) I understand here you would like me to use new rate matrix, first by importing msmtools.
But Isn't msmtools is already loaded with PyEmma?
As per you suggested : The rate-matrix estimator is available after importing msmtools as msmtools.estimation.estimate_rate_matrix
I tried [83]: import msmtools as msmtools.estimation.estimate_rate_matrix
File "
import msmtools as MT
AttributeError Traceback (most recent call last)
Hi,
you may need to update msmtools: conda install msmtools
then your import statement should work as this: from msmtools.estimation import estimate_rate_matrix and use as estimate_rate_matrix(args)
or import msmtools and use as msmtools.estimation import estimate_rate_matrix()
Am 16/11/16 um 02:50 schrieb MamtaMohan:
import msmtools as msmtools.estimation.estimate_rate_matrix
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
Dear Frank,
I will try your suggestion again as you mentioned.
I updated PyEMMA before I started via conda update and following packages were updated:
package | build |
---|---|
progress_reporter-1.2 | py_1 14 KB acellera |
conda-4.2.12 | py27_0 373 KB |
msmtools-1.2 | np111py27_0 1.2 MB acellera |
pyemma-2.2.7 | np111py27_0 1.3 MB acellera |
------------------------------------------------------------
Total: 2.8 MB
The following NEW packages will be INSTALLED:
progress_reporter: 1.2-py_1 acellera
The following packages will be UPDATED:
conda: 4.2.7-py27_0 --> 4.2.12-py27_0
msmtools: 1.1.3-np111py27_0 http://conda.binstar.org/omnia --> 1.2-np111py27_0 acellera
pyemma: 2.2.3-np111py27_0 http://conda.binstar.org/omnia --> 2.2.7-np111py27_0 acellera
then I guess you should have an msmtools that includes the feature
Am 16/11/16 um 13:43 schrieb MamtaMohan:
Dear Frank,
I will try your suggestion again as you mentioned.
I updated PyEMMA before I started via conda update and following packages were updated:
package build progress_reporter-1.2 py_1 14 KB acellera conda-4.2.12 py27_0 373 KB msmtools-1.2 np111py27_0 1.2 MB acellera pyemma-2.2.7 np111py27_0 1.3 MB acellera
|------------------------------------------------------------ Total: 2.8 MB |
The following NEW packages will be INSTALLED:
|progress_reporter: 1.2-py_1 acellera |
The following packages will be UPDATED:
|conda: 4.2.7-py27_0 --> 4.2.12-py27_0 msmtools: 1.1.3-np111py27_0 http://conda.binstar.org/omnia --> 1.2-np111py27_0 acellera pyemma: 2.2.3-np111py27_0 http://conda.binstar.org/omnia --> 2.2.7-np111py27_0 acellera |
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/markovmodel/PyEMMA/issues/951#issuecomment-260937178, or mute the thread https://github.com/notifications/unsubscribe-auth/AGMeQqZBb7283L1ntQ7A8X-FQ4KRPiUEks5q-vp3gaJpZM4KL6L3.
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
Hi Mamta,
The comments above are a bit out-dated. We renamed the method in msmtools to msmtools.estimation.rate_matrix
.
Could you give that a try?
Also msmtools doens't work with dtrajs directly (like pyemma). You first have to estimate a count matrix with msmtools.estimation.count_matrix
and restrict that to its connected set with msmtools.estimation.largest_connected_submatrix
. That submatrix has to be given to msmtools.estimation.rate_matrix
as the first argument.
Best, Fabian
ok.
Let me give it a try and let me get back to you.
On 11/16/2016 03:46 AM, Frank Noe wrote:
Hi,
you may need to update msmtools: conda install msmtools
then your import statement should work as this: from msmtools.estimation import estimate_rate_matrix The function is just called rate_matrix, in analogy to the function transition_matrix.
To easily get a correct (connected count matrix) you can first estimate an msm via pyemma and re-use the count_matrix_active property:
msmtools.estimation.rate_matrix(msm.count_matrix_active)
Yes, correct.
Note that estimation of the rate matrix on a fine clustering that MSMs are built upon is very expensive and not necessarily very meaningful. It's a better idea to use a HMM and then do rate matrix estimation on the resulting macroscopic count matrix.
Am 17/11/16 um 13:40 schrieb Martin K. Scherer:
On 11/16/2016 03:46 AM, Frank Noe wrote:
Hi,
you may need to update msmtools: conda install msmtools
then your import statement should work as this: from msmtools.estimation import estimate_rate_matrix The function is just called rate_matrix, in analogy to the function transition_matrix.
To easily get a correct (connected count matrix) you can first estimate an msm via pyemma and re-use the count_matrix_active property:
msmtools.estimation.rate_matrix(msm.count_matrix_active)
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/markovmodel/PyEMMA/issues/951#issuecomment-261236995, or mute the thread https://github.com/notifications/unsubscribe-auth/AGMeQito0es5i0knR86mO07yvLGWEQlHks5q_EtNgaJpZM4KL6L3.
Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin
Phone: (+49) (0)30 838 75354 Web: research.franknoe.de
Thank you.
Hopefully I will be able to get this today or tomorrow and I will get back to you.
Sorry for the delay.
This is what I understand from the discussion:
For Continuous Markov Model: After I build discrete Markov Model via PyEMMA I should first build HMM as Frank suggested. I did make HMM for 4 states for Penta-peptide but I realized As Martim suggested for rate_matrix I imported msmtools as: import msmtools as msmtools I checked Markov model which is in M. M.count_matrix_active give me result: In [126]: M.count_matrix_active Out[126]: array([[ 6.91231723, 0. , 0. , ..., 1.97494778,
Just listing part of output. In [129]: M.count_matrix_active.shape Out[129]: (250, 250) Finally to get rate_matrix: In [130]: MMC = msmtools.estimation.rate_matrix(M.count_matrix_active) ...... AssertionError: Commands ends with AssertionError. I think it is this function: def function_and_gradient(self, x): 343 if self.sparsity is None: --> 344 assert np.all(x >= 0) 345 else: 346 assert np.all(x > 0)
If you can help me fix this I would appreciate it.
Hi Mamta,
apologies for the long delay. I had some time to dig into this error only now. I created a pull request with a tentative fix on https://github.com/markovmodel/msmtools/pull/98
There seems to be some internal problem of the l_bfgs_b routine as it apparently can violate the constraints that are imposed during the minimization of the likelihood.
The fix would be to install a new version of msmtools from github once the pull request is merged and call rate_matrix
with the option on_error='warn'
. This will turn the error that you get into a warning. Hopefully then the l_bfgs_b minimizer will recover from overshooting the bounds in a subsequent iteration, s. t. the result will be ok.
I know that this might not be the solution you expect. However there are still some numerical difficulties with the maximum-likelihood rate matrix estimator. The gradient of the likelihood requires an element-wise division by the transition matrix which leads to a bad condition. In a previous commit, I had already implemented the ad-hoc fix that was proposed by Pande and McGibbon for this problem in DOI:10.1063/1.4926516 This might already solve the problem.
Fabian
Dear Fabian,
Thank you for your response.
I will try the fix.
If you want I can wait a bit.
Mamta
From: fabian-paul notifications@github.com Sent: Monday, January 02, 2017 1:01:10 PM To: markovmodel/PyEMMA Cc: Mohan, Mamta; Mention Subject: Re: [markovmodel/PyEMMA] Continuous time MSM (#951)
Hi Mamta,
apologies for the long delay. I had some time to dig into this error only now. I created a pull request with a tentative fix on markovmodel/msmtools#98https://github.com/markovmodel/msmtools/pull/98 There seems to be some internal problem of the l_bfgs_b routine as it apparently can violate the constraints that are imposed during the minimization of the likelihood. The fix would be to install a new version of msmtools from github once the pull request is merged and call rate_matrix with the option on_error='warn'. This will turn the error that you get into a warning. Hopefully then the l_bfgs_b minimizer will recover from overshooting the bounds in a subsequent iteration, s. t. the result will be ok. I know that this might not be the solution you expect. However there are still some numerical difficulties with the maximum-likelihood rate matrix estimator. The gradient of the likelihood requires an element-wise division by the transition matrix which leads to a bad condition. In a previous commit, I had already implemented the ad-hoc fix that was proposed by Pande and McGibbon for this problem in DOI:10.1063/1.4926516 This might already solve the problem.
Fabian
- You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/markovmodel/PyEMMA/issues/951#issuecomment-270003023, or mute the threadhttps://github.com/notifications/unsubscribe-auth/APcc_amL0pwjv89JitiSJqpoipU-4QVSks5rOTtmgaJpZM4KL6L3.
Hi,
Is it possible to generate continuous time MSM with pyEMMA.
Mamta