pyemma.msm.MSM with disconnected input

cwehmeyer commented 9 years ago

Currently, pyemma.msm.MSM(P) does not support a disconnected transition matrix P as input. Enabling this feature would be tremendously helpful for the TRAM-related classes where we have multiple transition matrices which might not all share the same connectivity.

Is there any reason to not support pyemma.msm.MSM(P) with disconnected P?

franknoe commented 9 years ago

Let me think about this a bit. Supporting disconnected state spaces has always been an aim for the MSM estimator classes; we just haven't had time to do it. I see the TRAM requirement, but if we support disconnected matrices, it has to work for all applications, in particular when we have only one thermodynamic state (standard MSM). Currently, many of the functions in MSM assume connectivity and a unique stationary distribution. So building this feature in will require either a lot of tinkering in many functions or a general way of handling this situation. Let me give this some thought.
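
To illustrate why a unique stationary distribution cannot be assumed for disconnected input, here is a minimal NumPy sketch. The matrix `P` and the vectors `pi_a`, `pi_b` are made-up illustration values, not taken from PyEMMA. A block-diagonal transition matrix has the eigenvalue 1 once per block, and any mixture of the per-block stationary vectors is again stationary.

```python
import numpy as np

# Hypothetical block-diagonal transition matrix:
# states {0, 1} and states {2, 3} never exchange probability.
P = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.4, 0.6],
])

# Left eigenvalues of P are the eigenvalues of P.T.
eigenvalues = np.linalg.eigvals(P.T)
n_unit = int(np.sum(np.isclose(eigenvalues, 1.0)))

# One stationary vector per block, padded with zeros elsewhere.
pi_a = np.array([2 / 3, 1 / 3, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 4 / 7, 3 / 7])

# Any convex combination of the two is stationary as well.
mix = 0.25 * pi_a + 0.75 * pi_b

print("eigenvalues equal to 1:", n_unit)
print("mixture is stationary:", np.allclose(mix @ P, mix))
```

Any method that relies on "the" stationary distribution would therefore need an extra rule for choosing among these solutions.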

— Reply to this email directly or view it on GitHub https://github.com/markovmodel/PyEMMA/issues/571.

Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354 Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

cwehmeyer commented 9 years ago

We don't need to keep a disconnected transition matrix within an MSM object. What I have in mind is this: the MSM constructor runs msmtools.estimation.largest_connected_set(), stores the result as self.active_set, and stores the result of msmtools.estimation.largest_connected_submatrix() as the new transition matrix.

That way, we do not need to modify any methods of the MSM class, but we keep the information on the active set where we need it when using multistate models.
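
A rough sketch of that proposal, written with NumPy only so it runs without msmtools. The class name `ConnectedMSM` and the helper `largest_connected_set` below are hypothetical stand-ins, not PyEMMA or msmtools code. The sketch also assumes `P` is block-diagonal, so that rows of the restricted matrix still sum to one; input where probability leaves the active set would need additional handling.

```python
import numpy as np


def largest_connected_set(P):
    """Indices of the largest strongly connected set (NumPy-only stand-in)."""
    n = P.shape[0]
    # reach[i, j] is True once j is known to be reachable from i.
    reach = (np.asarray(P) > 0) | np.eye(n, dtype=bool)
    while True:
        r = reach.astype(int)
        new = (r @ r) > 0  # doubles the path length covered so far
        if np.array_equal(new, reach):
            break
        reach = new
    # i and j are in the same set if each can reach the other.
    mutual = reach & reach.T
    best = int(np.argmax(mutual.sum(axis=1)))
    return np.flatnonzero(mutual[best])


class ConnectedMSM:
    """Hypothetical MSM-like container that restricts P on construction."""

    def __init__(self, P):
        P = np.asarray(P, dtype=float)
        # Remember which original states survive the restriction.
        self.active_set = largest_connected_set(P)
        # Keep only the connected block as the transition matrix.
        self.P = P[np.ix_(self.active_set, self.active_set)]


# State 2 is isolated from states 0 and 1.
P_full = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.0],
    [0.0, 0.0, 1.0],
])
msm = ConnectedMSM(P_full)
print(msm.active_set)
print(msm.P)
```

All downstream methods would then see only the connected block, while `active_set` maps its rows back to the original state indices.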

franknoe commented 9 years ago

OK. That may work, but I need to think about whether this will have some undesired side-effects.

franknoe commented 9 years ago

Actually... isn't that what already happens? If I understand you correctly, current MSM estimators such as MaximumLikelihoodMSM do exactly this: they compute the largest connected set, estimate the transition matrix on this set, and construct an MSM with it.

So if you wanna handle it this way in TRAM, this is a job for the TRAM estimator, and you just have to store MSMs on different connected subsets for the individual thermodynamic states. I'm still not sure if that's the way to go, because there is additional equilibrium information available for the nonconnected states.
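
The bookkeeping described here could look roughly like the following sketch, again NumPy-only. The helpers `largest_connected_set` and `estimate_on_active_set` and the count matrices are hypothetical. Plain row normalization is used as the simplest possible estimator; PyEMMA's estimators may use a different (for example reversible) one. The sketch assumes every active state has at least one outgoing count inside the active set.

```python
import numpy as np


def largest_connected_set(C):
    """Indices of the largest strongly connected set (NumPy-only stand-in)."""
    n = C.shape[0]
    reach = (np.asarray(C) > 0) | np.eye(n, dtype=bool)
    while True:
        r = reach.astype(int)
        new = (r @ r) > 0
        if np.array_equal(new, reach):
            break
        reach = new
    mutual = reach & reach.T
    best = int(np.argmax(mutual.sum(axis=1)))
    return np.flatnonzero(mutual[best])


def estimate_on_active_set(C):
    """Return (active_set, P) for one thermodynamic state."""
    C = np.asarray(C, dtype=float)
    active = largest_connected_set(C)
    C_active = C[np.ix_(active, active)]
    # Row-normalize counts; assumes no empty rows within the active set.
    P = C_active / C_active.sum(axis=1, keepdims=True)
    return active, P


# Made-up count matrices for two thermodynamic states.
C_state0 = np.array([[90, 10, 0], [10, 90, 0], [0, 0, 50]])   # state 2 cut off
C_state1 = np.array([[50, 20, 5], [20, 40, 10], [5, 10, 30]])  # fully connected

models = [estimate_on_active_set(C) for C in (C_state0, C_state1)]
for k, (active, P) in enumerate(models):
    print("thermodynamic state", k, "active set:", active)
```

Each thermodynamic state then carries its own active set, which is the information a multistate estimator would need to relate the individual models to each other.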
