markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

Transition Matrix Singular #1458

Closed: toth12 closed this issue 4 years ago

toth12 commented 4 years ago

My transition matrix is singular. When I pass it to msm.markov_model() I can create the model, but when I then call the msm.tpt() function it raises the error "numpy.linalg.LinAlgError: Matrix is singular."

Is there any workaround here? Is it ok that my transition matrix is singular or did I make a mistake when constructing the transition matrix? Thanks!

thempel commented 4 years ago

There are some mathematical conditions which need to be fulfilled for transition matrices of Markov processes that are handled in PyEMMA. How do you construct the transition matrix? I'd be surprised if the singularity of your transition matrix was indeed a feature of the underlying physics, at least if we are talking about "standard" MD data.

toth12 commented 4 years ago

@thempel Thanks for the answer; I use PyEMMA for a project not related to physics (I use Markov chains to study survivor testimonies from Auschwitz-Birkenau). As to the singularity of the transition matrix, I found that once I remove those states from which or to which the transition probability is zero, the transition matrix is no longer singular. Is this an OK solution to the singularity problem? By the way, I would be happy to get in touch with the PyEMMA developers to discuss how methods from physics could be integrated into the study of narrative aspects of textual data (i.e. not NLP); I have already used your pathway decomposition, which gave very good results.
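The removal step described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the matrix `T` is made up, and `drop_never_entered_states` is a hypothetical helper, not a PyEMMA function. It handles one cause of singularity (a state that is never entered has an all-zero column) and does not guarantee that the remaining matrix is non-singular in general.

```python
import numpy as np

def drop_never_entered_states(T):
    """Hypothetical helper: iteratively remove states whose column is all zeros.

    A state that is never entered contributes a zero column, which makes the
    matrix singular. Because no kept row puts probability on such a state,
    deleting it leaves the remaining rows normalized.
    """
    T = np.asarray(T, dtype=float)
    keep = np.arange(T.shape[0])
    while True:
        sub = T[np.ix_(keep, keep)]
        entered = sub.sum(axis=0) > 0
        if entered.all():
            return sub, keep
        keep = keep[entered]

# Made-up example: state 2 can be left but is never entered.
T = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.8, 0.0],
              [0.5, 0.5, 0.0]])

T_sub, kept = drop_never_entered_states(T)

rank_before = np.linalg.matrix_rank(T)      # less than 3, so T is singular
rank_after = np.linalg.matrix_rank(T_sub)   # equals T_sub.shape[0] here
```

The returned `kept` array maps rows of the reduced matrix back to the original state indices, which is needed when choosing the source and target sets for TPT afterwards.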

thempel commented 4 years ago

That sounds like an exciting project! Sure, just send me an email.

One of the most important conditions on the transition matrix for standard MSMs is that it is reversibly connected, i.e. that for every state in the transition matrix there must be at least one transition out of it and into it (more precisely, every state must be reachable from every other state). As you described, you can remove the states that are not connected from your data and analyze the transition matrix on the largest connected subset of states. In fact, that is done automatically if you estimate your transition matrix using PyEMMA. If you need them, there are functions in msmtools.estimation for creating such a matrix from a disconnected one.
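As a rough sketch of what restricting to the largest connected subset means, the following NumPy-only example finds the largest set of states that can all reach each other and cuts the matrix down to it. The function names and the example matrix are assumptions made for illustration. msmtools.estimation provides its own routines for this (for example `largest_connected_set`), which normally operate on count matrices before estimation; check its documentation for the exact signatures rather than relying on this sketch. Renormalizing rows of an already estimated transition matrix, as done here, is a simplification.

```python
import numpy as np

def largest_strongly_connected_set(T):
    """Return indices of the largest set of mutually reachable states."""
    T = np.asarray(T)
    n = T.shape[0]
    # reach[i, j] is True if j is reachable from i in zero or more steps.
    reach = (T > 0) | np.eye(n, dtype=bool)
    # Repeated squaring doubles the covered path length each iteration.
    for _ in range(int(np.ceil(np.log2(max(n, 2))))):
        r = reach.astype(int)
        reach = (r @ r) > 0
    # i and j are in the same set if each can reach the other.
    mutual = reach & reach.T
    sizes = mutual.sum(axis=1)
    return np.flatnonzero(mutual[np.argmax(sizes)])

def restrict_to_states(T, states):
    """Cut T down to `states` and renormalize rows.

    Assumes every kept state has at least one transition inside the set;
    a single isolated state without a self-transition would divide by zero.
    """
    sub = np.asarray(T, dtype=float)[np.ix_(states, states)]
    return sub / sub.sum(axis=1, keepdims=True)

# Made-up example: states 0 and 1 exchange probability, state 2 is only
# ever left, and state 3 is absorbing.
T = np.array([[0.5, 0.4, 0.0, 0.1],
              [0.3, 0.7, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

states = largest_strongly_connected_set(T)
T_sub = restrict_to_states(T, states)
```

On the restricted matrix every state can be left and re-entered, which is the property described above. If two sets have the same size, this sketch simply returns the one containing the lowest-numbered state.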

toth12 commented 4 years ago

@thempel Many thanks. Please close this issue; I will send you an email in the next few days.