markovmodel / msmtools

Tools for estimating and analyzing Markov state models
GNU Lesser General Public License v3.0

Fix nonconverged transition matrix #74

Closed fabian-paul closed 8 years ago

fabian-paul commented 8 years ago

This is a sequel to PR #73. It adds a function (due to Hao) that "fixes" almost-converged transition matrices. Detailed balance with respect to the matrix's old stationary distribution is preserved. Please tell me what you think.

fabian-paul commented 8 years ago

Hi folks, actually I was hoping for a bit of discussion before this was merged. Is everyone ok with "fixing" almost-converged transition matrices?

franknoe commented 8 years ago

You are right. I have completely overlooked this. I'm not really sure what you are doing here and why this is needed. In particular:

fabian-paul commented 8 years ago

First of all, this PR should contain only commit 745d308a140bf9cc4556de1724f0fdb53be59cbd; the rest should already have been merged.

1) Fixing the row normalization does what it says: it ensures that sum_j p_ij = 1 for all i. In PyEMMA, the user has the option to limit the number of iterations of the reversible estimator and still work with the non-converged matrix. When the matrix is not converged, it is usually not row-stochastic. This leads to problems, such as follow-up warnings in PyEMMA; also, the stationary distribution of a non-converged matrix sometimes can't be estimated. This is where the option of "fixing" the row normalization comes in.

2) I added the reversible option for future use, because I suspect that there might be many ways to fix a transition matrix. The algorithm implemented here assumes that the off-diagonal elements of the non-converged transition matrix already fulfill detailed balance with respect to the correct stationary distribution. Therefore only the diagonal elements are corrected, and the matrix is multiplied by a global scalar. This is in line with how the reversible estimator works: as a by-product it gives an estimate of the stationary distribution. The T matrix is fixed in such a way that its stationary distribution coincides with the one from the reversible estimator. Of course, the assumption that the stationary distribution is roughly correct may be wrong, and this algorithm doesn't make much sense when the T matrix doesn't come from the reversible estimator. So the reversible parameter leaves room for other algorithms.

3) Do you mean something like computing T_ij = (X_ij + X_ji) / sum_j (X_ij + X_ji)? That would be another option, but this matrix wouldn't have the stationary distribution from the reversible estimator, so one would have to fix the pair (T, pi).
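To make point 2 concrete, here is an illustrative sketch of a diagonal-plus-global-scalar correction of the kind described above. This is not the actual PR code, and the function name `fix_transition_matrix` is hypothetical; it only assumes, as stated, that the off-diagonal elements already satisfy detailed balance with respect to the (unknown) stationary distribution.

```python
import numpy as np

def fix_transition_matrix(T):
    """Sketch: make a nearly-converged reversible T row-stochastic.

    Assumption: the off-diagonal elements of T already fulfill detailed
    balance with respect to the correct stationary distribution pi.
    Multiplying all off-diagonals by one global scalar preserves that
    property, and the diagonal is then free to absorb the remaining
    row mass. If rows of the fixed matrix sum to 1, pi stays stationary:
    (pi T)_j = pi_j * (T_jj + sum_{i != j} T_ji) = pi_j.
    """
    T = np.asarray(T, dtype=float).copy()
    np.fill_diagonal(T, 0.0)                 # keep only the off-diagonal part
    off_row_sums = T.sum(axis=1)             # assumed strictly positive
    scale = min(1.0, 1.0 / off_row_sums.max())  # global scalar, keeps diagonal >= 0
    T *= scale
    np.fill_diagonal(T, 1.0 - T.sum(axis=1))    # restore row-stochasticity
    return T

# illustrative non-converged matrix (rows do not quite sum to 1)
T0 = np.array([[0.85, 0.10, 0.02],
               [0.20, 0.70, 0.05],
               [0.01, 0.10, 0.95]])
T_fixed = fix_transition_matrix(T0)
```

After the fix, every row of `T_fixed` sums to 1 and all entries are non-negative, while the ratios of off-diagonal elements (and hence detailed balance with respect to the old pi) are unchanged.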

franknoe commented 8 years ago

1) I don't understand how it could be non-row-stochastic as a result of the number of iterations. There are two options. In the dense estimator you iterate X and then get T by direct row-normalization:

xsumnew = np.sum(X, axis=1)
...
xsum = xsumnew
...
T = correct_transition_matrix(X / xsum[:, np.newaxis])

which is equivalent to

T = correct_transition_matrix(X / X.sum(axis=1)[:, None])

So how could that be further corrected, except in pathological cases where X.sum(axis=1) is zero?
The second option is to iterate pi and then insert it into the MLE equation for P given pi. At this point one could get a matrix that is distinctly non-row-stochastic due to numerical errors, but this would be independent of the number of iterations used to obtain the estimate for pi.

2) OK, understood.

3) I guess this is related to 1. If you work with X and not with pi, isn't row-normalization alone the same as "fixing"?
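For reference, the alternative construction T_ij = (X_ij + X_ji) / sum_j (X_ij + X_ji) mentioned earlier in the thread can be sketched as follows (illustrative values, not code from the PR). Symmetrizing the counts guarantees a reversible T whose stationary distribution is proportional to the row sums of the symmetrized matrix — which, as noted, need not coincide with the pi from the reversible estimator:

```python
import numpy as np

# illustrative count matrix
X = np.array([[5.0, 1.0, 0.5],
              [2.0, 6.0, 1.5],
              [1.0, 0.5, 4.0]])

S = X + X.T                         # symmetrized counts
T = S / S.sum(axis=1)[:, None]      # T_ij = (X_ij + X_ji) / sum_j (X_ij + X_ji)
pi = S.sum(axis=1) / S.sum()        # stationary distribution of this T

# detailed balance holds exactly: pi_i T_ij = S_ij / S_total = pi_j T_ji
flux = pi[:, None] * T
assert np.allclose(flux, flux.T)
```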