Closed rmcgibbo closed 10 years ago
Agreed. I think there may have been issues in dinosaur versions of scipy, but things are probably good now.
cc @schwancr. I will assign this to myself though.
Here's a demonstration that the procedure works. I'm thinking about how to change the API for the PR currently.
import numpy as np
import scipy.sparse
import scipy.sparse.linalg
from msmbuilder.MSMLib import build_msm
def test_reversible_transmat_eigs_via_symmetric_transform():
# just some random counts
N = 100
counts = np.random.randint(1, 10, size=(N,N))
transmat, pi0 = build_msm(scipy.sparse.csr_matrix(counts), 'MLE')[1:3]
# Get the left eigenvectors using the general sparse eigensolver
w, v = scipy.sparse.linalg.eigs(transmat.T.todense())
v, w = v[:, np.argsort(w)], np.sort(w)
# Get the left eigenvectors using the sparse hermetian eigensolver
# on the symmemtrized transition matrix
root_pi = pi0**(0.5)
root_pi_diag = scipy.sparse.diags(root_pi, offsets=0).tocsr()
root_pi_diag_inv = scipy.sparse.diags(1.0 / root_pi, offsets=0).tocsr()
symtrans = root_pi_diag.dot(transmat).dot(root_pi_diag_inv)
# the eigenvalues are preserved by the similarity transform, but the
# eigenvectors need to be rotated
ws, vs = scipy.sparse.linalg.eigsh(symtrans.T.todense())
vs, ws = vs[:, np.argsort(ws)], np.sort(ws)
# the eigenvectors of symtrans are a rotated version of the eigenvectors
# of transmat that we want
vs = (root_pi * vs.T).T
# check that the eigenvalues from the hermetian eigensolver match
np.testing.assert_array_almost_equal(w, ws)
# confirm that the vs are actually left eigenvectors of transmat
for i in range(vs.shape[1]):
#print (transmat.T.dot(vs2[:, i]) / vs2[:, i]).flatten(), ws[i]
np.testing.assert_array_almost_equal(
(transmat.T.dot(vs[:, i]) / vs[:, i]).flatten(), np.ones(N)*ws[i])
if __name__ == '__main__':
test_reversible_transmat_eigs_via_symmetric_transform()
The API issue is that doing this calculation efficiently requires that you pass in the stationary distribution. If it's not passed in and you only have the transmat, you have to compute it with a call to scipy.sparse.linalg.eigs(transmat.T, k=1)
, which is really a waste of a call. So I guess the get_eigenvectors
signature needs to be extended to take populations
as a kwarg so that it doesn't need to be recomputed.
PS: we should also probably check (via matrix multiply) that the input equilibrium distribution is indeed stationary.
Also, your test code does not pass on my box. I presume it's a eigenvalue ordering problem...
Arrays are not almost equal to 6 decimals
(mismatch 16.6666666667%)
x: array([-0.07163043+0.j, -0.07006220+0.j, 0.06159592+0.j, 0.06688165+0.j,
0.07095556+0.j, 1.00000000+0.j])
y: array([-0.07163043, -0.0700622 , -0.06434376, 0.06688165, 0.07095556, 1. ])
Which line was that failure at?
PS: we should also probably check (via matrix multiply) that the input equilibrium distribution is indeed stationary.
Good idea.
So apparently the test failure only happens about 33% of the time.
np.testing.assert_array_almost_equal(w, ws)
The test of the eigenvectors at the bottom is also a check of the eigenvalues, so maybe that line can just be removed. There might be something wierd with the sorting (maybe complex dtype sorts wierdly?), but the ws
and vs
are definitely left eigenpairs of transmat.
Merged.
Currently, we're only using the general sparse eigensolver. There are relatively frequent problems like convergence issues and stuff, especially with many-thousand-state models.
Reversible transition matrices are similar (and thus share eigenvalues) to a symmetric matrix, T=Diag(sqrt(pi)) S Diag(1/sqrt(pi)). This means we should be able to do the transform and then use the hermetian eigensolver, which has much better numerical stability.