[BUG] get_reversible_eigenvectors

rmcgibbo commented 10 years ago

get_reversible_eigenvectors returns the wrong eigenvectors and eigenvalues for this matrix

Here's the code for my test:

import numpy as np
import scipy.linalg
import scipy.sparse.linalg
from msmbuilder.msm_analysis import get_reversible_eigenvectors

transmat = np.loadtxt('transmat.txt')
transmat = scipy.sparse.csr_matrix(transmat)

u, pi = scipy.sparse.linalg.eigs(transmat.T, k=1)
pi = pi / pi.sum()

for i in range(transmat.shape[0]):
    for j in range(transmat.shape[1]):
        # make sure that transmat _is_ reversible
        if np.abs((pi[i] * transmat[i, j]) - (pi[j] * transmat[j, i])) > 1e-15:
            raise ValueError('sdfsdf')

def is_eigenvector(mat, vec):
    ratio = vec / mat.dot(vec)
    # all of the elements in ratio should be
    # equal (and be the eigenvalue)
    if np.std(ratio) > 1e-10:
        return False, 'min(ratio)=%.4f. max(ratio)=%4f' % (np.min(ratio), np.max(ratio))
    return True

for k in [1,2,3,4]:
    u1, v1 = get_reversible_eigenvectors(transmat, k=k, right=True)
    u2, v2 = scipy.sparse.linalg.eigs(transmat, k=k)

    for i in range(k):
        print 'v1[:, %d]' % i, is_eigenvector(transmat, v1[:, i])
        print 'v2[:, %d]' % i, is_eigenvector(transmat, v2[:, i])

    tr1 = np.trace(v1.T.dot(transmat.dot(v1)) / v1.T.dot(v1))
    tr2 = np.trace(v2.T.dot(transmat.dot(v2)) / v2.T.dot(v2))

    print 'sum of u1', np.sum(u1)
    print 'sum of u2', np.sum(np.real_if_close(u2))
    print 'trace quotient 1', tr1
    print 'trace quotient 2', np.real_if_close(tr2)
    print

And the results:

$ python test-eigenpairs.py
v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
sum of u1 1.0
sum of u2 1.0
trace quotient 1 1.0
trace quotient 2 1.0

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
sum of u1 1.06935384109
sum of u2 1.06935384109
trace quotient 1 1.06935384109
trace quotient 2 1.06935384109

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
v1[:, 2] (False, 'min(ratio)=-85.2639. max(ratio)=27.491009')
v2[:, 2] True
sum of u1 1.13733143645
sum of u2 1.13733143645
trace quotient 1 1.13733143645
trace quotient 2 1.13733143645

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
v1[:, 2] (False, 'min(ratio)=-85.2639. max(ratio)=27.491009')
v2[:, 2] True
v1[:, 3] (False, 'min(ratio)=-629.1803. max(ratio)=477.349582')
v2[:, 3] True
sum of u1 1.20082647393
sum of u2 1.07221895542
trace quotient 1 1.20082647393
trace quotient 2 1.07221895542

rmcgibbo commented 10 years ago

IMO this could be relatively serious. I haven't yet looked in the code to try to track down where the bug is coming from.

kyleabeauchamp commented 10 years ago

Here is my output:

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
sum of u1 1.0
sum of u2 1.0
trace quotient 1 1.0
trace quotient 2 1.0

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
sum of u1 1.06935384109
sum of u2 0.934887518971
trace quotient 1 1.06935384109
trace quotient 2 0.934887518971

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
v1[:, 2] (False, 'min(ratio)=-85.2639. max(ratio)=27.491009')
v2[:, 2] True
sum of u1 1.13733143645
sum of u2 1.13733143645
trace quotient 1 1.13733143645
trace quotient 2 1.13733143645

v1[:, 0] (False, 'min(ratio)=0.8842. max(ratio)=1.109632')
v2[:, 0] True
v1[:, 1] (False, 'min(ratio)=-38.6817. max(ratio)=117.824294')
v2[:, 1] True
v1[:, 2] (False, 'min(ratio)=-85.2639. max(ratio)=27.491009')
v2[:, 2] True
v1[:, 3] (False, 'min(ratio)=-629.1803. max(ratio)=477.349582')
v2[:, 3] True
sum of u1 1.20082647393
sum of u2 1.20082647393
trace quotient 1 1.20082647393
trace quotient 2 1.20082647393

schwancr commented 10 years ago

v1 is the right eigenvector, so you should check if the vector is an eigenvector of transmat.T which it is:

EDIT : So it is in fact an eigenvector of the transpose, so it seems that get_reversible_eigenvector is doing right / left incorrectly... But the right eigenvector should be an eigenvector of transmat not transmat.T I had it mixed up.

vspm24:test_rev_eig schwancr$ cat test.py
import numpy as np
import scipy.linalg
import scipy.sparse.linalg
from msmbuilder.msm_analysis import get_reversible_eigenvectors

transmat = np.loadtxt('transmat.txt')
transmat = scipy.sparse.csr_matrix(transmat)

u, pi = scipy.sparse.linalg.eigs(transmat.T, k=1)
pi = pi / pi.sum()

for i in range(transmat.shape[0]):
    for j in range(transmat.shape[1]):
        # make sure that transmat _is_ reversible
        if np.abs((pi[i] * transmat[i, j]) - (pi[j] * transmat[j, i])) > 1e-15:
            raise ValueError('sdfsdf')

def is_eigenvector(mat, vec):
    ratio = vec / mat.dot(vec)
    # all of the elements in ratio should be
    # equal (and be the eigenvalue)
    if np.std(ratio) > 1e-10:
        return False, 'min(ratio)=%.4f. max(ratio)=%4f' % (np.min(ratio), np.max(ratio))
    return True

for k in [1,2,3,4]:
    u1, v1 = get_reversible_eigenvectors(transmat, k=k, right=True)
    u2, v2 = scipy.sparse.linalg.eigs(transmat, k=k)

    for i in range(k):
        print 'v1[:, %d]' % i, is_eigenvector(transmat.T, v1[:, i])
        print 'v2[:, %d]' % i, is_eigenvector(transmat, v2[:, i])

    tr1 = np.trace(v1.T.dot(transmat.dot(v1)) / v1.T.dot(v1))
    tr2 = np.trace(v2.T.dot(transmat.dot(v2)) / v2.T.dot(v2))

    print 'sum of u1', np.sum(u1)
    print 'sum of u2', np.sum(np.real_if_close(u2))
    print 'trace quotient 1', tr1
    print 'trace quotient 2', np.real_if_close(tr2)
    print

vspm24:test_rev_eig schwancr$ python test.py 
v1[:, 0] True
v2[:, 0] True
sum of u1 1.0
sum of u2 1.0
trace quotient 1 1.0
trace quotient 2 1.0

v1[:, 0] True
v2[:, 0] True
v1[:, 1] True
v2[:, 1] True
sum of u1 1.06935384109
sum of u2 1.06935384109
trace quotient 1 1.06935384109
trace quotient 2 1.06935384109

v1[:, 0] True
v2[:, 0] True
v1[:, 1] True
v2[:, 1] True
v1[:, 2] True
v2[:, 2] True
sum of u1 1.13733143645
sum of u2 1.13733143645
trace quotient 1 1.13733143645
trace quotient 2 1.13733143645

v1[:, 0] True
v2[:, 0] True
v1[:, 1] True
v2[:, 1] True
v1[:, 2] True
v2[:, 2] True
v1[:, 3] True
v2[:, 3] True
sum of u1 1.20082647393
sum of u2 1.07221895542
trace quotient 1 1.20082647393
trace quotient 2 1.07221895542

schwancr commented 10 years ago

Also if you use which='LR' with scipy.sparse.linalg.eigs then you get the same eigenvectors each time. But get_reversible_eigenvectors seems to be returning the left eigenvectors when it should return the right

schwancr commented 10 years ago

Ok, so it's not as bad as you think. msm_analysis.get_reversible_eigenvectors takes the kwarg right but never uses it. So the eigenvectors are always the left eigenvectors regardless of whether you ask for the right ones.

Good news is, left is default anyway so most of msmbuilder wouldn't be affected. And likely the only people ever requesting right=True are in this thread

schwancr commented 10 years ago

Wait, I'm wrong, but also right. Only the left eigenvectors are being returned by get_reversible_eigenvectors. But I don't have the bugfix for getting the right eigenvectors.

msmbuilder / msmbuilder-legacy

[BUG] get_reversible_eigenvectors #412