Small negative element in the transition matrix derived by a iVAMPnet

Toolxx commented 1 year ago

After I have established my iVAMPnet model, I want to further analyze the Markov state model. However, after I use the "get_transition_matrix" function in the ivampnet.py file, the transition matrix I export has negative elements. This prevents me from importing this model into the markov state model class of the deeptime or pyemma. So, I want to ask what is the reason for this? And how can I solve this issue?

My simple attempt is to change the small negative elements to 0 and adjust the tolerance, but this will cause the situation that the stationary distribution cannot be computed.

The code and running results of some of my attempts are as follows

###### for getting transition matrix#######
runs = 1
its = [[] for _ in range(runs)]
# cheap error estimation, instead of retraining chi, evaluate the model on different trajectories
percentage = 0.9
N_trajs = len(dataset.trajectories)
indexes_traj = np.arange(N_trajs)
n_val = int(N_trajs * percentage)
msmlags=np.array([1,2,4,6,10])*10
for run in range(runs):
    for tau_i in msmlags:
        np.random.shuffle(indexes_traj)
        indexes_used = indexes_traj[:n_val]
        data_t = np.concatenate([dataset.trajectories[a][:-tau_i] for a in indexes_used], axis=0)
        data_tau = np.concatenate([dataset.trajectories[a][tau_i:] for a in indexes_used], axis=0)
        its[run].append(model.get_transition_matrix(data_t, data_tau, tau_i))

import deeptime
M1 = its[0][0][0]
M2 = its[0][0][1]
print(M1)
[[ 9.49290754e-01  1.60446438e-02  8.37745994e-06  5.19685298e-05
   1.39199456e-02  2.08420478e-02 -1.50450741e-04 -7.28687583e-06]
 [ 1.23066523e-02  9.79575337e-01  7.85644991e-07  4.50083067e-06
   3.01935416e-03  5.11962238e-03 -2.40801244e-05 -2.17225217e-06]
 [-5.39265869e-06  1.72809747e-06  9.68056977e-01  5.94817227e-03
  -9.64218534e-05 -4.14264763e-05  1.27529423e-02  1.33834210e-02]
 [ 1.21388451e-04 -1.82461938e-05  1.27561552e-02  9.80846433e-01
  -5.51637302e-05 -4.99394022e-05 -2.76337492e-03  9.16274808e-03]
 [ 9.17717767e-03  2.66629201e-03  1.27990989e-04 -2.00515502e-05
   9.59202570e-01  4.84249423e-03  2.40054294e-02 -1.90260192e-06]
 [ 1.72821701e-02  1.23635389e-03 -4.36310589e-05  4.23389062e-06
   1.15082018e-02  9.76157797e-01 -6.14568176e-03  5.55559834e-07]
 [-6.40585829e-05 -1.20257732e-05  1.61217341e-02  4.04417520e-03
   1.33054843e-02  5.81842332e-03  9.60451890e-01  3.34377496e-04]
 [-6.31064549e-07  3.11938316e-06  1.24935301e-02  9.44161279e-03
   2.49919144e-06  1.48224235e-06  5.60188061e-05  9.78002369e-01]]

print(sum(M1.T))
[1. 1. 1. 1. 1. 1. 1. 1.]
MSM = deeptime.markov.msm.MarkovStateModel(M1,transition_matrix_tolerance=1e-2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_26840\2001138124.py in <module>
----> 1 MSM = deeptime.markov.msm.MarkovStateModel(M1,transition_matrix_tolerance=1e-2)

c:\Users\Lenovo\anaconda3\lib\site-packages\deeptime\markov\msm\_markov_state_model.py in __init__(self, transition_matrix, stationary_distribution, reversible, n_eigenvalues, ncv, count_model, transition_matrix_tolerance, lagtime)
     78         self._lagtime = lagtime
     79 
---> 80         self.update_transition_matrix(transition_matrix)
     81 
     82         if n_eigenvalues is None:

c:\Users\Lenovo\anaconda3\lib\site-packages\deeptime\markov\msm\_markov_state_model.py in update_transition_matrix(self, value)
    245                     raise
    246             if not msmana.is_transition_matrix(value, tol=self.transition_matrix_tolerance):
--> 247                 raise ValueError(f'The input transition matrix is not a stochastic matrix '
    248                                  f'(elements >= 0, rows sum up to 1) up to '
    249                                  f'tolerance {self._transition_matrix_tolerance}.')

ValueError: The input transition matrix is not a stochastic matrix (elements >= 0, rows sum up to 1) up to tolerance 0.01.

M1[M1 <= 0] = 0

print(M1)
[[9.49290754e-01 1.60446438e-02 8.37745994e-06 5.19685298e-05
  1.39199456e-02 2.08420478e-02 0.00000000e+00 0.00000000e+00]
 [1.23066523e-02 9.79575337e-01 7.85644991e-07 4.50083067e-06
  3.01935416e-03 5.11962238e-03 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 1.72809747e-06 9.68056977e-01 5.94817227e-03
  0.00000000e+00 0.00000000e+00 1.27529423e-02 1.33834210e-02]
 [1.21388451e-04 0.00000000e+00 1.27561552e-02 9.80846433e-01
  0.00000000e+00 0.00000000e+00 0.00000000e+00 9.16274808e-03]
 [9.17717767e-03 2.66629201e-03 1.27990989e-04 0.00000000e+00
  9.59202570e-01 4.84249423e-03 2.40054294e-02 0.00000000e+00]
 [1.72821701e-02 1.23635389e-03 0.00000000e+00 4.23389062e-06
  1.15082018e-02 9.76157797e-01 0.00000000e+00 5.55559834e-07]
 [0.00000000e+00 0.00000000e+00 1.61217341e-02 4.04417520e-03
  1.33054843e-02 5.81842332e-03 9.60451890e-01 3.34377496e-04]
 [0.00000000e+00 3.11938316e-06 1.24935301e-02 9.44161279e-03
  2.49919144e-06 1.48224235e-06 5.60188061e-05 9.78002369e-01]]
print(sum(M1.T))
[1.00015774 1.00002625 1.00014324 1.00288672 1.00002195 1.00618931
 1.00007608 1.00000063]

M = deeptime.markov.msm.MarkovStateModel(M1,transition_matrix_tolerance=1)
print(M)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_26840\3717809300.py in <module>
      1 M = deeptime.markov.msm.MarkovStateModel(M1,transition_matrix_tolerance=1)
----> 2 print(M)

c:\Users\Lenovo\anaconda3\lib\site-packages\deeptime\base.py in __repr__(self)
     15         name = '{cls}-{id}:'.format(id=id(self), cls=self.__class__.__name__)
     16         return '{name}{params}]'.format(
---> 17             name=name, params=pprint_sklearn(self.get_params(), offset=len(name), )
     18         )
     19 

c:\Users\Lenovo\anaconda3\lib\site-packages\deeptime\base.py in get_params(self, deep)
     48                                % (cls,))
     49         for arg in args:
---> 50             params[arg] = getattr(self, arg, None)
     51         return params
     52 

c:\Users\Lenovo\anaconda3\lib\site-packages\deeptime\util\decorators.py in __get__(self, instance, owner)
     23         value = self.cache.get(instance, self._default_cache_entry)
     24         if value is self._default_cache_entry:
---> 25             value = self.fget(instance)
     26             self.cache[instance] = value
...
--> 178             raise ValueError("Input matrix is not a transition matrix. "
    179                              "Cannot compute stationary distribution")
    180         if not is_connected(T, directed=False):

ValueError: Input matrix is not a transition matrix. Cannot compute stationary distribution

amardt commented 1 year ago

Hi, great progress! You actually have to be lucky that you can pipe the transition matrix directly into the deeptime package. A Vampnet only guarantees you a matrix, whose entries add up to 1 within each row. However, negative entries are not excluded, because it is still a Koopman model, where the matrix is the best propagator for the state assignments. In the case of a normal MSM the state assignments are hard (instead of the softmax it is either 1 or 0), this will make the Koopman matrix a real transition matrix. In case of VAMPnets, usually the harder the assignment of states, the more likely the matrix will be a real transition matrix. There are advanced method built on top of VAMPnets which will guarantee real transition matrix and/or reversible models. One could combine these with the iVAMPnet approach. However, we have not tried it yet. Now as a first step I would love to know which post analysis you want to do? Because some of them you can still do with the negative entries, but others you cant.

As a second point, you already found a way how to force the matrix to be non negative afterwards. Setting them to zero is a very rough way, but if you want to do it, you have to normalize the rows again, that they all sum up to 1. Something like: T = T/T.sum(1, keepdims=True) Another way is to simply make the soft assignment hard ones. So assign the state with a argmax and then put it into the deeptime package: chi_hard = np.argmax(chi, axis=1). This will give you a discrete trajectory and you can use the normal MSM construction scheme. If you want to stick to soft assignments, I think you will get better results if you optimize a transition matrix to be a good propagator of the state assignments: ||chi(x_t+tau) - T^T chi(x_t)||, where you parametrize T and optimize the function I have provided while enforcing the elements of T to be all positive and each row to sum up to 1.

I hope this helps. Let me know, how it goes!

Andreas

Toolxx commented 1 year ago

Hi Andreas,

I just realized I forgot to reply to this message. Regarding this part of the my study, after your suggestion, it has continued fordwards. Thank you so much for your help, what you've said has helped my research a lot!

I am now continuing to try to work on the system I am interested in based on your iVAMPnet model. My main interest is how to estimate a reasonable MSM instead of the Koopman model. Other than directly derive the biophysical properties, I need to use MSM in some other tasks. Specifically, I need to estimate the MSM here for different states and then go ahead and compare the existing Markov models in the literature that have been built by other means (generally mathematical modeling, which start from assumptions then build the model other than statistical estimation from MD). Or, based on the MSM estimated to be exported by iVAMPnet, go ahead and bring in the Hodgkin–Huxley model (like Fig3 of Hempel, 2021). Because of the constrains of the above research objectives, probably I would prefer to estimate MSM based on iVAMPnet rather than directly using iVAMPnet's Koopman model.

One more aspect I would like to ask for further advice is: which is more guaranteed in your opinion, deriving discrete trajectories from iVAMPnet's Koopman model and then estimating the MSM, or estimating the MSM with soft assignment setting constraints in iVAMPnet? Or do they each have advantages and disadvantages?

Thanks.

markovmodel / ivampnets

Small negative element in the transition matrix derived by a iVAMPnet #4