deeptime-ml / deeptime

Python library for analysis of time series data including dimensionality reduction, clustering, and Markov model estimation
GNU Lesser General Public License v3.0

Cross Validation score for MaximumLikelihoodMSMs and BayesianMSMs #293

Closed prateekbansal97 closed 3 months ago

prateekbansal97 commented 3 months ago

Hello deeptime developers,

I would like to request a PyEMMA-style cross-validation score for MSMs (MaximumLikelihoodMSM, BayesianMSM). In PyEMMA this was a useful tool for plotting the errors in the VAMP score.

An implementation in pyemma looked like:


If this cannot be added as a feature, I would appreciate guidance on how to calculate the scores with the current implementation.

P.S. Your tools are highly useful in general, thanks for the nice implementation!


clonker commented 3 months ago

Cheers, you are right, I have never added an example regarding that! My bad! For the time being, you can check this notebook:

The relevant bit is this:

import matplotlib.pyplot as plt
from deeptime.decomposition import TICA, vamp_score_cv

# lags, dim, bbtorsions and heavy_atom_distances are defined earlier in the notebook
fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
labels = ['backbone\ntorsions', 'heavy atom\ndistances']
tica_estimator = TICA(lagtime=lags[0], dim=dim)

for ax, lag in zip(axes.flat, lags):
    tica_estimator.lagtime = lag
    # cross-validated VAMP scores for each featurization (n=3 train/test splits)
    torsions_scores = vamp_score_cv(tica_estimator, trajs=bbtorsions, blocksplit=False, n=3)
    scores = [torsions_scores.mean()]
    errors = [torsions_scores.std()]
    distances_scores = vamp_score_cv(tica_estimator, trajs=heavy_atom_distances, blocksplit=False, n=3)
    scores += [distances_scores.mean()]
    errors += [distances_scores.std()]
    # one bar per featurization, with the standard deviation over splits as error bar
    ax.bar(labels, scores, yerr=errors, color=['C0', 'C1', 'C2'])
    ax.set_title(r'lag time $\tau$={}ps'.format(lag))

axes[0].set_ylabel('VAMP2 score')

You can provide an estimated MSM and/or Bayesian MSM as well.
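For intuition about what a cross-validated score over discrete trajectories does, here is a minimal standard-library sketch. It is not deeptime's implementation. The helper names (cv_scores, fit_transition_matrix, mean_log_likelihood) are made up for illustration, the data is synthetic, and a mean test log-likelihood stands in for the VAMP score to keep the example short. The pattern is the part that carries over: split whole trajectories into train and test sets several times, fit on one half, score on the other, then report the mean and spread.

```python
import math
import random
import statistics


def fit_transition_matrix(dtrajs, n_states=2, lag=1, prior=1.0):
    """Row-normalized transition counts at the given lag (with a pseudocount)."""
    counts = [[prior] * n_states for _ in range(n_states)]
    for traj in dtrajs:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i][j] += 1
    return [[c / sum(row) for c in row] for row in counts]


def mean_log_likelihood(T, dtrajs, lag=1):
    """Stand-in score (NOT a VAMP score): mean log-probability per test transition."""
    logs = [math.log(T[i][j])
            for traj in dtrajs
            for i, j in zip(traj[:-lag], traj[lag:])]
    return sum(logs) / len(logs)


def cv_scores(fit, score, trajs, n=3, seed=0):
    """Repeat a random half/half split of whole trajectories n times."""
    if len(trajs) < 2:
        raise ValueError("need at least two trajectories for a train/test split")
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        shuffled = list(trajs)          # copy, so the input order is untouched
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        scores.append(score(fit(train), test))
    return scores


# Synthetic two-state trajectories that switch state with probability 0.1 per step.
data_rng = random.Random(1)
dtrajs = []
for _ in range(6):
    state, traj = 0, []
    for _ in range(200):
        traj.append(state)
        if data_rng.random() < 0.1:
            state = 1 - state
    dtrajs.append(traj)

scores = cv_scores(fit_transition_matrix, mean_log_likelihood, dtrajs, n=3, seed=0)
# Mean and population standard deviation, analogous to .mean() and .std() above.
print(statistics.mean(scores), statistics.pstdev(scores))
```

The mean and standard deviation over splits are what end up as the bar height and error bar in the plot above. To do this with deeptime itself, check the vamp_score_cv documentation for what it accepts as its first argument.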


prateekbansal97 commented 3 months ago


Thanks for the reply. I was able to implement the suggestion.