How to calculate the statistical error of the equilibrium distribution obtained from the MSM?

lucl13 commented 4 years ago

Dear PyEmma users,

The highest eigenvalue of the transition matrix was 1, and its corresponding eigenfunction represented the equilibrium distribution. And it could be used for calculating the free energy of each state (G=-kbTlnpi). Further, after applying the PCCA+ algorithm, the stationary probability and free energy of the metastable states could be obtained. However, how to calculate the statistical error of them? Could you please offer some suggestion? Thanks in advance. (I am using the PyEmma 2.5.4).

Best, Chenlin

thempel commented 4 years ago

Hi Chenlin, you can use a Bayesian MSM for that. So you estimate a BMSM on your data and sample stationary distributions from it. As you've probably done for computing the metastable states' stationary probabilities, you sum over the stationary probability in a metastable set for each of those samples and get a confidence interval from that. Maybe the following code snipped helps.

pi_pcca_sample = np.zeros((bayesian_msm.n_metastable, bayesian_msm.nsamples))

pi_sample = bayesian_msm.sample_f('pi')

for j in range(bayesian_msm.nsamples):
    for i, s in enumerate(bayesian_msm.metastable_sets):
        pi_pcca_sample[i, j] = pi_sample[j][s].sum()

# plot
fig, ax = plt.subplots()
for i in range(bayesian_msm.n_metastable):
    ax.hist(pi_pcca_sample[i], alpha=.4)

    # confidence intervals
    print(pyemma.util.statistics.confidence_interval(pi_pcca_sample[i]))

lucl13 commented 4 years ago

Hi Tim, thank you for the quick reply.

markovmodel / PyEMMA

How to calculate the statistical error of the equilibrium distribution obtained from the MSM? #1452