msmbuilder / msmexplorer

Data visualizations for biomolecular dynamics
http://msmbuilder.org/msmexplorer/
MIT License

Interpretation of plot_pop_resids #94

Closed jeiros closed 7 years ago

jeiros commented 7 years ago

I'm building an MSM on the internal dynamics of a ligand, which I think should be well sampled within microseconds of simulation. I can see 'clean' jumps in my tIC time evolution, but the pop_resids plot looks very different from the one in your documentation.

[image: plot_pop_resids output]

What kind of information can I extract out of msme.plot_pop_resids? I've never seen this plot in a publication.

msultan commented 7 years ago
jeiros commented 7 years ago

Hi @msultan, thanks for your answer. So if I understood you correctly, you would expect to see a completely decorrelated cloud of points (in the case of extensive sampling, where the MSM populations and the MD populations match)? What exactly are the residuals? From my plot above, it looks like the points are fairly homogeneously distributed along the Raw Populations axis, but with a strong correlation on the Residuals axis (?)

msultan commented 7 years ago

Exactly. For any given microstate, its raw population would ~ the MSM population, leading to a Gaussian cloud around 0. I think the residual is np.log10(MSM) - np.log10(raw counts), so a difference of 1 means something like 10x more. However, it's important to note that for poorly sampled states this might not be significant: 0.003 vs 0.03 is 10x but is hardly worth worrying about. Similarly, 0.003 vs 0.0003 is the same in the opposite direction, but again nothing to worry about.

@cxhernandez do i have the residual formula right?

msultan commented 7 years ago

hmm, maybe we should incorporate @jadeshi's code for doing bootstrapping here somehow.
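To make the bootstrapping idea concrete, here is a minimal sketch. It is not @jadeshi's code, which isn't shown in this thread; it is a generic trajectory-level bootstrap of the raw state populations. The function name `bootstrap_raw_populations`, its parameters, and the toy data are all hypothetical, and it assumes each trajectory is a 1-D integer array of microstate labels.

```python
import numpy as np

def bootstrap_raw_populations(trajs, n_states, n_boot=200, seed=0):
    """Resample whole trajectories with replacement and return the raw
    (counted) state populations of each bootstrap replicate.

    Hypothetical helper for illustration; not part of msmexplorer.
    """
    rng = np.random.default_rng(seed)
    # Per-trajectory state counts, shape (n_trajs, n_states)
    counts = np.array([np.bincount(t, minlength=n_states) for t in trajs])
    samples = np.empty((n_boot, n_states))
    for b in range(n_boot):
        # Draw as many trajectories as we have, with replacement
        idx = rng.integers(0, len(trajs), size=len(trajs))
        total = counts[idx].sum(axis=0)
        samples[b] = total / total.sum()
    return samples

# Toy data (assumption): 10 trajectories of 500 frames over 4 microstates
rng = np.random.default_rng(1)
trajs = [rng.integers(0, 4, size=500) for _ in range(10)]

samples = bootstrap_raw_populations(trajs, n_states=4)
# 95% percentile interval of the raw population of each state
lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)
```

States whose interval is wide relative to their population are the ones where a large residual is least likely to be meaningful.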

cxhernandez commented 7 years ago

Yeah, basically you're observing how the MSM has corrected your populations. An ideal plot (with complete sampling) would have small decorrelated fluctuations in the residuals. Here, it seems like you have a set of microstates which are probably undersampled but, like @msultan said, this is probably not an issue if the MFPTs look reasonable.

> @cxhernandez do i have the residual formula right?

Yup!
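
For illustration, a minimal sketch of the residual discussed above. It has not been checked against the msmexplorer source. The helper name `log_pop_residuals` and the three-state numbers are hypothetical, and it assumes the raw counts are normalized to fractions before taking the log.

```python
import numpy as np

def log_pop_residuals(msm_pops, raw_counts):
    """Return log10(MSM population) - log10(raw population) per state.

    Hypothetical helper; assumes raw counts are normalized to fractions.
    """
    msm_pops = np.asarray(msm_pops, dtype=float)
    raw_pops = np.asarray(raw_counts, dtype=float)
    raw_pops = raw_pops / raw_pops.sum()
    return np.log10(msm_pops) - np.log10(raw_pops)

# Toy example (assumption): frames counted per state and MSM populations
raw_counts = np.array([30, 300, 670])       # raw fractions 0.03, 0.30, 0.67
msm_pops = np.array([0.003, 0.300, 0.697])

residuals = log_pop_residuals(msm_pops, raw_counts)
# State 0: 0.003 vs 0.03, so the residual is -1 (MSM population 10x lower).
# State 1: identical populations, so the residual is 0.
```

A residual of +1 or -1 therefore corresponds to a tenfold difference between the MSM and raw populations, regardless of how small the populations themselves are.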