bhmm / legacy-bhmm-force-spectroscopy-manuscript

Bayesian hidden Markov models for analysis of single-molecule trajectory data
GNU Lesser General Public License v3.0

Discarding initial samples to burn-in #38

Open · jchodera opened this issue 9 years ago

jchodera commented 9 years ago

We probably want a scheme to automatically discard initial BHMM samples to burn-in. One way to do this would be to record the log-likelihood of the BHMM posterior and then use automated equilibration detection (https://github.com/choderalab/pymbar/blob/master/pymbar/timeseries.py#L710-L775) to discard initial samples to equilibrium.
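For concreteness, a minimal sketch of that scheme, assuming the log posterior of each sampled model has been collected into an array (pymbar 3 spells the function detectEquilibration; newer releases rename it detect_equilibration). The trace below is synthetic stand-in data, not bhmm output:

```python
import numpy as np
from pymbar import timeseries

# Synthetic stand-in for the log posterior recorded at each Gibbs sample;
# in practice this trace would come from the BHMM sampler itself.
rng = np.random.default_rng(0)
logposteriors = np.concatenate([
    np.linspace(-500.0, -100.0, 50),    # initial relaxation (burn-in)
    rng.normal(-100.0, 5.0, size=950),  # fluctuation about equilibrium
])

# detectEquilibration returns the number of initial samples to discard (t0),
# the statistical inefficiency g of the remaining samples, and the effective
# number of uncorrelated samples Neff.
t0, g, Neff = timeseries.detectEquilibration(logposteriors)
print(f"discard first {t0} samples; g = {g:.1f}; Neff = {Neff:.0f}")

production = logposteriors[t0:]  # keep only the equilibrated portion
```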

franknoe commented 9 years ago

The posterior is unimodal, we start from the maximum likelihood, and for every sample we generate a number of Gibbs steps before using it. I would be surprised if this is an issue here. If anything, we are probably taking far more steps than necessary given the sizes of our matrices. But if there's an easy way to check, why not.


jchodera commented 9 years ago

> The posterior is unimodal,

Not necessarily. There's permutation symmetry if we don't enforce an ordering on the state means, and even then, I'm not certain it is unimodal.
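(As an aside on what enforcing an ordering could look like: a minimal sketch that relabels one sampled model so its state means are ascending. The arrays and function are illustrative, not the bhmm API.)

```python
import numpy as np

def relabel_by_mean(means, transition_matrix):
    """Reorder hidden states so the state means are ascending.

    This picks one representative out of the K! permutation-equivalent
    relabelings of a sampled model, removing the label-switching
    degeneracy from the posterior samples.
    """
    order = np.argsort(means)
    return means[order], transition_matrix[np.ix_(order, order)]

# Illustrative 2-state sample whose labels came out swapped:
means = np.array([5.2, 1.1])
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
means_sorted, T_sorted = relabel_by_mean(means, T)  # means_sorted = [1.1, 5.2]
```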

> we start from the maximum likelihood,

That doesn't mean we can get rid of burn-in---in fact, it means that we might be starting relatively far from a "typical sample" from the posterior if it is broad.

> and for every sample we generate a number of Gibbs steps before using it.

This is certainly helpful, but we do the same thing in many MD simulations, and we still have to discard to burn-in or run a very long time.

> I would be surprised if this is an issue here. If anything, we are probably taking far more steps than necessary given the sizes of our matrices. But if there's an easy way to check, why not.

If we can just compute the log Bayesian posterior for each model, that would be an easy quantity to examine for the model time series!
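As a sketch of what recording that trace could look like, with the sampler and model objects as hypothetical stand-ins for whatever bhmm actually exposes (only the bookkeeping pattern matters here):

```python
import numpy as np

rng = np.random.default_rng(1)

class ToyModel:
    """Stand-in for one sampled BHMM model (illustrative only)."""
    def __init__(self, logposterior):
        self.logposterior = logposterior

def gibbs_sweep(model):
    """Stand-in for one full sweep of Gibbs updates in the real sampler."""
    return ToyModel(model.logposterior + rng.normal(0.0, 1.0))

model = ToyModel(logposterior=-100.0)
models, logposteriors = [], []
for _ in range(1000):
    model = gibbs_sweep(model)
    models.append(model)                      # the sampled model itself
    logposteriors.append(model.logposterior)  # scalar trace for burn-in analysis
```

The resulting logposteriors array is exactly what the equilibration detection sketched above would consume.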

franknoe commented 9 years ago

OK. Do you want to do this or should I look into it?

Does a decorrelated log Bayesian posterior mean that other observables are decorrelated as well? I guess we can have quite different effective decorrelation times in different observables.


jchodera commented 9 years ago

> OK. Do you want to do this or should I look into it?

I'm not quite sure where all the bits get calculated at this point, so if it is easier for you to compute the log-likelihood for the sampled BHMM models, I can focus on the burn-in analysis.

> Does a decorrelated log Bayesian posterior mean that other observables are decorrelated as well? I guess we can have quite different effective decorrelation times in different observables.

Other observables can certainly have different correlation times, but the correlation time of the log posterior is certainly a lower bound on the slowest relaxation/mixing time of the BHMM sampler chain. It's what I would consider "due diligence" for sampling.
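To make the per-observable caveat concrete: a sketch comparing statistical inefficiencies of the log posterior against a slower observable, using pymbar's statisticalInefficiency (both traces are synthetic placeholders, with the slow one emulated by a 100-sample moving average of white noise):

```python
import numpy as np
from pymbar import timeseries

rng = np.random.default_rng(2)

# Placeholder traces over the sampled models: a fast-decorrelating log
# posterior and a slower observable (e.g. one transition-matrix element).
logposterior_trace = rng.normal(size=2000)
slow_observable = np.convolve(rng.normal(size=2099),
                              np.ones(100) / 100.0, mode='valid')

# statisticalInefficiency returns g = 1 + 2*tau, with tau the integrated
# autocorrelation time in sampler iterations; Neff = N / g per observable.
g_logP = timeseries.statisticalInefficiency(logposterior_trace)
g_slow = timeseries.statisticalInefficiency(slow_observable)
print(f"g(log posterior) = {g_logP:.1f}, g(slow observable) = {g_slow:.1f}")
```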