TuringLang / MCMCDiagnosticTools.jl

https://turinglang.org/MCMCDiagnosticTools.jl/dev

New ESS estimators #125

Open Red-Portal opened 3 months ago

Red-Portal commented 3 months ago

Hi!

I am looking into possibly contributing more ESS estimators to MCMCDiagnosticTools.jl, but the current organization of the package seems quite tightly coupled to the ess_rhat.jl file, so I am not sure how to proceed. In particular, I am interested in working on the following two estimators:

  1. The shape-constrained autocorrelation estimator (https://arxiv.org/abs/2207.12705)
  2. The Vats-Flegal-Jones estimator (https://arxiv.org/abs/1507.08266)

Method (1) shows quite impressive variance reductions in estimating the autocorrelation of reversible chains, while method (2) is also applicable to non-reversible chains, which the current package does not cover. (I also think the docs would be better if they made it clear that the current ess_rhat routines specifically assume reversibility.)

What would be the best course of action here? Should I add each estimator in its own new file first and look into integration later?
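For concreteness, this is roughly what I have in mind for "add them first, integrate later", assuming the package keeps its current pattern of small method types that the ESS code dispatches on; all names below (`VFJESSMethod`, `asymptotic_variance`) are made up for illustration and are not part of the existing API:

```julia
"""
    VFJESSMethod(; batch_size=nothing)

Hypothetical method type (sketch only) selecting a Vats-Flegal-Jones-style
batch-means estimate of the asymptotic variance, which does not require a
reversible chain.
"""
struct VFJESSMethod
    batch_size::Union{Int,Nothing}
end
VFJESSMethod(; batch_size=nothing) = VFJESSMethod(batch_size)

# Each new estimator file would then only need to provide something like
#   asymptotic_variance(x::AbstractVector, method::VFJESSMethod) -> Float64
# and ESS follows as n * var(x) / asymptotic_variance(x, method), so wiring
# it into the shared `ess` entry point could happen in a later step.
```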

sethaxen commented 4 weeks ago

Thanks for the issue! I discussed these methods with @avehtari, and here is the summary:

Method (1) is not better than Geyer's method and (at least per private communication with Aki from February of last year) had unresolved computational issues. I think for us to consider including it, we'd first need confirmation that those issues have been resolved, and second a comprehensive benchmark (a la https://avehtari.github.io/rhat_ess/ess_comparison.html) comparing it with Geyer's method.

Method (2) extends the old spectral approach (inferior to Geyer's method) to the multivariate case, but it's not the first attempt to do so, and previous attempts never caught on. All of them, including some unpublished ones and Method (2), fail on heavy-tailed distributions. The recommendation, if it is included, would be to change the lag window approach to Geyer's and to first comprehensively benchmark the resulting method against rstar, which also makes no CLT assumption.

> I also think the docs would be better if it made it clear the current ess_rhat routines specifically assume reversibility

Are there popular non-reversible MCMC methods?

Red-Portal commented 4 weeks ago

Hi!

> Method (1) is not better than Geyer's method and (at least per private communication with Aki from February of last year) had unresolved computational issues. I think for us to consider including it, we'd first need confirmation that those issues have been resolved, and second a comprehensive benchmark (a la https://avehtari.github.io/rhat_ess/ess_comparison.html) comparing it with Geyer's method.

Interesting, I would be very happy to see any experimental results on this if Aki has any. Method (1)'s approach makes stronger assumptions, and intuitively I would expect it to work better, since it simply exploits more information about the problem.

> Method (2) extends the old spectral approach (inferior to Geyer's method) to the multivariate case, but it's not the first attempt to do so, and previous attempts never caught on. All of them, including some unpublished ones and Method (2), fail on heavy-tailed distributions. The recommendation, if it is included, would be to change the lag window approach to Geyer's and to first comprehensively benchmark the resulting method against rstar, which also makes no CLT assumption.

I intended to point to the spectral approach itself rather than its multivariate extension, mostly because the spectral approach does not assume reversibility, which also leads to the next answer below.
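To make concrete what I mean by that flavor of estimator, here is a minimal univariate batch-means ESS sketch (illustration only: it is not the multivariate VFJ estimator, and `ess_batchmeans` is a made-up name). It estimates the CLT asymptotic variance from non-overlapping batch means, so it needs no reversibility assumption:

```julia
using Statistics

# Illustrative univariate batch-means ESS (not the multivariate VFJ estimator).
# The asymptotic variance in the MCMC CLT is estimated from the variance of
# non-overlapping batch means, so no reversibility of the chain is assumed.
function ess_batchmeans(x::AbstractVector{<:Real};
                        batch_size::Int = floor(Int, sqrt(length(x))))
    n = length(x)
    nbatches = n ÷ batch_size
    nbatches ≥ 2 || throw(ArgumentError("need at least two batches"))
    batch_means = [mean(@view x[((i - 1) * batch_size + 1):(i * batch_size)])
                   for i in 1:nbatches]
    σ²_bm = batch_size * var(batch_means)  # batch-means estimate of the asymptotic variance
    return n * var(x) / σ²_bm
end
```

For iid draws, `ess_batchmeans(randn(10_000))` should land near 10_000; for a positively autocorrelated chain it will come out smaller.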

> Are there popular non-reversible MCMC methods?

I have a good one: Gibbs sampling with a systematic scan. Arguably, it's the most popular way to implement Gibbs samplers, and it's the scheme provided by Turing. There are also less popular but certainly important emerging non-reversible methods, such as PDMP-based MCMC algorithms, unadjusted methods, HMC with persistent momentum, and others. SliceSampling.jl also has a few slice samplers that operate in an extended state space, so the chain is not reversible on the target space alone.
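To make the first example concrete, here is a toy sketch of what I mean by a systematic scan (purely illustrative, not Turing's implementation; it targets a bivariate standard normal with correlation ρ):

```julia
using Random

# Toy systematic-scan Gibbs sampler for a bivariate standard normal with
# correlation ρ (illustration only). Each conditional update is reversible on
# its own, but composing them in a fixed order (always x first, then y) makes
# the full sweep non-reversible.
function gibbs_systematic_scan(rng::AbstractRNG, n::Int; ρ::Float64 = 0.9)
    x, y = 0.0, 0.0
    s = sqrt(1 - ρ^2)
    samples = Matrix{Float64}(undef, n, 2)
    for i in 1:n
        x = ρ * y + s * randn(rng)   # draw from p(x | y)
        y = ρ * x + s * randn(rng)   # draw from p(y | x), always second
        samples[i, 1] = x
        samples[i, 2] = y
    end
    return samples
end
```

The only point here is the fixed update order; replacing the loop body with a uniformly random choice of which coordinate to update gives the reversible random-scan variant.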

avehtari commented 4 weeks ago

My comments on (1) were based on Hyebin Song's talk and on discussing it with her at BayesComp 2023. At that time, she said there was no practical benefit compared to Geyer's approach, but that they had better theory. Song also said that they did not have a practical way to do the necessary computations, and she promised to email me once they had solved the computation. She has not yet emailed me, so I'm assuming the computation has not been solved. I suggest contacting the authors and asking for an update. At the moment, I don't have time to implement that algorithm or to run any comparison experiments.