xfim / ggmcmc

Graphical tools for analyzing Markov Chain Monte Carlo simulations from Bayesian inference
111 stars 31 forks source link

ggs_Rhat gives different answers than gelman.diag #65

Closed jfiksel closed 5 years ago

jfiksel commented 5 years ago

Hi! This is a great package, but I just had a question about the ggs_Rhat function. It seems to be giving different estimates than the gelman.diag function from the coda package. I'm not sure if they are different, but I thought I'd put this out there to see if this is either an explanation, or to learn about the difference between the two functions. Here's a reproducible example:

library(coda)
data(radon)
radon.long <- radon$s.radon
g <- gelman.diag(radon.long, autoburnin = F, multivariate = F)$psrf
r <- ggs_Rhat(ggs(radon.long))$data

g[rownames(g) == "sigma.beta",]
r[r$Parameter == "sigma.beta",]

You can see that the gelman.diag estimate gives a point estimate of 1.22 for this parameter, while the ggs_Rhat function gives a point estimate of 1.09.

xfim commented 5 years ago

Hello @jfiksel ,

Thank you very much for your kind words, and for reporting this.

I just solved an issue with the geweke.diag() a few days ago (see #64). The difference was in that case due to the consideration of percent of the sample that coda makes. I will take a look at this also.

Thank you very much again for your contribution.

jfiksel commented 5 years ago

Great! I wrote a function to turn the output from gelman.diag into a tibble, which will perhaps make it easier for you to compare the outputs during testing:

getRhat <- function(mcmc.list) {
    rhat <- gelman.diag(mcmc.list, multivariate = F, autoburnin = F)$psrf
    rhat.df <- tibble(Parameter = rownames(rhat), Rhat = unname(rhat[,1]))
    return(rhat.df)
}
xfim commented 5 years ago

So, as it was stated in the documentation, ggs_Rhat() uses the Gelman, Carlin, Stern and Rubin 2003 version of the Rhat, which is not the one used in the coda package. It differs slightly.

But as this has been raised in the past, I have decided to allow ggs_Rhat() to use the coda version of the Potential Scale Reduction Factor by adding an argument "version_Rhat" that allows to specify the coda version ("BR98") or the default ("BDA2") (see the documentation for the acronyms). Commit 0a7857d does this.

So from version 1.3 onwards, ggmcmc will allow to use the coda version of the Rhat, while still keeping the "BDA2" version as the default.

What has taken me a bit long is to also document these differences in different calculations and their differences: http://blog.xavier-fim.net/2019/03/comparison-of-rhat-versions-clarifying-formulas-for-the-potential-scale-reduction-factor-and-its-implications/

I have followed your example to create the datasets.

I hope this clarifies the differences.

jfiksel commented 5 years ago

Excellent post, I was not aware of the different versions of the Rhat measure! Thanks so much for digging into this.

xfim commented 5 years ago

Thank you. Glad that it worked for you.