easystats / bayestestR

:ghost: Utilities for analyzing Bayesian models and posterior distributions
https://easystats.github.io/bayestestR/
GNU General Public License v3.0
577 stars 55 forks source link

INLA support for Bayes factors #457

Open Crismoc opened 3 years ago

Crismoc commented 3 years ago

Would it be possible to support Bayes factors with INLA? Given its application for fitting models when other packages using MCMC are too computationally demanding, it would be quite useful. I'm not sure if it makes sense in statistical terms, but when using INLA there are also priors and marginal likelihoods involved. I imagine something fitting in the package workflow, like the following:

set.seed(123)
library(rstanarm)

library(bayestestR)
model <- stan_glm(
  formula = extra ~ group,
  data = sleep,
  prior = normal(0, 3, autoscale = FALSE)
)

bayesfactor_parameters(model, null = c(-1, 1))
#> Sampling priors, please wait...
#> Bayes Factor (Null-Interval)
#> 
#> Parameter   |    BF
#> -------------------
#> (Intercept) | 0.099
#> group2      | 0.886
#> 
#> * Evidence Against The Null: [-1.000, 1.000]

library(INLA)
m_inla <- inla(
  formula = extra ~ group,
  data = sleep,
  control.fixed = list(mean = list(prior = "normal", param = c(0, 3))))

bayesfactor_parameters(m_inla, null = c(-1, 1))
#> Error in UseMethod("bayesfactor_parameters"): no applicable method for 'bayesfactor_parameters' applied to an object of class "inla"
mattansb commented 3 years ago

Hi @Crismoc,

The way parameter-wise Bayes factors are computed in bayestestR, we need to be able to sample from the prior distribution. Is that possible with INLA?

Crismoc commented 3 years ago

Hi @mattansb , thanks for considering this request. Not having much experience with the use of INLA, I can only say that after looking at the INLA manual (link) and books like Bayesian regression modeling with INLA (link), the concept of prior predictive checks is not mentioned. I also looked at the google group of INLA and couldn't find a topic about sampling priors or doing ppc. Nonetheless, it is clear that priors can be defined.

I could create a topic in the package group if this helps to assess the feasibility of this enhancement. Just let me know.

bwiernik commented 3 years ago

Do you have links to any resources on the estimation algorithms INLA uses? I know Laplace approximation at a very high level, but more details. We may also be able to develop analytic methods for INLA and perhaps BayesFactor models for some statistics

mattansb commented 3 years ago

Is it possible to get the posterior samples? Or only a summary?

bwiernik commented 3 years ago

My understanding is that INLA doesn't have samples. Because it uses a Laplace Approximation, the posterior information is that summary

Crismoc commented 3 years ago

I sent a question regarding the algorithm to the R-INLA maintainers. I have the same understanding as @bwiernik about how the INLA approach works. As it can be visualized (here) the posterior distribution is not built on samples, but on an approximation that results in one posterior distribution for each parameter.

mattansb commented 3 years ago

Hmmm.... I wonder what we can do. (BayesFactor also doesn't sample, but does allow to sample by request)

andreanorcinipala commented 2 years ago

Hi all, here my two cents. I actually believe that it is possible to get posterior samples through INLA. This is the function inla.posterior.sample (https://rdrr.io/github/inbo/INLA/man/posterior.sample.html) . Hope it helps!

mattansb commented 2 years ago

Looks good!

Need to have INLA support in insight, and then we can get that done here as well.

set.seed(123)
library(bayestestR)
library(INLA)

m_inla <- inla(
  formula = extra ~ group,
  data = sleep,
  control.fixed = list(mean = list(prior = "normal", param = c(0, 3))),
  control.compute=list(config = TRUE)) # WONT WORK WITHOUT THIS

# to be used in: insight::get_parameters.inla
posts <- inla.posterior.sample(m_inla, n = 4000)
posts <- lapply(posts, "[[", 2)
posts <- do.call(cbind, posts)
posts <- posts[!grepl("Predictor:", rownames(posts), fixed = TRUE), ]
posts <- as.data.frame(t(posts))

describe_posterior(posts)
#> Summary of Posterior Distribution
#> 
#> Parameter     | Median |        95% CI |     pd |          ROPE | % in ROPE
#> ---------------------------------------------------------------------------
#> (Intercept):1 |   0.76 | [-0.42, 1.95] | 89.55% | [-0.10, 0.10] |     6.42%
#> group2:1      |   1.58 | [-0.02, 3.35] | 96.73% | [-0.10, 0.10] |     1.08%

# Compare to:
summary(m_inla)$fixed
#>              mean    sd 0.025quant 0.5quant 0.975quant  mode kld
#> (Intercept) 0.751 0.596     -0.430    0.751      1.931 0.750   0
#> group2      1.579 0.842     -0.091    1.579      3.247 1.579   0

Created on 2022-02-13 by the reprex package (v2.0.1)