easystats / bayestestR

:ghost: Utilities for analyzing Bayesian models and posterior distributions
https://easystats.github.io/bayestestR/
GNU General Public License v3.0

Iris data example for Bayes Factor: with or without random effect? #623

Open rduesing opened 1 year ago

rduesing commented 1 year ago

I was going through the examples for calculating Bayes factors (BF) for model comparison. Everything works and is neatly described, but I was wondering why 'Species' is modeled as a fixed-effect predictor in the brms example (together with 'Petal.Length'), but as a random effect in the frequentist lme4 example (where the two predictors are 'Petal.Length' and 'Petal.Width').

I think it is debatable which approach is more suitable (I would lean toward the brms solution, since 'Species' has only 3 levels), but that is not the point of my question. Is there a reason it is presented this way that I am missing?

In my opinion, it would be easier to follow if the models were the same (same predictors, and all with or without random effects). Additionally, if you fit the models with the (g)lm function and calculate the BFs with bayesfactor_models(), the BFs are more similar to the brms results reported in the example. Here are the code and results for the fixed-effects models fit with glm:

library(bayestestR)

# Intercept-only reference model (used as the denominator below)
m0.f <- glm(formula = Sepal.Length ~ 1, family = gaussian, data = iris)
# Candidate models with fixed effects only
m1.f <- glm(formula = Sepal.Length ~ Petal.Length, family = gaussian, data = iris)
m2.f <- glm(formula = Sepal.Length ~ Species, family = gaussian, data = iris)
m3.f <- glm(formula = Sepal.Length ~ Species + Petal.Length, family = gaussian, data = iris)
m4.f <- glm(formula = Sepal.Length ~ Species * Petal.Length, family = gaussian, data = iris)

# Bayes factors of each candidate model against the intercept-only model
bayesfactor_models(m1.f, m2.f, m3.f, m4.f, denominator = m0.f)
       Model                        BF
[m1.f] Petal.Length           2.45e+45
[m2.f] Species                1.70e+29
[m3.f] Species + Petal.Length 5.84e+55
[m4.f] Species * Petal.Length 2.20e+54

* Against Denominator: [m0.f] (Intercept only)
*   Bayes Factor Type: BIC approximation
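For readers wondering what "BIC approximation" means here: as far as I understand it, the Bayes factor for a frequentist model is approximated from the difference in BIC between the two models, BF10 ≈ exp((BIC0 - BIC1) / 2). The following is a minimal base-R sketch of that idea for the first comparison only; the object names `m0`, `m1`, and `bf_10` are mine, and this is not meant to reproduce the package internals.

```r
# Sketch of the BIC approximation to a Bayes factor, using base R only.
# Assumption: BF_10 is approximated as exp((BIC_0 - BIC_1) / 2).

# Intercept-only model and the Petal.Length model from the post
m0 <- glm(Sepal.Length ~ 1, family = gaussian, data = iris)
m1 <- glm(Sepal.Length ~ Petal.Length, family = gaussian, data = iris)

# A lower BIC for m1 gives a Bayes factor above 1 in favour of m1
bf_10 <- exp((BIC(m0) - BIC(m1)) / 2)
print(bf_10)
```

The printed value can be compared with the `[m1.f] Petal.Length` row in the table above.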

And here the results from the brms models from the bayestestR example page:

> Model                        BF
> [1] Petal.Length           1.27e+44
> [2] Species                8.34e+27
> [3] Species + Petal.Length 2.29e+53
> [4] Species * Petal.Length 9.79e+51

mattansb commented 1 year ago

The conclusion here is that we need better example data 😅

Thanks, I will (eventually) get to cleaning up that vignette.

strengejacke commented 1 year ago

I would take the examples cum grano salis; their main purpose is to show how the functions work, not to provide "meaningful" models or hypotheses. Of course, demonstrating something with models or data that make sense often makes for a clearer example, but as Mattan said, the variety of data is limited ;-)

rduesing commented 1 year ago

> I would take the examples cum grano salis; their main purpose is to show how the functions work, not to provide "meaningful" models or hypotheses. Of course, demonstrating something with models or data that make sense often makes for a clearer example, but as Mattan said, the variety of data is limited ;-)

Thanks for considering my request. As I noted above, it is not about the models themselves or which one better fits the data-generating process, but about consistency across the examples. I think it is easier to follow if only the method (Bayesian vs. frequentist) changes and the models stay the same for both approaches.
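One possible way to get that consistency is to define the model formulas once and reuse them for both approaches. The sketch below is only an illustration under my own assumptions: the names `model_formulas`, `freq_fits`, and `fit_bayes` are hypothetical, it uses fixed effects throughout, and the Bayesian part is left uncalled because it needs brms and takes a while to sample.

```r
# Shared formulas, defined once so both approaches fit identical models.
model_formulas <- list(
  intercept   = Sepal.Length ~ 1,
  petal       = Sepal.Length ~ Petal.Length,
  species     = Sepal.Length ~ Species,
  additive    = Sepal.Length ~ Species + Petal.Length,
  interaction = Sepal.Length ~ Species * Petal.Length
)

# Frequentist fits (base R only)
freq_fits <- lapply(model_formulas, function(f) lm(f, data = iris))

# Bayesian fit from the same formula. Defined but not called here:
# assumes brms is installed; save_pars(all = TRUE) keeps the draws
# needed for Bayes factors via bridge sampling.
fit_bayes <- function(f) {
  brms::brm(f, data = iris, save_pars = brms::save_pars(all = TRUE))
}
# bayes_fits <- lapply(model_formulas, fit_bayes)

# Either list could then be compared against its intercept-only model, e.g.:
# bayestestR::bayesfactor_models(
#   freq_fits$petal, freq_fits$species, freq_fits$additive,
#   freq_fits$interaction, denominator = freq_fits$intercept
# )
```

With this layout, only the fitting function differs between the two sets of results, so any remaining difference in the Bayes factors comes from the method and not from the model specification.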