easystats / performance

:muscle: Models' quality and performance metrics (R2, ICC, LOO, AIC, BF, ...)
https://easystats.github.io/performance/
GNU General Public License v3.0
1.02k stars 90 forks source link

Marginal R2 is a misnomer in brms #526

Closed vincentarelbundock closed 3 months ago

vincentarelbundock commented 1 year ago

According to the brms author, performance should change its label:

https://twitter.com/paulbuerkner/status/1608896620699226113?s=46&t=3uzWoIWG2nAVyzeyGrxTEg

DominiqueMakowski commented 1 year ago

If we can expect to obtain a better marginal R2, we could explicit that in the documentation (rather than changing the name which would be super breaking probably) and try to find a way to compute a better score - assuming this is a goal we can reasonably have

strengejacke commented 1 year ago

It's a possible misnomer for Bayesian models, not for frequentist.

vincentarelbundock commented 1 year ago

TBC, I have no expertise here. I just wanted to bring your attention to this because it seemed like a credible argument from authority. Feel free to close once you decide what should be done.

strengejacke commented 1 year ago

I think you can use r2_nakagawa() with Bayesian models, too, so we would have "valid" marginal and conditional R2, but that function is also not fully validated for more exotic model families.

bwiernik commented 1 year ago

I'm not sure what you mean @strengejacke about the distinction between Bayesian and frequentist frameworks?

strengejacke commented 1 year ago

For frequentist models, we follow the approach from Nakagawa et. al, computing the variances for the different components (via insight::get_variance(), and then - similar to ICC - divide fixed effects vs. total variance or FE+RE vs. total, see

https://github.com/easystats/performance/blob/9b9540975d642d0432d7c62d306779d3c4e5db98/R/r2_nakagawa.R#L107-L110

This is the distinction of "marginal" vs. "conditional" proposed by Nakagawa et al., and which is in line with many other packages that compute R2 for mixed models.

For Bayesian mixed models, we rely on bayes_R2() and once set re.form = NA and the other time re.form = NULL, where bayes_R2() internally calls posterior_epred() with the re.form argument. We call these two "marginal" and "conditional", which may be a misnomer, because since we rely on "predictions", "marginalizing" over random effects is not achieved by setting re.form either to NA or NULL. Thus, "marginal" is probably not the correct term when thinking in predictions.

But, since we're interested in the variance, re.form = NA and re.form = NULL still might be an equivalent to marginal and conditional R2 in the frequentist framework, because results are fairly similar, see:

library(lme4)
library(brms)
library(performance)

data(sleepstudy)
data(cbpp)
cbpp$out <- ifelse(cbpp$incidence / cbpp$size > 0.2, 1L, 0L)

m1 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
m2 <- brm(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy, refresh = 0)

gm1 <- glmer(
  out ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)

gm2 <- brm(
  out ~ period + (1 | herd),
  data = cbpp,
  family = bernoulli,
  refresh = 0
)

r2(m1)
r2(m2)

r2(gm1)
r2(gm2)

Hence, in terms of "predictions", we don't have predictions marginalized over random effects, but since we're not interested in point estimates, but rather in the variances, we still might produce the "correct" marginal and conditional R2. Not sure about this, though, and that's why there might be a misnomer.

strengejacke commented 1 year ago

r2() calls r2_bayes() for Bayesian (mixed) models, while r2() calls r2_nakagawa() for frequentist mixed models. But I think r2_nakagawa() can also be used for Bayesian models, just not for too exotic families.

bwiernik commented 1 year ago

Okay yeah I agree. I think May not really be a Bayes/freq thing but rather than Paul would probably just object to the Nakagawa use of marginal there.

mattansb commented 1 year ago

But I think r2_nakagawa() can also be used for Bayesian models, just not for too exotic families.

It already can:

library(brms)
library(performance)

data(sleepstudy)
data(cbpp)
cbpp$out <- ifelse(cbpp$incidence / cbpp$size > 0.2, 1L, 0L)

m2 <- brm(Reaction ~ Days + (1 + Days | Subject), 
          data = sleepstudy, 
          refresh = 0, backend = "cmdstanr")

r2_bayes(m2)
#> # Bayesian R2 with Compatibility Interval
#> 
#>   Conditional R2: 0.793 (95% CI [0.758, 0.824])
#>      Marginal R2: 0.287 (95% CI [0.160, 0.406])

r2_nakagawa(m2)
#> # R2 for Mixed Models
#> 
#>   Conditional R2: 0.822
#>      Marginal R2: 0.240
mattansb commented 1 year ago

Am I understanding correctly that there isn't any problem with the code per se, but the issue is with the labeling?

So it's not marginal in the sense that the random effect's aren't integrated out, but are "ignored" (as in https://twitter.com/tjmahr/status/1581563839459385344). Correct? If this is the case, I think clearing this up in the docs is enough?

How does this explain why the original tweet had the marginal larger than the conditional?

strengejacke commented 1 year ago

How does this explain why the original tweet had the marginal larger than the conditional?

Maybe related to https://github.com/easystats/insight/issues/664

bwiernik commented 1 year ago

Might also be related to https://discourse.mc-stan.org/t/brms-how-to-get-the-icc-or-vpc-value-for-a-multi-level-negative-binomial-model/24046/3?u=bwiernik