Closed: maxsitt closed this issue 2 years ago
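Apart from minor numerical differences between performance::r2() and MuMIn::r.squaredGLMM(), both packages report a different R2 depending on whether the offset is specified inside the model formula (+ offset(...)) or via the offset argument. For the lmer() models (m1, m2) the two formulations agree; for the glmer.nb() models (m3, m4) they do not.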
library(lme4)
#> Loading required package: Matrix
library(performance)
library(MuMIn)
m1 <- lmer(log(mpg) ~ disp + (1|cyl) + offset(log(wt)), data = mtcars)
r2(m1)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.817
#> Marginal R2: 0.757
r.squaredGLMM(m1)
#> Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help page.
#> R2m R2c
#> [1,] 0.7573362 0.8171651
m2 <- lmer(log(mpg) ~ disp + (1|cyl), offset = log(wt), data = mtcars)
r2(m2) # same R2 values
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.817
#> Marginal R2: 0.757
r.squaredGLMM(m2)
#> R2m R2c
#> [1,] 0.7573362 0.8171651
m3 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl) + offset(log(wt)), data = mtcars))
r2(m3)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.815
#> Marginal R2: 0.799
r.squaredGLMM(m3)
#> Warning: the null model is correct only if all variables used by the original
#> model remain unchanged.
#> R2m R2c
#> delta 0.7960440 0.8127380
#> lognormal 0.8010033 0.8178012
#> trigamma 0.7908055 0.8073896
m4 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl), offset = log(wt), data = mtcars))
r2(m4) # different R2 values (because of missing offset term?)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.655
#> Marginal R2: 0.642
r.squaredGLMM(m4)
#> Warning: the null model is correct only if all variables used by the original
#> model remain unchanged.
#> R2m R2c
#> delta 0.6448313 0.6583541
#> lognormal 0.6607767 0.6746339
#> trigamma 0.6272789 0.6404336
The main difference I found, which might be the reason for the different R2 in both packages, is how null-models are calculated:
library(lme4)
#> Loading required package: Matrix
m3 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl) + offset(log(wt)), data = mtcars))
m4 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl), offset = log(wt), data = mtcars))
insight::null_model(m3)
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#> Approximation) [glmerMod]
#> Family: Negative Binomial(49.3607) ( log )
#> Formula: mpg ~ (1 | cyl)
#> Data: mtcars
#> AIC BIC logLik deviance df.resid
#> 189.7125 194.1097 -91.8563 183.7125 29
#> Random effects:
#> Groups Name Std.Dev.
#> cyl (Intercept) 0.223
#> Number of obs: 32, groups: cyl, 3
#> Fixed Effects:
#> (Intercept)
#> 2.993
insight::null_model(m4)
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#> Approximation) [glmerMod]
#> Family: Negative Binomial(49.3607) ( log )
#> Formula: mpg ~ (1 | cyl)
#> Data: mtcars
#> Offset: log(wt)
#> AIC BIC logLik deviance df.resid
#> 227.6811 232.0783 -110.8405 221.6811 29
#> Random effects:
#> Groups Name Std.Dev.
#> cyl (Intercept) 0.4643
#> Number of obs: 32, groups: cyl, 3
#> Fixed Effects:
#> (Intercept)
#> 1.891
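The same effect can be reproduced with base R only. The following is a minimal sketch, assuming a Poisson glm() on mtcars$carb as a stand-in for the glmer.nb() models above. It is not how insight::null_model() is implemented; it only illustrates how a formula offset can get lost when the fixed effects are dropped, while an offset passed as an argument survives.

```r
# Two equivalent ways to specify the offset in the full model
fit_formula <- glm(carb ~ disp + offset(log(wt)), family = poisson, data = mtcars)
fit_arg     <- glm(carb ~ disp, offset = log(wt), family = poisson, data = mtcars)

# The full models are identical
all.equal(coef(fit_formula), coef(fit_arg))

# Building a null model by replacing the right-hand side with "1":
# the formula offset is dropped together with the predictors ...
null_formula <- update(fit_formula, . ~ 1)
# ... whereas the offset argument is kept, because it lives in the call
null_arg <- update(fit_arg, . ~ 1)

formula(null_formula)  # no offset term left
coef(null_formula)     # intercept without offset
coef(null_arg)         # intercept with offset, so a different value

# Keeping the offset explicitly makes both null models agree again
null_fixed <- update(fit_formula, . ~ 1 + offset(log(wt)))
all.equal(coef(null_fixed), coef(null_arg))
```

If the null model silently loses the offset, any variance component derived from it changes as well, which would explain why only the + offset() formulation produced different R2 values.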
Closed in https://github.com/easystats/insight/commit/c6e75101addd220638ba73f50fe65d50dc76bb6c. With that change, both offset formulations give the same R2:
library(lme4)
#> Loading required package: Matrix
library(performance)
library(insight)
m1 <- lmer(log(mpg) ~ disp + (1|cyl) + offset(log(wt)), data = mtcars)
r2(m1)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.817
#> Marginal R2: 0.757
find_offset(m1)
#> [1] "wt"
m2 <- lmer(log(mpg) ~ disp + (1|cyl), offset = log(wt), data = mtcars)
r2(m2) # same R2 values
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.817
#> Marginal R2: 0.757
find_offset(m2)
#> [1] "wt"
m3 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl) + offset(log(wt)), data = mtcars))
r2(m3)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.655
#> Marginal R2: 0.642
find_offset(m3)
#> [1] "wt"
m4 <- suppressWarnings(glmer.nb(mpg ~ disp + (1|cyl), offset = log(wt), data = mtcars))
r2(m4) # now the same R2 values as m3
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.655
#> Marginal R2: 0.642
find_offset(m4)
#> [1] "wt"
Created on 2022-05-06 by the reprex package (v2.0.1)
I'm posting this here, but the main problem might be in the insight package; I became aware of it via the R2 output of performance. The glmer.nb() models are probably nonsense with the mtcars data. I just wanted to reproduce the different R2 output (my own data is fitted with a negative binomial distribution).

Additional question: What is, from your point of view, the more "correct" or more commonly used offset term formulation? I have seen + offset() more often, but I'm not sure whether this is really the "better" formulation.

Created on 2022-04-01 by the reprex package (v2.0.1)