easystats / modelbased

:chart_with_upwards_trend: Estimate effects, contrasts and means based on statistical models
https://easystats.github.io/modelbased/
GNU General Public License v3.0
234 stars 19 forks source link

Possible bug with estimate_grouplevel() for brms model #229

Open jmgirard opened 1 year ago

jmgirard commented 1 year ago

Both type = "random" and type = "total" seem to return the same estimate, but I think total should be the random + fixed.

library(brms)
library(modelbased)

data("klein2000", package = "multilevel")

fit <- brm(
  COHES ~ 1 + POSAFF + (1 + POSAFF | GRPID),
  data = klein2000,
  prior = set_prior("normal(0,1)", class = "b"),
  chains = 4,
  cores = 4
)
#> Compiling Stan program...
#> Start sampling

estimate_grouplevel(fit, type = "random") |> head(5)
#> Group |       Level |             Parameter |   Component | Median |        95% CI |     pd |  Rhat |     ESS
#> -------------------------------------------------------------------------------------------------------------
#> GRPID | (Intercept) |   sd_GRPID__Intercept | conditional |   1.22 | [ 0.96, 1.56] |   100% | 1.001 | 1373.00
#> GRPID |     GRPID.1 |  r_GRPID[1,Intercept] | conditional |   0.95 | [-0.14, 1.96] | 95.45% | 0.999 | 4860.00
#> GRPID |     GRPID.1 |     r_GRPID[1,POSAFF] | conditional |  -0.07 | [-0.60, 0.19] | 74.40% | 1.000 | 3342.00
#> GRPID |    GRPID.10 | r_GRPID[10,Intercept] | conditional |  -0.44 | [-1.48, 0.61] | 79.17% | 1.000 | 4104.00
#> GRPID |    GRPID.10 |    r_GRPID[10,POSAFF] | conditional |   0.01 | [-0.28, 0.38] | 58.65% | 1.000 | 4718.00

estimate_grouplevel(fit, type = "total") |> head(5)
#> Group |       Level |             Parameter |   Component | Median |     pd |  Rhat |     ESS
#> ---------------------------------------------------------------------------------------------
#> GRPID | (Intercept) |   sd_GRPID__Intercept | conditional |   1.22 |   100% | 1.001 | 1373.00
#> GRPID |     GRPID.1 |  r_GRPID[1,Intercept] | conditional |   0.95 | 95.45% | 0.999 | 4860.00
#> GRPID |     GRPID.1 |     r_GRPID[1,POSAFF] | conditional |  -0.07 | 74.40% | 1.000 | 3342.00
#> GRPID |    GRPID.10 | r_GRPID[10,Intercept] | conditional |  -0.44 | 79.17% | 1.000 | 4104.00
#> GRPID |    GRPID.10 |    r_GRPID[10,POSAFF] | conditional |   0.01 | 58.65% | 1.000 | 4718.00

Created on 2023-06-08 with reprex v2.0.2

jmgirard commented 1 year ago

Here is the lme4 version for comparison:

library(lme4)
library(modelbased)

data("klein2000", package = "multilevel")

fit2 <- lmer(
  COHES ~ 1 + POSAFF + (1 + POSAFF | GRPID),
  data = klein2000
)
#> boundary (singular) fit: see help('isSingular')

estimate_grouplevel(fit2, type = "random") |> head(5)
#> Group | Level |   Parameter | Coefficient |   SE |         95% CI
#> -----------------------------------------------------------------
#> GRPID |     1 | (Intercept) |        0.96 | 0.51 | [-0.04,  1.97]
#> GRPID |     1 |      POSAFF |       -0.12 | 0.06 | [-0.24,  0.01]
#> GRPID |     2 | (Intercept) |        0.19 | 0.53 | [-0.84,  1.22]
#> GRPID |     2 |      POSAFF |       -0.02 | 0.06 | [-0.15,  0.10]
#> GRPID |     3 | (Intercept) |       -2.61 | 0.53 | [-3.64, -1.57]

estimate_grouplevel(fit2, type = "total") |> head(5)
#> Group | Level |   Parameter | Coefficient
#> -----------------------------------------
#> GRPID |     1 | (Intercept) |        1.18
#> GRPID |     1 |      POSAFF |       -0.04
#> GRPID |     2 | (Intercept) |        0.40
#> GRPID |     2 |      POSAFF |        0.05
#> GRPID |     3 | (Intercept) |       -2.39

Created on 2023-06-08 with reprex v2.0.2

DominiqueMakowski commented 1 year ago

I think this is because brms doesn't have the option to get one or the other. I have vague memories of a discussion somewhere with Paul or something about that, and whether it would be good enough to just add the "random" values to a point-estimate of the fixed effect (median) but probably not.

That said, if it's not already there, we should add it to the documentation that this option only works for certain packages (until brms provides a way of obtaining total random effects)

mattansb commented 1 year ago

This can easily be done with coef.brmsfit().

coef.brmsfit() summarizes the draws by default:

coefs <- coef(fit)[[1]]

# reshape the data:
coefs_tidy <- cbind(expand.grid(dimnames(coefs)[c(1,3)]),
                    apply(coefs, 2, rbind))

head(coefs_tidy)
#>   Var1      Var2   Estimate Est.Error       Q2.5     Q97.5
#> 1    1 Intercept  1.1460460 0.5214133  0.1196870  2.146623
#> 2    2 Intercept  0.4452231 0.5223174 -0.5778876  1.486635
#> 3    3 Intercept -2.3487765 0.5316030 -3.4068705 -1.331445
#> 4    4 Intercept  0.5878943 0.5405031 -0.4606016  1.641785
#> 5    5 Intercept  0.7635094 0.5137324 -0.2565961  1.777431
#> 6    6 Intercept  0.3860397 0.5282088 -0.6493834  1.411480

If you want more control (here I'm using posterior::rvar() to keep the draws tidy):

coefs_rvar <- coef(fit, summary = FALSE)[[1]] |> 
  posterior::rvar()

head(coefs_rvar)
#> rvar<4000>[6,2] mean ± sd:
#>   Intercept      POSAFF        
#> 1  1.146 ± 0.52  -0.036 ± 0.21 
#> 2  0.445 ± 0.52   0.134 ± 0.20 
#> 3 -2.349 ± 0.53   0.291 ± 0.26 
#> 4  0.588 ± 0.54   0.040 ± 0.20 
#> 5  0.764 ± 0.51   0.115 ± 0.19 
#> 6  0.386 ± 0.53   0.056 ± 0.18  

# Reshape:
coefs_long <- as.data.frame(coefs_rvar) |>
  tibble::rowid_to_column("GRPID") |> 
  tidyr::pivot_longer(-GRPID)

head(coefs_long)
#> # A tibble: 6 × 3
#>   GRPID name               value
#>   <int> <chr>         <rvar[1d]>
#> 1     1 Intercept   1.146 ± 0.52
#> 2     1 POSAFF     -0.036 ± 0.21
#> 3     2 Intercept   0.445 ± 0.52
#> 4     2 POSAFF      0.134 ± 0.20
#> 5     3 Intercept  -2.349 ± 0.53
#> 6     3 POSAFF      0.291 ± 0.26

# Process the rvars how ever you like:
post_summ <- bayestestR::describe_posterior(coefs_long$value, test = NULL)

# add the grid infor back and clean up
post_summ_with_grid <- post_summ |>
  dplyr::bind_cols(coefs_long[,1:2]) |> 
  dplyr::relocate(GRPID , name, .before = 1) |> 
  dplyr::select(-Parameter)

head(post_summ_with_grid)
#> Summary of Posterior Distribution
#> 
#> GRPID |      name |    Median |         95% CI
#> ----------------------------------------------
#> 1.00  | Intercept |      1.14 | [ 0.12,  2.15]
#> 1.00  |    POSAFF | -4.74e-03 | [-0.53,  0.30]
#> 2.00  | Intercept |      0.44 | [-0.58,  1.49]
#> 2.00  |    POSAFF |      0.11 | [-0.21,  0.60]
#> 3.00  | Intercept |     -2.35 | [-3.41, -1.33]
#> 3.00  |    POSAFF |      0.25 | [-0.13,  0.86]
mattansb commented 1 year ago

Use ranef.brmsfit() instead to only get the random part.

jmgirard commented 1 year ago

@mattansb This is great for my purposes. Perhaps we could use these functions to build updated methods for modelbased::estimate_grouplevel()?