bbolker / broom.mixed

tidy methods for mixed models in R
229 stars 24 forks source link

glmmTMB rand_ CIs wrong #108

Open bwiernik opened 4 years ago

bwiernik commented 4 years ago

I am finding that the tidy.glmmTMB() function doesn't report correct CIs (either Wald or profile) for random effects parameters (including sigma), whereas the tidy.lmerMod() function does. Here is an example. It shows the same bounds for both the intercept SD and sigma, and the values are clearly wrong for both (the point estimate falls way outside the interval).

library(broom.mixed)
library(lme4)
lmm1 <- glmmTMB::glmmTMB(Reaction ~ Days + (1 | Subject), sleepstudy)
tidy(lmm1, effects = "ran_pars", conf.int = TRUE, conf.method = "profile")
confint(lmm1)
confint(lmm1, parm = "theta_")

lmm2 <- lme4::lmer(Reaction ~ Days + (1 | Subject), sleepstudy)
tidy(lmm2, effects = "ran_pars", conf.int = TRUE, conf.method = "profile")
confint(lmm2)
bbolker commented 4 years ago

!! thanks. This is not surprising (I will try to write more here) but important.

bbolker commented 4 years ago

Started looking. Basic cause I think is whether sigma is considered a ran_par or not ...

bwiernik commented 4 years ago

I mean, I definitely would regard it as one haha. It bugs me a bit that lm() simga doesn't get a confidence interval by default. What's the difference the handling between glmmTMB and lme4?

bbolker commented 4 years ago

Here are some of the issues.

All that said, I think this is all fairly easily fixable with a bit more investigation and careful coding.

mbojan commented 6 months ago

I now hit this very wall myself. As far as I can tell both CIs for the random effect and residual variance are correctly calculated but get mixed-up when assembling the result in this part https://github.com/bbolker/broom.mixed/blob/b076753d550caa6fd159ca4d68ffb0410d0bff3b/R/glmmTMB_tidiers.R#L116-L119

library(broom.mixed)
library(glmmTMB)

data("sleepstudy", package = "lme4")
fit <- glmmTMB(Reaction ~ Days + (1 | Subject), sleepstudy)

CI for the random effect wrong

tidy(fit, conf.int = TRUE)
#> # A tibble: 4 × 10
#>   effect  component group term  estimate std.error statistic    p.value conf.low
#>   <chr>   <chr>     <chr> <chr>    <dbl>     <dbl>     <dbl>      <dbl>    <dbl>
#> 1 fixed   cond      <NA>  (Int…    251.      9.51       26.4  4.01e-154   233.  
#> 2 fixed   cond      <NA>  Days      10.5     0.802      13.1  5.89e- 39     8.90
#> 3 ran_pa… cond      Subj… sd__…     36.0    NA          NA   NA            27.7 
#> 4 ran_pa… cond      Resi… sd__…     30.9    NA          NA   NA            NA   
#> # ℹ 1 more variable: conf.high <dbl>

Inspecting the transient content of res before and after the offending if:

# # Identifying relevant step
# 
# b <- broom.mixed:::tidy.glmmTMB |>
#   body()
# as.list(b[[2]][[3]][[3]]) # 6: computed CI before "filtering"

trace(
  "tidy.glmmTMB", 
  bquote({
    print(res)
    cat("------------\n")
    }),
  at = list(
    c(2,3,3,5), # Before if()
    c(2,3,3,6)  # After if()
  ),
  where = asNamespace("broom.mixed")
)
#> Tracing function "tidy.glmmTMB" in package "namespace:broom.mixed"
#> [1] "tidy.glmmTMB"

broom.mixed:::tidy.glmmTMB(fit, conf.int = TRUE) |> 
  as.data.frame()
#> Tracing safe_confint(x, method = tolower(conf.method), level = conf.level,  .... step 2,3,3,5 
#>                  2.5 %    97.5 %
#> (Intercept) 232.773317 270.03687
#> Days          8.895915  12.03866
#> ------------
#> Tracing safe_confint(x, method = tolower(conf.method), level = conf.level,  .... step 2,3,3,6 
#>                  2.5 %    97.5 %
#> (Intercept) 232.773317 270.03687
#> Days          8.895915  12.03866
#> ------------
#> Tracing safe_confint(x, parm = thpar, method = conf.method, level = conf.level,  .... step 2,3,3,5 
#>                                2.5 %   97.5 %
#> Std.Dev.(Intercept)|Subject 25.35711 51.14422
#> ------------
#> Tracing safe_confint(x, parm = thpar, method = conf.method, level = conf.level,  .... step 2,3,3,6 
#>      2.5 % 97.5 %
#> ------------
#> Tracing safe_confint(x, parm = "sigma", method = conf.method, level = conf.level,  .... step 2,3,3,5 
#>          2.5 %   97.5 %
#> sigma 27.70801 34.44953
#> ------------
#> Tracing safe_confint(x, parm = "sigma", method = conf.method, level = conf.level,  .... step 2,3,3,6 
#>          2.5 %   97.5 %
#> sigma 27.70801 34.44953
#> ------------
#>     effect component    group            term  estimate std.error statistic
#> 1    fixed      cond     <NA>     (Intercept) 251.40509 9.5061837  26.44648
#> 2    fixed      cond     <NA>            Days  10.46729 0.8017355  13.05579
#> 3 ran_pars      cond  Subject sd__(Intercept)  36.01207        NA        NA
#> 4 ran_pars      cond Residual sd__Observation  30.89544        NA        NA
#>         p.value   conf.low conf.high
#> 1 4.005314e-154 232.773317 270.03687
#> 2  5.889859e-39   8.895915  12.03866
#> 3            NA  27.708012  34.44953
#> 4            NA         NA        NA

The tracing above prints res for each call to safe_confint() twice – before and after the offending if:

  1. This is for fixed effects. It is OK.
  2. Random effect variance. It is erroneously filtered-out (0-row data frame)
  3. For residual variance, it is OK.

As (2) and (3) are rbind()-ed, the CI for residual SD ends up in the 3rd row (for the random effect) rather than in the 4th row for residual SD.

# Cleanup
untrace(
  "tidy.glmmTMB", 
  where = asNamespace("broom.mixed")
)
#> Untracing function "tidy.glmmTMB" in package "namespace:broom.mixed"

Created on 2024-05-13 with reprex v2.1.0

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.0 (2024-04-24) #> os Ubuntu 22.04.4 LTS #> system x86_64, linux-gnu #> ui X11 #> language en #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Warsaw #> date 2024-05-13 #> pandoc 3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> backports 1.4.1 2021-12-13 [1] CRAN (R 4.4.0) #> boot 1.3-30 2024-02-26 [4] CRAN (R 4.3.3) #> broom 1.0.5 2023-06-09 [1] CRAN (R 4.4.0) #> broom.mixed * 0.2.9.5 2024-04-01 [1] CRAN (R 4.4.0) #> cli 3.6.2 2023-12-11 [1] CRAN (R 4.4.0) #> codetools 0.2-20 2024-03-31 [4] CRAN (R 4.3.3) #> digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0) #> dplyr 1.1.4 2023-11-17 [1] CRAN (R 4.4.0) #> evaluate 0.23 2023-11-01 [1] CRAN (R 4.4.0) #> fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0) #> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.4.0) #> forcats 1.0.0 2023-01-29 [1] CRAN (R 4.4.0) #> fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0) #> furrr 0.3.1 2022-08-15 [1] CRAN (R 4.4.0) #> future 1.33.2 2024-03-26 [1] CRAN (R 4.4.0) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0) #> glmmTMB * 1.1.9 2024-03-20 [1] CRAN (R 4.4.0) #> globals 0.16.3 2024-03-08 [1] CRAN (R 4.4.0) #> glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0) #> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0) #> knitr 1.46 2024-04-06 [1] CRAN (R 4.4.0) #> lattice 0.22-6 2024-03-20 [4] CRAN (R 4.3.3) #> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0) #> listenv 0.9.1 2024-01-29 [1] CRAN (R 4.4.0) #> lme4 1.1-35.3 2024-04-16 [1] CRAN (R 4.4.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0) #> MASS 7.3-60.2 2024-04-26 [4] CRAN (R 4.4.0) #> Matrix 1.7-0 2024-04-26 [4] CRAN (R 4.4.0) #> mgcv 1.9-1 2023-12-21 [4] CRAN (R 4.3.2) #> minqa 1.2.6 2023-09-11 [1] CRAN (R 4.4.0) #> nlme 3.1-164 2023-11-27 [4] CRAN (R 4.3.2) #> nloptr 2.0.3 2022-05-26 [1] CRAN (R 4.4.0) #> numDeriv 2016.8-1.1 2019-06-06 [1] CRAN (R 4.4.0) #> parallelly 1.37.1 2024-02-29 [1] CRAN (R 4.4.0) #> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0) #> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.4.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0) #> Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0) #> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.4.0) #> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.4.0) #> rmarkdown 2.26 2024-03-05 [1] CRAN (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0) #> tibble 3.2.1 2023-03-20 [1] CRAN (R 4.4.0) #> tidyr 1.3.1 2024-01-24 [1] CRAN (R 4.4.0) #> tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0) #> TMB 1.9.11 2024-04-03 [1] CRAN (R 4.4.0) #> utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0) #> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0) #> withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0) #> xfun 0.43 2024-03-25 [1] CRAN (R 4.4.0) #> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.4.0) #> #> [1] /home/mbojan/R/library/4.4 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```