glmmTMB rand_ CIs wrong

bwiernik commented 4 years ago

I am finding that the tidy.glmmTMB() function doesn't report correct CIs (either Wald or profile) for random effects parameters (including sigma), whereas the tidy.lmerMod() function does. Here is an example. It shows the same bounds for both the intercept SD and sigma, and the values are clearly wrong for both (the point estimate falls way outside the interval).

library(broom.mixed)
library(lme4)
lmm1 <- glmmTMB::glmmTMB(Reaction ~ Days + (1 | Subject), sleepstudy)
tidy(lmm1, effects = "ran_pars", conf.int = TRUE, conf.method = "profile")
confint(lmm1)
confint(lmm1, parm = "theta_")

lmm2 <- lme4::lmer(Reaction ~ Days + (1 | Subject), sleepstudy)
tidy(lmm2, effects = "ran_pars", conf.int = TRUE, conf.method = "profile")
confint(lmm2)

bbolker commented 4 years ago

!! thanks. This is not surprising (I will try to write more here) but important.

bbolker commented 4 years ago

Started looking. Basic cause I think is whether sigma is considered a ran_par or not ...

bwiernik commented 4 years ago

I mean, I definitely would regard it as one haha. It bugs me a bit that lm() simga doesn't get a confidence interval by default. What's the difference the handling between glmmTMB and lme4?

bbolker commented 4 years ago

Here are some of the issues.

In lm() and lmer, sigma is a parameter that's profiled out - that is, it's never explicitly estimated. (Easiest to understand in the context of regular linear regression, where sigma is computed as the square root of the residual variance.) Therefore, it's rarely considered a parameter of the model. (Although this argument breaks down slightly for lmer, where the fixed-effect parameters are also profiled out of the model - their values and confidence intervals are computed after fitted.) The residual SD is probably also frequently ignored because it has a different sampling distribution from the fixed-effect parameters (sqrt-chi-squared).
in glmmTMB, sigma is estimated as part of the dispersion model, rather than part of the random effects model. That is, the other ran_pars (sd/correlation/whatever) are estimated as part of the conditional or zero-inflated model; the dispersion model is a separate sub-model.

All that said, I think this is all fairly easily fixable with a bit more investigation and careful coding.

mbojan commented 6 months ago

I now hit this very wall myself. As far as I can tell both CIs for the random effect and residual variance are correctly calculated but get mixed-up when assembling the result in this part https://github.com/bbolker/broom.mixed/blob/b076753d550caa6fd159ca4d68ffb0410d0bff3b/R/glmmTMB_tidiers.R#L116-L119

library(broom.mixed)
library(glmmTMB)

data("sleepstudy", package = "lme4")
fit <- glmmTMB(Reaction ~ Days + (1 | Subject), sleepstudy)

CI for the random effect wrong

tidy(fit, conf.int = TRUE)
#> # A tibble: 4 × 10
#>   effect  component group term  estimate std.error statistic    p.value conf.low
#>   <chr>   <chr>     <chr> <chr>    <dbl>     <dbl>     <dbl>      <dbl>    <dbl>
#> 1 fixed   cond      <NA>  (Int…    251.      9.51       26.4  4.01e-154   233.  
#> 2 fixed   cond      <NA>  Days      10.5     0.802      13.1  5.89e- 39     8.90
#> 3 ran_pa… cond      Subj… sd__…     36.0    NA          NA   NA            27.7 
#> 4 ran_pa… cond      Resi… sd__…     30.9    NA          NA   NA            NA   
#> # ℹ 1 more variable: conf.high <dbl>

Inspecting the transient content of res before and after the offending if:

# # Identifying relevant step
# 
# b <- broom.mixed:::tidy.glmmTMB |>
#   body()
# as.list(b[[2]][[3]][[3]]) # 6: computed CI before "filtering"

trace(
  "tidy.glmmTMB", 
  bquote({
    print(res)
    cat("------------\n")
    }),
  at = list(
    c(2,3,3,5), # Before if()
    c(2,3,3,6)  # After if()
  ),
  where = asNamespace("broom.mixed")
)
#> Tracing function "tidy.glmmTMB" in package "namespace:broom.mixed"
#> [1] "tidy.glmmTMB"

broom.mixed:::tidy.glmmTMB(fit, conf.int = TRUE) |> 
  as.data.frame()
#> Tracing safe_confint(x, method = tolower(conf.method), level = conf.level,  .... step 2,3,3,5 
#>                  2.5 %    97.5 %
#> (Intercept) 232.773317 270.03687
#> Days          8.895915  12.03866
#> ------------
#> Tracing safe_confint(x, method = tolower(conf.method), level = conf.level,  .... step 2,3,3,6 
#>                  2.5 %    97.5 %
#> (Intercept) 232.773317 270.03687
#> Days          8.895915  12.03866
#> ------------
#> Tracing safe_confint(x, parm = thpar, method = conf.method, level = conf.level,  .... step 2,3,3,5 
#>                                2.5 %   97.5 %
#> Std.Dev.(Intercept)|Subject 25.35711 51.14422
#> ------------
#> Tracing safe_confint(x, parm = thpar, method = conf.method, level = conf.level,  .... step 2,3,3,6 
#>      2.5 % 97.5 %
#> ------------
#> Tracing safe_confint(x, parm = "sigma", method = conf.method, level = conf.level,  .... step 2,3,3,5 
#>          2.5 %   97.5 %
#> sigma 27.70801 34.44953
#> ------------
#> Tracing safe_confint(x, parm = "sigma", method = conf.method, level = conf.level,  .... step 2,3,3,6 
#>          2.5 %   97.5 %
#> sigma 27.70801 34.44953
#> ------------
#>     effect component    group            term  estimate std.error statistic
#> 1    fixed      cond     <NA>     (Intercept) 251.40509 9.5061837  26.44648
#> 2    fixed      cond     <NA>            Days  10.46729 0.8017355  13.05579
#> 3 ran_pars      cond  Subject sd__(Intercept)  36.01207        NA        NA
#> 4 ran_pars      cond Residual sd__Observation  30.89544        NA        NA
#>         p.value   conf.low conf.high
#> 1 4.005314e-154 232.773317 270.03687
#> 2  5.889859e-39   8.895915  12.03866
#> 3            NA  27.708012  34.44953
#> 4            NA         NA        NA

The tracing above prints res for each call to safe_confint() twice – before and after the offending if:

This is for fixed effects. It is OK.
Random effect variance. It is erroneously filtered-out (0-row data frame)
For residual variance, it is OK.

As (2) and (3) are rbind()-ed, the CI for residual SD ends up in the 3rd row (for the random effect) rather than in the 4th row for residual SD.

# Cleanup
untrace(
  "tidy.glmmTMB", 
  where = asNamespace("broom.mixed")
)
#> Untracing function "tidy.glmmTMB" in package "namespace:broom.mixed"

^{Created on 2024-05-13 with reprex v2.1.0}

Session info

``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.0 (2024-04-24) #> os Ubuntu 22.04.4 LTS #> system x86_64, linux-gnu #> ui X11 #> language en #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Warsaw #> date 2024-05-13 #> pandoc 3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> backports 1.4.1 2021-12-13 [1] CRAN (R 4.4.0) #> boot 1.3-30 2024-02-26 [4] CRAN (R 4.3.3) #> broom 1.0.5 2023-06-09 [1] CRAN (R 4.4.0) #> broom.mixed * 0.2.9.5 2024-04-01 [1] CRAN (R 4.4.0) #> cli 3.6.2 2023-12-11 [1] CRAN (R 4.4.0) #> codetools 0.2-20 2024-03-31 [4] CRAN (R 4.3.3) #> digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0) #> dplyr 1.1.4 2023-11-17 [1] CRAN (R 4.4.0) #> evaluate 0.23 2023-11-01 [1] CRAN (R 4.4.0) #> fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0) #> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.4.0) #> forcats 1.0.0 2023-01-29 [1] CRAN (R 4.4.0) #> fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0) #> furrr 0.3.1 2022-08-15 [1] CRAN (R 4.4.0) #> future 1.33.2 2024-03-26 [1] CRAN (R 4.4.0) #> generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0) #> glmmTMB * 1.1.9 2024-03-20 [1] CRAN (R 4.4.0) #> globals 0.16.3 2024-03-08 [1] CRAN (R 4.4.0) #> glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0) #> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0) #> knitr 1.46 2024-04-06 [1] CRAN (R 4.4.0) #> lattice 0.22-6 2024-03-20 [4] CRAN (R 4.3.3) #> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0) #> listenv 0.9.1 2024-01-29 [1] CRAN (R 4.4.0) #> lme4 1.1-35.3 2024-04-16 [1] CRAN (R 4.4.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0) #> MASS 7.3-60.2 2024-04-26 [4] CRAN (R 4.4.0) #> Matrix 1.7-0 2024-04-26 [4] CRAN (R 4.4.0) #> mgcv 1.9-1 2023-12-21 [4] CRAN (R 4.3.2) #> minqa 1.2.6 2023-09-11 [1] CRAN (R 4.4.0) #> nlme 3.1-164 2023-11-27 [4] CRAN (R 4.3.2) #> nloptr 2.0.3 2022-05-26 [1] CRAN (R 4.4.0) #> numDeriv 2016.8-1.1 2019-06-06 [1] CRAN (R 4.4.0) #> parallelly 1.37.1 2024-02-29 [1] CRAN (R 4.4.0) #> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0) #> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.4.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0) #> Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0) #> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.4.0) #> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.4.0) #> rmarkdown 2.26 2024-03-05 [1] CRAN (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0) #> tibble 3.2.1 2023-03-20 [1] CRAN (R 4.4.0) #> tidyr 1.3.1 2024-01-24 [1] CRAN (R 4.4.0) #> tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0) #> TMB 1.9.11 2024-04-03 [1] CRAN (R 4.4.0) #> utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0) #> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0) #> withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0) #> xfun 0.43 2024-03-25 [1] CRAN (R 4.4.0) #> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.4.0) #> #> [1] /home/mbojan/R/library/4.4 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

bbolker / broom.mixed

glmmTMB rand_ CIs wrong #108