easystats / performance

Models' quality and performance metrics (R2, ICC, LOO, AIC, BF, ...)
https://easystats.github.io/performance/
GNU General Public License v3.0

QQ plot blank in check model for glmmTMB with tweedie distribution #688

Closed see24 closed 3 months ago

see24 commented 4 months ago

When I run check_model on a GLMM fitted with glmmTMB and the tweedie family, the plots for Homogeneity of Variance and Normality of Residuals are empty. Digging into the code, I found that for glmmTMB models .diag_qq calls res_ <- stats::residuals(model, type = "deviance"), but the glmmTMB documentation says that deviance residuals are only available for some families (not tweedie) and that NA is returned otherwise. A warning explains this, but it is suppressed in check_model.

I don't know which type of residuals is appropriate for a tweedie model, but if they can be calculated, that would be great. Otherwise, it would be nicer if .diag_qq returned NULL and emitted a message saying that these plots are unavailable for this model class and family.

Note: in the example below I call .diag_qq directly, because check_model was really slow for this model, although it ran OK on my real data set.

library(performance)
#> Warning: package 'performance' was built under R version 4.3.2

data(sleepstudy, package = "lme4")
m <- glmmTMB::glmmTMB(Reaction ~ Days + (Days | Subject), data = sleepstudy,
                      family = glmmTMB::tweedie)
#> Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
#> problem; false convergence (8). See vignette('troubleshooting'),
#> help('diagnose')

# Not run because takes forever (> 5mins) with this model
# check_model(m, panel = FALSE)
# looks like it is check_predictions step that takes so long
# pred_chk <- check_predictions(m)

performance:::.diag_qq(m)
#> Warning: deviance residuals not defined for family 'tweedie': returning NA
#> [1] x y
#> <0 rows> (or 0-length row.names)

all(is.na(residuals(m, type = "deviance")))
#> Warning: deviance residuals not defined for family 'tweedie': returning NA
#> [1] TRUE

Created on 2024-02-22 with reprex v2.0.2

performance version 0.10.9, glmmTMB version 1.1.8
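
For illustration, the fallback suggested above (return NULL plus a message when deviance residuals come back as all NA) could look roughly like the sketch below. The helper name safe_deviance_residuals is hypothetical and not part of performance, and the example uses a base R glm on mtcars as a stand-in so it runs without glmmTMB. This is an assumption about one possible approach, not how performance handles it.

```r
# Hypothetical helper (not part of the performance package): return deviance
# residuals, or NULL with a message when they cannot be computed.
safe_deviance_residuals <- function(model) {
  res <- tryCatch(
    # glmmTMB warns and returns NA for unsupported families; the warning is
    # silenced here because the all-NA case is reported via message() below
    suppressWarnings(stats::residuals(model, type = "deviance")),
    error = function(e) NULL
  )
  if (is.null(res) || all(is.na(res))) {
    message("Deviance residuals are not available for this model; skipping QQ plot.")
    return(NULL)
  }
  res
}

# Stand-in model with well-defined deviance residuals (base R only)
m_ok <- stats::glm(mpg ~ wt, data = mtcars, family = Gamma(link = "log"))
head(safe_deviance_residuals(m_ok))
```

Called on the tweedie model from the reprex above, where all deviance residuals are NA, a helper like this would return NULL and print the message instead of silently producing an empty data frame.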

bbolker commented 4 months ago

For what it's worth, https://github.com/glmmTMB/glmmTMB/issues/293

strengejacke commented 4 months ago

At least, we have a more informative message now:

library(performance)

data(sleepstudy, package = "lme4")
d <- dplyr::sample_n(sleepstudy, 50)
m <- glmmTMB::glmmTMB(Reaction ~ Days,
  data = d,
  family = glmmTMB::tweedie
)
#> Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
#> problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
out <- check_model(m, iterations = 1, verbose = TRUE)
#> Not enough model terms in the conditional part of the model to check for
#>   multicollinearity.
#> QQ plot could not be created. Cannot extract residuals from objects of
#>   class `glmmTMB`. Maybe the model class or the `tweedie` family does not
#>   support the computation of (deviance) residuals?
#> `check_outliers()` does not yet support models of class `glmmTMB`.
out

Created on 2024-03-02 with reprex v2.1.0

strengejacke commented 4 months ago

> Not run because takes forever (> 5mins) with this model

Yes, simulate() is very slow for models from the tweedie family.
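
A quick way to check whether simulate() is the bottleneck for a given model is to time it directly. The sketch below is only illustrative: time_simulate is a hypothetical helper, and a base R lm on mtcars stands in for the glmmTMB model so the example runs without extra packages.

```r
# Hypothetical helper: elapsed seconds needed to simulate `nsim` response
# vectors from a fitted model via the generic stats::simulate().
time_simulate <- function(model, nsim = 1) {
  timing <- system.time(stats::simulate(model, nsim = nsim))
  unname(timing["elapsed"])
}

# Stand-in model so the example runs with base R only
m_lm <- stats::lm(mpg ~ wt, data = mtcars)
time_simulate(m_lm, nsim = 10)
```

If a single simulation already takes long for the tweedie model, passing a small iterations value to check_model(), as in the examples in this thread, presumably keeps the number of simulations and therefore the run time down.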

strengejacke commented 3 months ago

Fixed in #643

library(performance)
data(sleepstudy, package = "lme4")
set.seed(123)
d <- sleepstudy[sample.int(50), ]
m <- suppressWarnings(glmmTMB::glmmTMB(Reaction ~ Days,
  data = d,
  family = glmmTMB::tweedie
))
check_model(m, iterations = 2, verbose = TRUE)
#> Not enough model terms in the conditional part of the model to check for
#>   multicollinearity.
#> `check_outliers()` does not yet support models of class `glmmTMB`.

Created on 2024-03-16 with reprex v2.1.0