Pearson residuals look different than DHARMa residuals for a glmmTMB NB with large number of zeros

florianhartig / DHARMa

Diagnostics for HierArchical Regression Models


From a DHARMa user via email:

I wonder whether the qualitative difference between these residual plots, using 1) Pearson residuals and 2) DHARMa, is expected? (81% of the data are zeros: 128 of 158.)
    library(glmmTMB)
    library(DHARMa)

    nb2o2 <- glmmTMB(visit_apis ~
                       b_c +
                       flareasilv * hvs.sum_300m +
                       flarea_manz * flareasilv +
                       flareasilv * org_cnv_spr +
                       hvs.sum_300m * org_cnv_spr +
                       offset(Lfas) + (1 | site),
                     family = "nbinom2", data = fullz2F)

    # 1) Pearson residuals against fitted values (bbolker script, StackOv 2021)
    plot(fitted(nb2o2), residuals(nb2o2, type = "pearson"))

    # 2) DHARMa simulation-based residuals
    simulationOutput <- simulateResiduals(fittedModel = nb2o2, plot = TRUE)

The plot from DHARMa looks pretty good, while the one with Pearson residuals shows the data concentrated in the lower left. I believed this indicated excess zeros (considering the histogram below), but a zero-inflated model gives no improvement in AIC (and Pearson residuals don't work for zi models), and the DHARMa test for zero-inflation also showed no problems with zeros (plot below).

It's expected that Pearson residuals can show weird shapes, in particular when the predicted counts are small, where the Poisson / NB distributions become strongly asymmetric (see the beginning of the DHARMa vignette). This is a good example of why diagnosing misfit based on Pearson residuals is problematic. You should ignore the pattern in the Pearson residuals.
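To illustrate the asymmetry, here is a minimal base-R sketch. It does not use the poster's data: the mean `mu` and dispersion `k` are made-up values chosen only to give a similarly high share of zeros, and the residuals are computed from the true distribution rather than from a fitted model. The randomized quantile (PIT) residuals are an analytical stand-in for what DHARMa approximates by simulating from the fitted model.

```r
set.seed(1)

n  <- 5000
mu <- 0.3   # assumed low expected count; P(Y = 0) = 1 / 1.3, about 0.77
k  <- 1     # assumed NB dispersion (size); variance = mu + mu^2 / k

# Correctly specified NB data: no zero-inflation, no misfit
y <- rnbinom(n, size = k, mu = mu)

# Pearson residuals: (observed - expected) / sd
pearson <- (y - mu) / sqrt(mu + mu^2 / k)

# Randomized quantile (PIT) residuals: draw uniformly between the
# CDF just below the observed count and the CDF at the observed count
lower <- pnbinom(y - 1, size = k, mu = mu)  # equals 0 where y == 0
upper <- pnbinom(y,     size = k, mu = mu)
pit   <- runif(n, min = lower, max = upper)

# Pearson residuals pile up at one negative value (all the zeros)
# with a long right tail; PIT residuals are flat on [0, 1]
par(mfrow = c(1, 2))
hist(pearson, main = "Pearson residuals")
hist(pit,     main = "Randomized quantile residuals")
```

Even though the model is exactly right here, the Pearson histogram is strongly skewed, so a similar shape in a residuals-vs-fitted plot is not by itself evidence of misfit.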

The DHARMa residuals look OK. The increasing pattern is a bit weird, and I'm not sure why it occurs. Maybe plot the residuals against each predictor to explore.
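Plotting residuals against predictors could look roughly like the sketch below. Since the poster's data are not available, it uses the `Salamanders` dataset that ships with glmmTMB as a stand-in, and the model formula is only an example; with the model from the question you would pass `simulationOutput` and columns of `fullz2F` instead.

```r
library(glmmTMB)
library(DHARMa)

# Stand-in data (assumption): count data shipped with glmmTMB
data(Salamanders, package = "glmmTMB")

# Example NB mixed model, analogous in structure to the one in the question
fit_nb <- glmmTMB(count ~ spp + mined + (1 | site),
                  family = nbinom2, data = Salamanders)

# Simulation-based residuals, without the default summary plot
res_nb <- simulateResiduals(fit_nb, plot = FALSE)

# Residuals against a predictor that is in the model
plotResiduals(res_nb, Salamanders$mined)

# Residuals against a predictor that is NOT in the model;
# a trend here would suggest the predictor is missing
plotResiduals(res_nb, Salamanders$cover)
```

Checking predictors that were left out of the model is often as informative as checking the ones that were included.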

With an NB, zero-inflation will not necessarily show up in the DHARMa zero-inflation test (see comments in the help / vignette), so you should additionally compare against a model with a zero-inflation term using an LRT or AIC/BIC. Given that you did this and there was no improvement, I don't think you have a problem with zero-inflation. The mere fact that you have a lot of zeros doesn't mean that you need a zero-inflated model; zeros can easily also arise through low predicted incidence.
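For reference, the two checks mentioned here could be run along the following lines. This is a sketch on the `Salamanders` dataset shipped with glmmTMB, used as a stand-in because the poster's data are not available; the formula and the intercept-only zero-inflation term (`ziformula = ~1`) are illustrative choices, not a recommendation for the model in the question.

```r
library(glmmTMB)
library(DHARMa)

# Stand-in data (assumption): count data shipped with glmmTMB
data(Salamanders, package = "glmmTMB")

# NB model without zero-inflation
fit_nb <- glmmTMB(count ~ spp + mined + (1 | site),
                  family = nbinom2, data = Salamanders)

# Same model with a constant zero-inflation probability
fit_zinb <- glmmTMB(count ~ spp + mined + (1 | site),
                    ziformula = ~1,
                    family = nbinom2, data = Salamanders)

# Model comparison: information criterion and likelihood ratio test
AIC(fit_nb, fit_zinb)
anova(fit_nb, fit_zinb)

# DHARMa check: observed number of zeros vs. zeros simulated from the NB fit
res_nb  <- simulateResiduals(fit_nb, plot = FALSE)
zi_test <- testZeroInflation(res_nb)
zi_test$p.value
```

If the zero-inflated fit reports convergence warnings, that can itself be a hint that the data do not support an extra zero-inflation parameter.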

Pearson residuals look different than DHARMa residuals for a glmmTMB NB with large number of zeros #336