florianhartig / DHARMa

Diagnostics for HierArchical Regression Models
http://florianhartig.github.io/DHARMa/

Residual interpretation for Poisson regression #357

Open florianhartig opened 1 year ago

florianhartig commented 1 year ago

Via email:

[...] I have a QQ plot of Pearson residuals from a Poisson GLM that doesn't look so good (attached), and a Kolmogorov-Smirnov test indicated a significantly non-normal distribution. I'm modelling sea turtle captures by longline fishing to produce standardized catch rates and, eventually, estimates of captures for the Portuguese fleet. Apart from the non-normal distribution of the residuals, the other model diagnostics look OK, with a dispersion parameter of 1 and a slightly skewed residual distribution that appears negligible. In search of a better model, I tried quasi-Poisson and negative binomial GLMs, but with worse results. I also tried glmer, which gave me results identical to the Poisson GLM even when vessel was added as a random effect (variance = 0). I also tried glmmPQL and GAMM, with no better results. Looking for a solution (and the perfect model), I came across the DHARMa package and ran the analysis, which made me very happy (plots attached). These analyses will be published soon and I want to reference your work, but my concern is that a reviewer could ask why I didn't use the Pearson residuals. I could say that they are limited, but my main question is: if the GLM had a good fit, shouldn't the Kolmogorov-Smirnov test have pointed to a normal distribution of the Pearson residuals?

[Two attached images]

florianhartig commented 1 year ago

Hello,

Regarding your questions:

1) Pearson residuals from a Poisson model will not be normally distributed when lambda (the predicted value) is low. This is well known and is also shown with simulations at the start of the DHARMa vignette; there are also references for it. Given the look of your Pearson residuals, I suspect you have quite low count rates.

2) So you should not use Pearson residuals to diagnose a Poisson regression. Use quantile residuals (as calculated by DHARMa) instead; those look fine, based on what I see. Of course, there are additional plots you could make, in particular residuals against each predictor, and even those would not be positive proof that there is no problem, but as it stands there is no evidence of one.

3) Irrespective of the residuals, if you have a natural grouping factor (vessel), you should add it as a random intercept. If the variance of the random effect is estimated as zero, I would personally still keep it in (simply because it reflects the natural design). The message that lme4 prints about this can be ignored, and it probably makes no difference whether the term is in or out.
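
Points 1 and 2 can be illustrated with a small simulation. The sketch below is not DHARMa's implementation (DHARMa obtains its residuals by simulating from the fitted model); it is an analytic analogue that assumes the true lambda is known and identical for all observations. The value `lam = 0.2`, the sample size, and all function names are illustrative assumptions. It draws counts from a correct Poisson model and compares Pearson residuals with randomized quantile residuals.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Poisson(lam); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def pearson_residual(y, lam):
    """(observed - expected) / sqrt(variance); for Poisson the variance is lam."""
    return (y - lam) / math.sqrt(lam)


def quantile_residual(y, lam, rng):
    """Uniform draw between F(y - 1) and F(y), smoothing out the discreteness."""
    lower = poisson_cdf(y - 1, lam)
    upper = poisson_cdf(y, lam)
    return lower + rng.random() * (upper - lower)


def skewness(values):
    """Sample skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def ks_distance_to_uniform(values):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    ordered = sorted(values)
    n = len(ordered)
    return max(max((i + 1) / n - u, u - i / n) for i, u in enumerate(ordered))


rng = random.Random(1)
lam = 0.2  # assumed low expected count; the model is correct by construction
counts = [sample_poisson(lam, rng) for _ in range(5000)]
pearson = [pearson_residual(y, lam) for y in counts]
quantile = [quantile_residual(y, lam, rng) for y in counts]

# Pearson residuals: few distinct values, bounded below at -sqrt(lam), right-skewed
print("distinct Pearson values:", len(set(pearson)))
print("smallest Pearson residual:", min(pearson))
print("Pearson skewness:", skewness(pearson))  # theory: 1 / sqrt(lam)

# Quantile residuals: close to Uniform(0, 1) when the model is correct
print("KS distance to uniform:", ks_distance_to_uniform(quantile))
```

Even though the simulated model is exactly right, the Pearson residuals take only a handful of values and are strongly skewed, so a normality test on them would reject. The quantile residuals stay close to uniform, which is the behaviour a residual diagnostic should show for a well-specified model. With a real fitted model, lambda is estimated and differs per observation, so this sketch only conveys the idea.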

Best, Florian