florianhartig / DHARMa

Diagnostics for HierArchical Regression Models
http://florianhartig.github.io/DHARMa/

Underdispersion and residuals clustered around 0 #353

Status: Closed (sssiv93 closed this issue 1 year ago)

sssiv93 commented 1 year ago

Hi Florian,

Thank you for this great package!

Are you able to provide any insights/recommendations as to what might be causing my model diagnostics below?

I am fitting a negative binomial regression on a dataset of 856 entries with 10 features. The response is the number of customers, modelled as a function of company, product type, log(spend) and market demand. The diagnostics show underdispersion and a pattern of residuals clustered around 0.00.

Screenshot 2022-11-08 at 17 23 58

For reference, I tried fitting a Poisson regression before this and saw significant overdispersion:

Screenshot 2022-11-08 at 17 25 50

florianhartig commented 1 year ago

Hello,

I'm sorry, I thought I had responded, but it seems I didn't. There appears to be a substantial number of data points at low model predictions for which the model underpredicts.

image

This looks like a strong case of zero-inflation to me, i.e. your observations are either zero or else relatively high counts.
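To make the zero-inflation idea concrete, here is a minimal, self-contained sketch in plain Python rather than R. It simulates counts with a hypothetical 30% share of structural zeros, fits an intercept-only Poisson model, and compares the observed number of zeros with the number of zeros in data simulated from that fitted model. All settings (`ZERO_PROB`, `COUNT_MEAN`, the seed) are made-up assumptions; only the sample size echoes the 856 entries mentioned above. The comparison is similar in spirit to a simulation-based zero check, but it is not DHARMa's implementation.

```python
import math
import random
from statistics import mean, variance

rng = random.Random(1)  # fixed seed so the sketch is reproducible


def rpois(lam):
    """Draw one Poisson(lam) value (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


# Hypothetical settings, NOT taken from the issue's data.
N_OBS = 856        # same size as the dataset described in the question
ZERO_PROB = 0.3    # assumed share of structural (excess) zeros
COUNT_MEAN = 8.0   # assumed mean of the count process when it is "active"
N_SIM = 200        # number of data sets simulated from the fitted model

# Zero-inflated data: either a structural zero or a Poisson count.
y = [0 if rng.random() < ZERO_PROB else rpois(COUNT_MEAN) for _ in range(N_OBS)]

# Intercept-only Poisson "fit": the maximum-likelihood rate is the sample mean.
lam_hat = mean(y)

# Variance / mean is about 1 for Poisson data; values well above 1
# indicate overdispersion relative to the Poisson model.
dispersion_ratio = variance(y) / lam_hat

# Compare observed zeros with zeros in data simulated from the fitted model.
observed_zeros = sum(v == 0 for v in y)
simulated_zeros = [
    sum(rpois(lam_hat) == 0 for _ in range(N_OBS)) for _ in range(N_SIM)
]
expected_zeros = mean(simulated_zeros)

# Share of simulations with at least as many zeros as observed.
p_more_zeros = sum(z >= observed_zeros for z in simulated_zeros) / N_SIM

print(f"variance / mean:         {dispersion_ratio:.2f}")
print(f"observed zeros:          {observed_zeros}")
print(f"mean simulated zeros:    {expected_zeros:.1f}")
print(f"P(simulated >= observed): {p_more_zeros:.3f}")
```

Far more zeros in the data than the fitted model can generate is the signature of zero-inflation, and the same excess zeros inflate the variance-to-mean ratio, which would be consistent with the overdispersion seen in the Poisson fit. In R, the corresponding check would be `testZeroInflation()` on the DHARMa simulation output, and one possible remedy (an assumption about what fits this data, not a confirmed fix) is a zero-inflated negative binomial, e.g. via the `ziformula` argument of `glmmTMB`.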