florianhartig / DHARMa

Diagnostics for HierArchical Regession Models
http://florianhartig.github.io/DHARMa/
200 stars 21 forks source link

Extracted Pearson residuals wrong for mccv (type should be "scaled.pearson") #417

Closed florianhartig closed 4 weeks ago

florianhartig commented 4 weeks ago

Through #415, I noted an issue with the Pearson Chi2 test for gam models: in mgcv, what I would define as standard Pearson residuals = raw/expectedSD is called "scaled.pearson", whereas type = "pearson" is additionally multiplied with the scale parameter. This leads to a bias in the parametric option PearsonChisq for testDispersion.

f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 *
  (10 * x)^3 * (1 - x)^10
n <- 3000
x <- runif(n)
mu <- exp(f2(x)/3+.1);x <- x*10 - 4
y <- rTweedie(mu,p=1.5,phi=1.3)
b <- gam(y~s(x,k=20),family=Tweedie(p=1.3))

res = simulateResiduals(b, plot = F)
plot(res, quantreg = F)

# correct model, so mean sd on Pearson residuals should be 1
x = residuals(b, type = "response")
sd(x)
x = residuals(b, type = "scaled.pearson")
sd(x)

testDispersion(res)
# Pearson chi2 test biased
testDispersion(res, type = "PearsonChisq")