pbreheny / visreg

Visualization of regression functions
http://pbreheny.github.io/visreg/
61 stars 18 forks source link

definition of partial residuals for GLMs #27

Closed rdiaz02 closed 7 years ago

rdiaz02 commented 7 years ago

I am not sure I understand why visreg uses deviance residuals to construct partial residuals (the document "Visualization of Regression Models Using visreg", on p. 11, says "The partial residuals are calculated based on Equation 2.2, with r the deviance residuals (the default residuals returned by residuals.glm).")

However, it is my understanding that, in GLMs, partial residuals are generally computed using working residuals, not deviance residuals. For instance, see McCullagh and Nelder, "Generalized Linear Models" (2nd ed) p. 402, or Cook and Croos-Dabrera, 1998, "Partial Residual Plots in Generalized Linear Models", JASA, 93, pp. 731 and 732 (where they actually say how their definition of partial residual agrees with that of several other authors) or Hardin and Hilbe, 2007, "Generalized Linear Models and Extensions, 2nd ed", p. 54 (here is a google books link to that page). The partial residuals that R returns when using residuals(fit, type = "partial") use working residuals I think (the help for glm.summaries says "The partial residuals are a matrix of working residuals, with each column formed by omitting a term from the model"). Yes, Wood, 2006, in "Generalized additive models", p. 222, says of partial residuals "These are simply the Pearson residuals added to the smoothed terms evaluated at the appropriate covariate values".

Does this matter a lot? I do not know. I am not sure if partial residual plots where partial residuals are constructed from deviance residuals are expected to look linear when the relationship really is linear, etc. Plots certainly can look slightly different as the code below shows for a simple example from a Poisson GLM.

Regardless, maybe explaining why deviance residuals are used, and a reference, would help here?

n <- 1000

x1 <- runif(n, 0, 1)
x2 <- runif(n, 2, 3)
lp <- .1 +  x1 + 2 * x2
y <- rpois(n, lambda = exp(lp))
(fp <- glm(y ~ x1 + x2, family=poisson))
summary(fp)

fpr <- residuals(fp, type = "partial")

visreg(fp, "x1", partial = TRUE)
plot(fpr[, 1] ~ x1)
abline(lm(fpr[, 1] ~ x1))
lm(fpr[, 1] ~ x1)

visreg(fp, "x2", partial = TRUE)
plot(fpr[, 2] ~ x2)
abline(lm(fpr[, 2] ~ x2))
lm(fpr[, 2] ~ x2)
pbreheny commented 7 years ago

Deviance residuals are symmetric; other types of residuals are not. Slightly modifying your example,

n <- 1000
x <- runif(n, -5, 5)
y <- rpois(n, lambda = exp(x))
y[which.max(x)] <- 0
fit <- glm(y ~ x, family=poisson)
fpr <- residuals(fit, type = "partial")

visreg(fit, "x", partial = TRUE)
plot(fpr[, 1] ~ x)
abline(lm(fpr[, 1] ~ x))
lm(fpr[, 1] ~ x)

visreg

The visreg plot correctly illustrates what is happening: the data actually follows the Poisson distribution and there are no outliers except for one large negative outlier at the high end of x. The other plot is completely useless. Not only does it give the misleading impression that there are a several huge outliers at low values of x, it completely obscures the only actual outlier.

rdiaz02 commented 7 years ago

I am not saying using deviance residuals is a bad idea; my points is that your definition of partial residuals seems to me to be a departure from what is done in the literature. Using working residuals has a justification (see, e.g., the paper by Cook and Croos-Dabrera) when the purpose is to identify the need to transform covariates and partial residuals (using working residuals) have been extended (e.g., CERES plots) to account for different forms of dependencies between the covariates, etc.

I could not find any reference or argument in visreg's documentation for why partial residuals using deviance residuals is a good/better idea. Is visreg's working paper also a way to introduce partial residuals based on deviance residuals instead of working residuals?

Your code is an interesting example, but I am not sure it is entirely convincing, since the point of partial residual plots is not to mainly identify outliers, but to check the scales of the covariates and the need for possible transformations of the covariates.

As another example, see the following code, where the true dependence is on log(z) (not z). For detecting that we need to use log(z) instead of z, visreg's plot is certainly much more dramatic. But the plot using the "orthodox" partial residuals (with the overimposed black smooth line) also shows the bending of partial residuals against z. A possible issue with visreg's plot is that, for the wrong model, the plot against x shows a pattern that might be misleading.

n <- 1000
x <- runif(n, -5, 5)
z <- runif(n, 1, 50)
lz <- log(z)
y <- rpois(n, lambda = exp(x +  lz))

(f1 <- glm(y ~ x + z, family=poisson))
(f2 <- glm(y ~ x + lz, family=poisson))
pr1 <- residuals(f1, type = "partial")
pr2 <- residuals(f2, type = "partial")

## Incorrect model
par(mfrow = c(2, 2))
visreg(f1, "x", partial = TRUE)
visreg(f1, "z", partial = TRUE)
scatter.smooth(pr1[, 1] ~ x, col = "blue", cex = 0.2)
abline(lm(pr1[, 1] ~ x), lty = 2, col = "grey")
scatter.smooth(pr1[, 2] ~ z, col = "blue", cex = 0.2)
abline(lm(pr1[, 2] ~ z), lty = 2, col = "grey")

## Correct model
visreg(f2, "x", partial = TRUE)
scatter.smooth(pr2[, 1] ~ x, col = "blue", cex = 0.2)
abline(lm(pr2[, 1] ~ x), lty = 2, col = "grey")
visreg(f2, "lz", partial = TRUE)
scatter.smooth(pr2[, 2] ~ lz, col = "blue", cex = 0.2)
abline(lm(pr2[, 2] ~ lz), lty = 2, col = "grey")

## Can also directly plot partial residuals (using working residuals) as:
library(car)
crPlots(f1)
crPlots(f2)
pbreheny commented 7 years ago

So I agree with you that in f1, the wrong model, the plot against x makes it appear as though something is wrong with the model even though the functional form of x is correctly specified. Two questions/comments, though:

  1. Something is wrong with the model. It's true that you can't fix it by changing the functional form of x, but I don't see that it's necessarily a bad thing that the plot reveals a problem with the model.
  2. Furthermore, the visreg plot suggests something is wrong with the model, but it doesn't indicate that the functional form of x is incorrect, so I don't think it would lead you in the wrong direction.
  3. Even if it were desirable that the plot vs x looks perfect and only the plot vs z looks bad, I don't see how any of the other plots you offer solve this problem. They all look to have problems with x as well. Perhaps I'm missing something?

As I said in a comment on a different issue, I'm currently working on a revision of the manuscript, so I'll see if I can provide any additional comments about deviance residuals, although I'm not sure how much space the journal will let me have on this point.