Open nalimilan opened 3 years ago
Thanks for highlighting this.
I can confirm that R does not have this behavior.
The error stems from this line, where we work with the pivoted Cholesky factor instead of the matrix from the original regression.
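As a rough illustration of the kind of mismatch a pivoted factorization can introduce (this is a generic sketch, not code from GLM.jl or R): if coefficients are stored in pivoted column order, they have to be paired with the matching columns of the new data. The function names, the `pivot` vector, and the numbers below are all hypothetical.

```python
def predict_naive(X_new, coef_pivoted):
    # Wrong on purpose: pairs pivoted-order coefficients with
    # original-order columns, ignoring the pivot.
    return [sum(x * b for x, b in zip(row, coef_pivoted)) for row in X_new]


def predict_with_pivot(X_new, coef_pivoted, pivot, rank):
    # Use only the first `rank` pivoted columns; the remaining columns
    # were dropped as collinear and carry no coefficient.
    kept = pivot[:rank]
    return [
        sum(row[j] * b for j, b in zip(kept, coef_pivoted[:rank]))
        for row in X_new
    ]


# Hypothetical design: columns are (intercept, x1, x2) with x2 = 2 * x1.
X_new = [[1.0, 3.0, 6.0]]
pivot = [1, 0, 2]               # assumed pivot order: x1, intercept, x2
coef_pivoted = [2.0, 1.0, 0.0]  # assumed: slope 2 on x1, intercept 1, x2 dropped
rank = 2

naive = predict_naive(X_new, coef_pivoted)
correct = predict_with_pivot(X_new, coef_pivoted, pivot, rank)
```

Here the two functions disagree for the same inputs, which is the sort of symptom one would expect if the pivoted and unpivoted representations are mixed up.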
I haven't tracked down exactly how R handles it, but there is an interesting warning here:
p <- object$rank
p1 <- seq_len(p)
piv <- if(p) qr.lm(object)$pivot[p1]
if(p < ncol(X) && !(missing(newdata) || is.null(newdata)))
    warning("prediction from a rank-deficient fit may be misleading")
I can trigger the warning:
r$> predict(m1, as.data.frame(x), interval = "confidence")
        fit         lwr      upr
1  1.214286 -0.81513839 3.243710
2  2.857143 -0.01289624 5.727182
3 -2.071429 -5.34377662 1.200919
Warning message:
In predict.lm(m1, as.data.frame(x), interval = "confidence") :
prediction from a rank-deficient fit may be misleading
Thanks. So apparently R doesn't know how to handle it either, so we should probably throw an error (I don't like warnings that tell you we return invalid results).
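A minimal sketch of what such a guard could look like, written in Python for illustration only. It mirrors the condition in the R snippet quoted above (rank smaller than the number of columns, and new data supplied) but raises instead of warning. `RankDeficientError` and `check_predict_allowed` are hypothetical names, not part of GLM.jl or R.

```python
class RankDeficientError(ValueError):
    """Hypothetical error type for predictions from a rank-deficient fit."""


def check_predict_allowed(rank, n_columns, has_newdata):
    # Same condition as the R check `p < ncol(X)` combined with
    # "newdata was supplied", but failing loudly instead of warning.
    if rank < n_columns and has_newdata:
        raise RankDeficientError(
            "prediction from a rank-deficient fit is not supported: "
            f"rank {rank} < {n_columns} columns"
        )


def try_check(rank, n_columns, has_newdata):
    # Small helper for demonstration: returns True if the guard passes.
    try:
        check_predict_allowed(rank, n_columns, has_newdata)
        return True
    except RankDeficientError:
        return False
```

Whether the guard should also fire when no new data is given is a design choice; the sketch simply follows the R condition.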
As noted at https://github.com/JuliaStats/GLM.jl/pull/410, `predict` appears to give incorrect results when some predictors are dropped because of collinearity. We should really do something about this. In the worst case we could throw an error when calling `predict` in such cases until we have a fix.
(BTW I'm not sure why the intercept is dropped in this example rather than the last column.)
Cc: @piever @pdeffebach @mkborregaard @dmbates @andreasnoack @Nosferican @DilumAluthge