pbreheny / visreg

Visualization of regression functions
http://pbreheny.github.io/visreg/
61 stars 18 forks source link

Support for fitted models from "rms" package #9

Closed felixhaass closed 9 years ago

felixhaass commented 9 years ago

First off, let me thank you for visreg which is a great package and extremely useful!

I am currently using the lrm function from rms package (mostly because of its ability to calculate cluster-robust standard errors) for logistic regression. Unfortunately, I can't get visreg to work with the fitted model output from lrm functions.

Here is the error message I get:

Error in model.frame.default(Terms.nooff, newdata, na.action = na.action,  : 
  Variablenlängen sind unterschiedlich (gefunden für '(level)')

It's in German, but it basically says that model.frame.default() is complaining about different variable length. However, model.frame(fitted_lrm_model) returns without a problem and so does formula(fitted_lrm_model). I can't quite figure out where the problem occurs.

Here some code to reproduce the error:

data("birthwt",package="MASS")

birthwt$race <- factor(birthwt$race,labels=c("White","Black","Other"))
birthwt$smoke <- factor(birthwt$smoke,labels=c("Nonsmoker","Smoker"))

library("rms")
fit_rms <- lrm(low ~ age + race + smoke + lwt, data = birthwt, x = T, y = T)

visreg(fit_rms)

Any ideas?

pbreheny commented 9 years ago

Hi Felix,

I'm happy to hear you're finding visreg useful.

Thanks for bringing this issue to my attention. It turns out that the fundamental issue is this: visreg calls predict on the model:

p <- predict(fit)

and plots its output. visreg expects the linear predictions to be located in p$fit, which is where lm, glm, and lots of other predict functions put them. The rms package, however, puts them in p$linear.predictors. This is simple to fix; I added the following line to the visreg package:

if ("rms" %in% class(fit)) p$fit <- p$linear.predictors

So your code now works for the github version of visreg; I haven't released this as an official CRAN package update yet, though. Can you install from github and verify that this fixes your issue?

felixhaass commented 9 years ago

Thanks for the quick reply and the quick fix! This indeed solves the issue with lrm; the github version of visreg now successfully plots all my lrm models.

However, I tried it with the ols function, also from the rms package (for fitting linear models) which produced the following error message:

Error in data.frame(xy[[j]]$x$D, visregRes = xy[[j]]$y$r, visregPos = xy[[j]]$y$pos) : 
  arguments imply differing number of rows: 193, 196
In addition: Warning messages:
1: In Response(fit, x, trans, alpha, ...) :
  Residuals do not match data; have you changed the original data set?  If so, visreg is probably not displaying the residuals for the data set that was actually used to fit the model.
2: In suppressWarnings(do.call("predict", predict.args)) + rr :
  longer object length is not a multiple of shorter object length

I'm presuming this comes from missing data in my dataset which is not handled correctly. But I haven't been able to reproduce the error with the birthwt dataset, using ols instead of lrm. I'll investigate further and report back. Again, thanks for the quick fix.

Edit: I have been able to reproduce the error by creating missing data in birthwt. Here is the code:

library(rms)
# create some NAs
birthwt[2:4, "age"] <- NA

fit_rms <- ols(low ~ age + race + smoke + lwt, data = birthwt, x = T, y = T)

visreg(fit_rms)

This only throws an error for ols though, not for lrm.

felixhaass commented 9 years ago

I've also come across another error message. When I use the scale = "response" option for using visreg with the fitted lrm model, I get this:

> visreg(fit_rms, "age", scale = "response")
Error in Response(fit, x, trans, alpha, ...) : 
  could not find function "trans"
pbreheny commented 9 years ago

All of these errors are caused by the rms package doing things in a nonstandard way. For the missing data issue:

> fit_lm <- lm(low ~ age, data=birthwt)
> fit_rms <- ols(low ~ age, data=birthwt)
> head(residuals(fit_lm))
        85         89         91         92         93         94 
-0.3609978 -0.3713876 -0.3402182 -0.3298285 -0.3817774 -0.2570999 
> head(residuals(fit_rms))
        85         86         87         88         89         91 
-0.3609978         NA         NA         NA -0.3713876 -0.3402182

I.e., rms returns NAs for missing data residuals, while other R model fitting functions just exclude them. Again, this is straightforward to fix in visreg; I added a !is.na(r) clause to the output from the call to residuals and this should work now for the github version of visreg.

felixhaass commented 9 years ago

Great, thanks a lot for the quick fix! This solves the issue.

pbreheny commented 9 years ago

scale="response" issue: again, most R functions like lm/glm contain information within the fit describing how to transform back to the response scale; lrm does not, which is why there is a missing trans function. You can supply this information yourself with the trans option, although I guess I can make an exception and specifically supply this for lrm models since we know it's logistic regression....

Anyway, again, scale="response" should work now for lrm models.

Thanks for bringing all these issues to my attention, Felix -- visreg works "naturally" with R packages if they follow the general organization of how lm/glm work, but if they do things in a nonstandard way, special exceptions have to be added. It would take me an eternity to find these nonstandard things myself.