pbreheny / visreg

Visualization of regression functions
http://pbreheny.github.io/visreg/

support for models fit without formula objects #61

Open brooksambrose opened 5 years ago

brooksambrose commented 5 years ago

How can we visualize models that were not fit with formula objects?

For example...

aq <- na.omit(airquality)
fit <- randomForest::randomForest(Ozone ~ Solar.R + Wind + Temp, data=aq)
visreg::visreg(fit, "Temp", ylab="Ozone") 

...works as expected, but...

fit <- randomForest::randomForest(x = aq[, c("Solar.R", "Wind", "Temp")], y = aq[, "Ozone"])
visreg::visreg(fit, "Temp", ylab="Ozone")

...throws the error: Error in formula.default(fit) : invalid formula

pbreheny commented 5 years ago

This is a good question. Currently, visreg only supports a formula interface. It's not clear to me how best to go about supporting a non-formula interface. Historically, at least, the issue was this: suppose we have a model of the form

fitModel(X, y)

where X is a design matrix. There could be multiple columns associated with, say, Wind (spline basis terms, for example). In that case, it wouldn't make any sense to produce a plot in which one of those columns varies while the others remain fixed (which is the type of plot visreg produces), because all of those columns are functions of the same underlying variable. Without a formula, there is no record of which columns belong together, and this is a serious hurdle for non-formula-based models in terms of working with visreg.
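To make the design-matrix problem concrete, here is a small sketch using the airquality data from the original question. The column names Wind_ns1 to Wind_ns3 and the choice of a 3-df natural spline are assumptions made only for illustration. It shows how a single variable turns into several columns of X, and how a formula-based fit still records which variable those columns came from.

```r
library(splines)

aq <- na.omit(airquality)

# A natural spline basis turns the single variable Wind into three columns
wind_basis <- ns(aq$Wind, df = 3)
colnames(wind_basis) <- paste0("Wind_ns", 1:3)  # illustrative names

# The design matrix as a matrix-interface fitter would receive it:
# nothing in X says that the first three columns all derive from Wind
X <- cbind(wind_basis, Temp = aq$Temp)
colnames(X)

# With a formula, the fitted object records the underlying variables
fit <- lm(Ozone ~ ns(Wind, df = 3) + Temp, data = aq)
all.vars(formula(fit))
```

Varying Wind_ns1 alone while holding Wind_ns2 and Wind_ns3 fixed does not correspond to any real value of Wind, whereas the formula lets a plotting function vary Wind itself and recompute all three basis columns together.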

However, I do recognize that there are machine learning methods such as random forests and gradient boosting machines with packages that do not use a formula interface. It would be nice to support them, although I think there would have to be certain caveats about doing so.
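In the meantime, when every column of x is itself a raw numeric predictor (as in the random forest example above), a conditional plot can be built by hand. The following is only a rough sketch under that assumption. The helper conditional_grid is hypothetical and not part of visreg, and the result has no partial residuals or confidence bands.

```r
aq <- na.omit(airquality)
x <- aq[, c("Solar.R", "Wind", "Temp")]
y <- aq[, "Ozone"]

set.seed(1)  # random forests are stochastic; fix the seed for reproducibility
fit <- randomForest::randomForest(x = x, y = y)

# Hypothetical helper (not a visreg function): vary one predictor over its
# observed range and hold all other predictors at their medians.
# Assumes every column of x is numeric.
conditional_grid <- function(x, var, n = 50) {
  ref <- as.data.frame(lapply(x, median))
  grid <- ref[rep(1, n), , drop = FALSE]
  grid[[var]] <- seq(min(x[[var]]), max(x[[var]]), length.out = n)
  rownames(grid) <- NULL
  grid
}

grid <- conditional_grid(x, "Temp")
pred <- predict(fit, newdata = grid)

plot(grid$Temp, pred, type = "l", xlab = "Temp", ylab = "Ozone")
```

This approach breaks down in exactly the situation described earlier: if several columns of x were derived from one variable, varying a single column while fixing the rest would not be meaningful.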

In short, I think this would be a nice extension to visreg, but I'll have to give it some more thought before I can do anything about it.