koalaverse / vip

Variable Importance Plots (VIPs)
https://koalaverse.github.io/vip/

Error: Permutation-based variable importance scores not yet implemented. #53

Closed hanson1005 closed 6 years ago

hanson1005 commented 6 years ago

Hi. I am using vip to measure variable importance scores for my SuperLearner model, and I get the following error message:

install.packages("vip")
library(vip)
set.seed(150) # for reproducibility
sl <- SuperLearner(Y = label, X = train, SL.library=c("SL.randomForest", "SL.glmnet", "SL.svm"), method = "method.NNLS", verbose=TRUE)
words <- colnames(train)
p1 <- vi(sl, method = "perm", obs = label, feature_names = words)
Error: Permutation-based variable importance scores not yet implemented.

What does this error message mean? Also, does vip support super learner models? I would appreciate your help. Thanks!

bgreenwell commented 6 years ago

Hi @hanson1005

The permutation method is only available (and still experimental) in the development version, not the CRAN version. In theory, vip can work with ANY model, provided you write an appropriate prediction wrapper for it. I had some trouble getting predictions from SuperLearner, but the following works for me:

# Install the latest development version
devtools::install_github("koalaverse/vip")

# Load required packages
library(SuperLearner)  # for SuperLearner algorithm
library(pdp)           # for partial dependence plots
library(vip)           # for variable importance plots

# Boston housing data
X <- subset(boston, select = -cmedv)
Y <- boston$cmedv

# Fit a SuperLearner model
set.seed(150)  # for reproducibility
sl <- SuperLearner(
  Y = Y, X = X, method = "method.NNLS", verbose = TRUE,
  SL.library = "SL.lm"
)

# Prediction wrapper: should return a vector of predictions!!
pfun <- function(object, newdata) {
  predict(object, newdata = newdata)$pred[, 1L, drop = TRUE]
}

# Partial dependence plot (ICE curves)
p1 <- partial(sl, pred.var = "lstat", pred.fun = pfun, plot = TRUE, alpha = 0.1, 
              train = X, progress = "text")

# Permutation-based variable importance
p2 <- vip(
  object = sl, 
  method = "perm", 
  obs = Y,                   # original observations
  feature_names = names(X),  # feature names
  pred_fun = pfun,           # prediction wrapper
  train = X,                 # training features
  metric = "rsquared",       # metric of interest
  progress = "text"          # print progress
)

# Display plots side by side (grid.arrange() comes from the gridExtra package)
gridExtra::grid.arrange(p1, p2, ncol = 2)
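
For anyone wondering what the permutation method does with the prediction wrapper, here is a rough base-R sketch of the general idea. It is not vip's actual implementation. The helper names `rsq` and `perm_importance`, the use of `lm()` on the built-in `mtcars` data, and the choice of squared correlation as the R-squared metric are all assumptions made for illustration. Each feature is shuffled in turn, and the drop in the metric relative to the unshuffled baseline is taken as that feature's importance.

```r
# Minimal base-R sketch of permutation importance (NOT vip's implementation).
# Uses the built-in mtcars data and lm() so it runs without extra packages.

# Prediction wrapper: must return a plain numeric vector of predictions
pfun <- function(object, newdata) {
  as.numeric(predict(object, newdata = newdata))
}

# Metric (assumption for this sketch): R-squared as the squared correlation
# between observed and predicted values
rsq <- function(actual, predicted) {
  cor(actual, predicted)^2
}

# Hypothetical helper: average drop in R-squared after shuffling each feature
perm_importance <- function(object, train, obs, pred_fun, nsim = 10) {
  baseline <- rsq(obs, pred_fun(object, train))
  scores <- vapply(names(train), function(feature) {
    drops <- replicate(nsim, {
      shuffled <- train
      # Shuffling one column breaks its link to the response
      shuffled[[feature]] <- sample(shuffled[[feature]])
      baseline - rsq(obs, pred_fun(object, shuffled))
    })
    mean(drops)
  }, numeric(1))
  sort(scores, decreasing = TRUE)
}

set.seed(150)  # for reproducibility
X <- subset(mtcars, select = c(wt, hp, qsec))
Y <- mtcars$mpg
fit <- lm(Y ~ ., data = X)

imp <- perm_importance(fit, train = X, obs = Y, pred_fun = pfun)
print(imp)
```

The wrapper is the only model-specific piece: swapping in the SuperLearner `pfun` from above (which extracts `$pred[, 1L]`) is what lets the same shuffling loop work for a model whose `predict()` method returns a list rather than a vector.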
