When I pass data.table(x.interest) instead of x.interest in the example code of iml::Shapley, I get an error. Reproducible example:
library("iml")
library("data.table")
library("rpart")
# First we fit a machine learning model on the Boston housing data
data("Boston", package = "MASS")
rf <- rpart(medv ~ ., data = Boston)
X <- Boston[-which(names(Boston) == "medv")]
mod <- Predictor$new(rf, data = X)
# Then we explain the first instance of the dataset with the Shapley method:
x.interest <- X[1, ]
shapley <- Shapley$new(mod, x.interest = data.table(x.interest))
The error message is:
Error in `[.data.table`(x.interest, setdiff(colnames(x.interest), predictor$data$y.names)) :
When i is a data.table (or character vector), the columns to join by must be specified using 'on=' argument (see ?data.table), by keying x (i.e. sorted, and, marked as sorted, see ?setkey), or by sharing column names between x and i (i.e., a natural join). Keyed joins might have further speed benefits on very large data due to x being sorted in RAM.
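The call shown in the error, `[.data.table`(x.interest, setdiff(colnames(x.interest), ...)), suggests the cause: a single character-vector index selects columns on a data.frame but is treated as a join on a data.table. Below is a minimal sketch of that difference, independent of iml. The toy data and the names df, dt and cols are hypothetical stand-ins, with "y" playing the role of the target name.

```r
library(data.table)

# Hypothetical toy data standing in for x.interest.
df <- data.frame(a = 1, b = 2, y = 3)
dt <- as.data.table(df)
cols <- setdiff(colnames(df), "y")

# data.frame: a single character index selects columns.
df_sel <- df[cols]

# data.table: the same call passes `cols` as i (a join), which fails on an
# unkeyed table; the error message is captured here instead of stopping.
dt_err <- tryCatch(dt[cols], error = function(e) conditionMessage(e))

# Column selection by name that works for a data.table.
dt_sel <- dt[, cols, with = FALSE]
```

If this is indeed the cause, converting back before the call, e.g. x.interest = as.data.frame(data.table(x.interest)), should sidestep the error on the user side, though I have not checked how iml handles this internally.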
I am using the latest CRAN versions of all packages.
This is my sessionInfo() output