Open allicamm opened 4 years ago
Thanks @allicamm, I'll try to fix this in the next release!
@allicamm Looks like the issue is in plotPartial() (which relies on lattice graphics and is the default plotting engine whenever plot = TRUE). However, partial() and autoplot() work fine:
library(ggplot2)
library(pdp)
library(xgboost)
trn <- vip::gen_friedman(seed = 101)
X <- data.matrix(subset(trn, select = -y))
y <- trn$y
# Add hyphens to feature names
colnames(X) <- paste0(colnames(X), "-", "test")
# Fit a quick model
fit <- xgboost(X, label = y, nrounds = 50)
# Works
pd <- partial(fit, pred.var = "x1-test", train = X, type = "regression")
# Works
autoplot(pd)
partial(fit, pred.var = "x1-test", train = X, type = "regression", plot = TRUE, plot.engine = "ggplot2")
# Fails
plotPartial(pd)
partial(fit, pred.var = "x1-test", train = X, type = "regression", plot = TRUE) # plot.engine = "lattice" (this is the default)
Might be tough to fix, but I'll work on it soon. Thanks again for pointing out the issue!
Hi Brandon,
I'm having an issue using pdp for a dataset with dashes in variable names. When I run this line of code: partial(model, train = training_final, pred.var = 'marital_status_Married-civ-spouse', plot = TRUE)
It looks like some code in pdp is dropping the quotes around the name, so the variable name gets cut off at the first dash:
Error in eval(expr, envir, enclos) : object 'marital_status_Married' not found
Obviously this could be fixed on my end by changing the variable names before creating my model, but I figured this might be an issue others run into as well.
Thanks!
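For anyone hitting the same error, here is a minimal base-R sketch of what is probably going on and of the renaming workaround mentioned above. It does not use pdp at all. The assumption is that somewhere the variable name ends up in an expression without backticks, where R reads the dashes as subtraction. The data frame and its column names are made up for illustration.

```r
# Base R only; the data frame below is hypothetical.
nm <- "marital_status_Married-civ-spouse"

# Unquoted, the name parses as a subtraction of three separate symbols,
# so evaluation looks up `marital_status_Married` first and fails.
vars_unquoted <- all.vars(parse(text = nm)[[1]])

# Backtick-quoted, the same text parses as a single symbol.
vars_quoted <- all.vars(parse(text = paste0("`", nm, "`"))[[1]])

# Workaround: make the column names syntactic before fitting the model.
df <- data.frame(a = 1:3, b = 4:6)
colnames(df) <- c("age", nm)
colnames(df) <- make.names(colnames(df), unique = TRUE)
# colnames(df) is now "age" and "marital_status_Married.civ.spouse"
```

If the columns are renamed this way, the model has to be fit on the renamed data, and pred.var has to use the new name as well.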