tmbluth closed this issue 2 years ago
Hi @tmbluth, thanks for pointing out the issue (binary:logitraw was not an option when I originally added XGBoost support). It should be an easy fix, but it might be a while before I get around to it. Until then, you have two workarounds. The easiest is to specify type = "regression" in the call to partial() (this tricks it into working):
library(xgboost)
data(spam, package = "kernlab")
X <- data.matrix(subset(spam, select = -type))  # predictors as a numeric matrix
y <- ifelse(spam$type == "spam", 1, 0)  # 0/1 response
bst <- xgboost(data = X, label = y, max.depth = 3, eta = 0.1, nrounds = 100,
               objective = "binary:logitraw")
pdp::partial(bst, pred.var = "charExclamation", train = X, plot = TRUE)  # error
pdp::partial(bst, pred.var = "charExclamation", train = X, plot = TRUE,  # success
             type = "regression")
The other option, which is always more flexible, is to provide your own prediction wrapper via the pred.fun argument. Examples and details are given in the docs and the corresponding R Journal article: https://journal.r-project.org/archive/2017/RJ-2017-016/index.html.
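For reference, here is a minimal sketch of what such a wrapper might look like for this model. The name pred_prob is hypothetical, and the sketch assumes that predict() on a binary:logitraw booster returns raw log-odds, so plogis() is applied to get probabilities. Returning a single averaged value per call should give the usual partial dependence curve; check the pred.fun documentation before relying on it.

```r
# Hypothetical wrapper (not part of pdp): pred.fun must take `object` and
# `newdata` and return a single number or a plain vector.
pred_prob <- function(object, newdata) {
  # Assumption: the model's predict() returns raw log-odds scores
  raw <- predict(object, newdata = data.matrix(newdata))
  # Logistic transform to probabilities, then average over the rows
  mean(plogis(raw))
}

# Usage, assuming `bst` and `X` from the snippet above:
# pdp::partial(bst, pred.var = "charExclamation", train = X, plot = TRUE,
#              pred.fun = pred_prob)
```

Dropping the mean() and returning the full vector should instead produce one curve per observation (ICE curves) rather than a single averaged curve.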
That worked, thank you! Looking forward to the official fix.
Related to this issue: https://github.com/bgreenwell/pdp/issues/99.
And also this issue: https://github.com/bgreenwell/pdp/issues/68.
Partial dependence plots should be able to work on continuous predictions, whether the output is unbounded or a probability between 0 and 1. When I train XGBoost models using objective = "binary:logitraw", partial() returns an error. This model does return predicted probabilities in other use cases, but something in partial() is not letting this happen. I cannot post company code here, but I can produce a shell of what part of the code looks like: