ModelOriented / DALEXtra

Extensions for the DALEX package
https://ModelOriented.github.io/DALEXtra/

add classification type detection to `explain_xgboost` #66

Closed hbaniecki closed 2 years ago

hbaniecki commented 3 years ago

Currently, for xgboost models, the predict function returns a vector, which means that an xgboost model for binary classification is treated as a regression model.

`explain_xgboost` could detect whether the task is classification or regression.

maksymiuks commented 2 years ago

Hi @hbaniecki, if this is still valid, could you please elaborate on it, perhaps by providing an example?

Looking at the examples and the code, xgboost models seem to be handled properly: the task is determined not from the output format but from a parameter of the model:

code: https://github.com/ModelOriented/DALEXtra/blob/master/R/model_info.R#L199-L206
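For illustration only, a minimal sketch of what parameter-based detection could look like. The helper name `detect_task_type`, the prefix rules, and the fallback to regression are assumptions made for this sketch and are not taken from the linked DALEXtra code; it only assumes that the model's objective string (for example `"binary:logistic"`) is available.

```r
# Hypothetical helper (not DALEXtra's actual implementation): map an xgboost
# objective string to a task type label.
detect_task_type <- function(objective) {
  # Assumption: with no objective recorded, fall back to regression.
  if (is.null(objective)) {
    return("regression")
  }
  # Objectives such as "binary:logistic" indicate binary classification.
  if (startsWith(objective, "binary:")) {
    return("classification")
  }
  # Objectives such as "multi:softprob" indicate multiclass classification.
  if (startsWith(objective, "multi:")) {
    return("multiclass")
  }
  # Everything else (e.g. "reg:squarederror") is treated as regression here.
  "regression"
}

detect_task_type("binary:logistic")
detect_task_type("reg:squarederror")
```

The point of the sketch is that the decision depends only on the declared objective, so a vector-shaped prediction output does not by itself cause a binary classifier to be labelled as regression.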

example

library("xgboost")
library("DALEXtra")
library("mlr")
# the 8th column is the target, which has to be omitted from the X data
data <- as.matrix(createDummyFeatures(titanic_imputed[,-8]))
model <- xgboost(data, titanic_imputed$survived, nrounds = 10,
                 params = list(objective = "binary:logistic"),
                 prediction = TRUE)
# explainer with an encode function
explainer_1 <- explain_xgboost(model, data = titanic_imputed[,-8],
                               titanic_imputed$survived,
                               encode_function = function(data) {
                                   as.matrix(createDummyFeatures(data))
                               })
plot(model_parts(explainer_1))

Let me know if there is any edge case for which this does not hold, so I can account for it.

hbaniecki commented 2 years ago

Agreed, my bad for not providing the potential edge case. I think it works.