catboost / catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
https://catboost.ai
Apache License 2.0

Cannot calculate SHAP interaction values with get_feature_importance #1480

Open mosscoder opened 3 years ago

mosscoder commented 3 years ago

Problem: Cannot calculate SHAP interaction values with get_feature_importance
catboost version: R 0.23.2
Operating System: OSX Catalina
CPU: TRUE

I am unable to calculate ShapInteractionValues, though this functionality is described in the 0.23 release. I am using the R implementation, and receive the following message:

"Error in catboost.get_feature_importance(bestMod, pool = catboost.load_pool(data = jdf[1:10, : catboost/libs/fstr/calc_fstr.cpp:452: Internal CatBoost Error (contact developers for assistance): Inappropriate fstr type ShapInteractionValues"

The error occurs only when "ShapInteractionValues" is passed as the type argument. "ShapValues" works without issue.

mosscoder commented 3 years ago

library(catboost)

# Minimal dataset: one numeric feature and one categorical (factor) feature
features <- data.frame(feature1 = c(1, 2, 3), feature2 = as.factor(c('A', 'B', 'C')))
labels <- c(0, 0, 1)
train_pool <- catboost.load_pool(data = features, label = labels)

# Train a small binary classifier; NULL means no test pool
model <- catboost.train(train_pool, NULL,
                        params = list(loss_function = 'Logloss',
                                      iterations = 100, metric_period = 10))

# Works: SHAP values
catboost.get_feature_importance(model,
                                pool = train_pool,
                                type = 'ShapValues')

# Fails: "Inappropriate fstr type ShapInteractionValues"
catboost.get_feature_importance(model,
                                pool = train_pool,
                                type = 'ShapInteractionValues')
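
To confirm that the failure depends only on the type argument, a small probe along these lines may help. This is a sketch, not part of catboost: `probe_types` and the stand-in `fake_compute` are hypothetical names introduced here for illustration. The only catboost call used is `catboost.get_feature_importance`, with the same arguments as in the reproduction above.

```r
# Hypothetical helper (not part of catboost): call `compute(type)` for each
# type and record "ok" on success or the error message on failure.
probe_types <- function(compute, types) {
  results <- lapply(types, function(type) {
    tryCatch({
      compute(type)
      "ok"
    }, error = function(e) conditionMessage(e))
  })
  setNames(results, types)
}

# Stand-in that mimics the reported behaviour, so the helper can be tried
# without catboost installed. It is an assumption, not real library output.
fake_compute <- function(type) {
  if (type == "ShapInteractionValues") {
    stop("Inappropriate fstr type ", type)
  }
  invisible(NULL)
}
demo <- probe_types(fake_compute, c("ShapValues", "ShapInteractionValues"))

# Real usage, assuming `model` and `train_pool` from the reproduction exist.
if (requireNamespace("catboost", quietly = TRUE) &&
    exists("model") && exists("train_pool")) {
  real <- probe_types(
    function(type) {
      catboost::catboost.get_feature_importance(model,
                                                pool = train_pool,
                                                type = type)
    },
    c("ShapValues", "ShapInteractionValues")
  )
  print(real)
}
```

Each entry of the returned list is either "ok" or the error text for that type, which makes it easy to paste the exact per-type result into a report.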