mmschlk / shapiq

Shapley Interactions for Machine Learning
https://shapiq.readthedocs.io
MIT License
222 stars · 12 forks

Xgb bugfix #267

Closed mmschlk closed 3 weeks ago

mmschlk commented 3 weeks ago

TLDR: This PR fixes #250, adds tests with xgboost models, and uncovers a bug/inconsistency in shap and xgboost.sklearn.XGBClassifier that is not present in shapiq.

Bugfix of #250.

The bug that the baseline prediction was not properly set stems from the fact that xgboost models (note: the models, not the individual boosters) carry a model.base_score and/or model.intercept_ attribute that stores the empty prediction of the xgb model (as log-odds). This base_score/intercept is now added to the values of the xgb model.
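To illustrate the fix, here is a minimal sketch (with hypothetical numbers, not taken from the actual models or the shapiq code) of why the base score matters: for a binary objective the raw prediction is the base score in log-odds plus the summed tree contributions, so dropping base_score shifts the baseline of every explanation.

```python
import math

def sigmoid(z):
    """Map a log-odds (margin) value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical numbers: base_score stored as log-odds, plus the summed
# leaf contributions of the trees for one instance.
base_score_log_odds = -0.5         # hypothetical stored base_score / intercept_
tree_contributions = [0.3, -0.1]   # hypothetical per-tree leaf values

# The raw (margin) prediction is the base score plus the tree contributions;
# omitting base_score yields a wrong empty ("baseline") prediction.
margin = base_score_log_odds + sum(tree_contributions)
probability = sigmoid(margin)
```

With these numbers, leaving out base_score would report a baseline of 0 log-odds (probability 0.5) instead of the model's actual empty prediction.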

Uncovers a bug in shap (not in shapiq)

The test test_tree_explainer.test_xgboost_shap_error uncovers some inconsistencies in shap. It is used to show that the shapiq implementation is correct while the shap implementation is doing something weird: for some instances (e.g. the one used in this test), the SHAP values differ from the shapiq values. However, when we round the thresholds of the xgboost trees in shapiq, the computed explanations match. This is strange behavior, since rounding the thresholds makes the model less true to the original model, yet only then do the explanations match.
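One plausible reason threshold precision can change explanations at all (a hedged illustration, not an analysis of the actual shap internals): split thresholds are stored at one floating-point precision but may be compared at another, and a feature value that falls between the two representations of the same threshold is routed down different branches, reaching a different leaf and hence producing different Shapley values.

```python
import numpy as np

# The same threshold literal at two precisions. Converting to float32 and
# back does not recover the original float64 value.
t64 = 0.1234567               # threshold kept at float64 precision
t32 = np.float32(t64)         # threshold stored at float32 precision

# A feature value strictly between the two representations (their midpoint).
x = (float(t32) + t64) / 2.0

goes_left_32 = x < float(t32)
goes_left_64 = x < t64
# The split decisions disagree, so the instance reaches different leaves
# depending on which precision the comparison uses.
print(goes_left_32, goes_left_64)
```

This is only a sketch of the kind of edge case rounding can mask; the PR does not pin down where exactly shap diverges.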