TeamHG-Memex / eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
MIT License

LightGbm explain_prediction results correct? #340

Open ErnstDinkelmann opened 4 years ago

ErnstDinkelmann commented 4 years ago

Hi

I have a binary classification problem (churn vs. no churn) with about 500 features, trained with LightGBM. I want to use explain_prediction to get insight into how decisions are made for particular features.

What I've done: I sampled 1000 cases from the training set and recorded the result of the explain_prediction function for each of them. Then, considering a single feature, I plot the 1000 samples with the feature value on the x-axis and the eli5.explain_prediction weight on the y-axis (retrieved from the explanation's targets[0].feature_weights.pos ... weight and ...neg ... weight entries). A sketch of this procedure is shown below.
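For reference, a minimal sketch of how such a plot could be produced. The names `model` (a fitted LGBMClassifier), `X_train` (the training DataFrame), and `days_since_last_login` (the feature being inspected) are assumptions for illustration, and the sketch relies on the Explanation object's `targets[0].feature_weights.pos`/`.neg` lists of FeatureWeight entries:

```python
import matplotlib.pyplot as plt
import eli5

feature = "days_since_last_login"          # hypothetical feature name
sample = X_train.sample(n=1000, random_state=0)

xs, ys = [], []
for _, row in sample.iterrows():
    # Explain a single row; eli5 dispatches to its LightGBM handler here.
    expl = eli5.explain_prediction(model, row)
    fw = expl.targets[0].feature_weights

    # pos and neg hold FeatureWeight(feature, weight, ...) entries;
    # pick out the contribution assigned to the feature of interest, if any.
    weight = 0.0
    for item in list(fw.pos) + list(fw.neg):
        if item.feature == feature:
            weight = item.weight
            break

    xs.append(row[feature])
    ys.append(weight)

# Feature value vs. per-prediction eli5 weight for the 1000 sampled rows.
plt.scatter(xs, ys, s=8)
plt.xlabel(feature)
plt.ylabel("eli5 contribution (weight)")
plt.show()
```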

Here is an example of a particular feature, with the 1000 samples plotted: [scatter plot: feature value (x-axis) vs. eli5 explanation weight (y-axis) for the 1000 samples]

The issue: the result is not intuitive. I know (from partial dependence plots and class-1 prevalence plots) that values on the left of the x-axis should correspond to a higher probability of churn (class 1) than values on the right. That is not borne out here.

It's possible that I'm interpreting the result incorrectly and that it cannot be used like this. My interpretation, broadly, is that the eli5.explain_prediction weight for a particular feature and value should show whether the probability of class 1 (churn) is increased (a positive weight) or decreased (a negative weight).
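One way to sanity-check that interpretation is to compare the summed weights against the model's own output. This sketch assumes (and it is only an assumption) that eli5's contributions for gradient-boosted trees are expressed in the model's raw-score (log-odds) space and, together with the `<BIAS>` entry, sum to the prediction; `model` and `row` are the same hypothetical names as above:

```python
import numpy as np
import eli5

expl = eli5.explain_prediction(model, row)
fw = expl.targets[0].feature_weights

# Sum every contribution, including the <BIAS> entry, which appears in
# pos or neg like any other FeatureWeight.
total = sum(item.weight for item in list(fw.pos) + list(fw.neg))

# If the log-odds assumption holds, `total` should be close to the
# raw margin implied by the predicted probability of class 1.
proba = model.predict_proba(row.to_frame().T)[0, 1]
log_odds = np.log(proba / (1 - proba))
print(total, log_odds)
```

If the two numbers agree, the weights are additive contributions to the log-odds rather than directly to the probability, which can make single-feature plots look different from partial dependence plots.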

germayneng commented 4 years ago

Not a direct solution, but it seems SHAP will help you in this area: https://github.com/slundberg/shap
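A minimal sketch of the equivalent plot with SHAP, assuming the same hypothetical `model`, `sample`, and feature name as in the earlier snippets:

```python
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(sample)

# Depending on the shap/LightGBM versions, a binary classifier may return
# a list with one array per class; take the class-1 array in that case.
if isinstance(shap_values, list):
    shap_values = shap_values[1]

# Feature value vs. SHAP contribution for each sampled row -- the same
# kind of plot described in the issue, in log-odds space.
shap.dependence_plot("days_since_last_login", shap_values, sample,
                     interaction_index=None)
```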