sdfungayi closed this issue 1 year ago
Considering you are using Data Sampler, your sample is probably skewed (not enough instances with the target value YES). You can inspect this in the Box Plot.
The ratio NO:YES is 75:25. I tried all sampling types in Data Sampler, and also selected the option "Stratify sample".
The workflow is here: https://drive.google.com/file/d/1rOagjcaRj7YAv3V6L9lQMei1VBBY1Pl5/view?usp=sharing. The dataset (WAFn-UseC-Telco-Customer-Churn.csv) is available here: https://www.kaggle.com/blastchar/telco-customer-churn
I tried again with a different dataset with a 65:35 ratio (attached: customer-churn-data_NM.xlsx) and got a similar result.
@sdfungayi thank you for reporting the issue. It is definitely a bug in the software or underlying library. The explanation for class YES should be the opposite of the explanation for class NO. It is always that way when the class has only two values.
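The symmetry described above follows from two facts: with two classes, P(NO) = 1 - P(YES), and Shapley values are linear in the model output. Here is a minimal sketch in plain Python that illustrates it. The three-feature model `p_yes`, the instance `x`, and the all-zeros background point are made up for illustration; this is not how Orange or the SHAP library computes explanations internally.

```python
from itertools import permutations
from math import exp, factorial


def p_yes(x):
    """Hypothetical model: probability of class YES from three features."""
    z = 0.8 * x[0] - 1.2 * x[1] + 0.5 * x[0] * x[2]
    return 1.0 / (1.0 + exp(-z))


def p_no(x):
    """With only two classes, the probability of NO is the complement of YES."""
    return 1.0 - p_yes(x)


def shapley_values(model, x, background):
    """Exact Shapley values relative to a single background point.

    For every ordering of the features, switch them one by one from the
    background value to the instance value and record the change in output.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(background)
        previous = model(current)
        for i in order:
            current[i] = x[i]
            updated = model(current)
            phi[i] += updated - previous
            previous = updated
    return [value / factorial(n) for value in phi]


x = [1.0, 2.0, -1.0]           # instance to explain (made up)
background = [0.0, 0.0, 0.0]   # reference point (made up)

phi_yes = shapley_values(p_yes, x, background)
phi_no = shapley_values(p_no, x, background)

for i, (a, b) in enumerate(zip(phi_yes, phi_no)):
    print(f"feature {i}: YES {a:+.4f}   NO {b:+.4f}")
```

Each feature's contribution to NO comes out as the negative of its contribution to YES, so a plot for YES that is not the mirror image of the plot for NO points to a problem in the explainer rather than in the data.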
I will dig deeper into this issue, but first I will transfer it to the orange3-explain repository, which implements these widgets.
I noticed that the issue appears with Gradient Boosting but not with other models. While we are solving the issue, you can try explaining the predictions of Logistic Regression, for example.
Thank you @PrimozGodec
This was a problem in the SHAP library and should be fixed in the newest version.
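For anyone following along, one way to check which SHAP version is installed in the Python environment that runs Orange is sketched below. It uses only the standard library; the helper name `installed_version` is made up for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("shap:", installed_version("shap"))
```

Run it with the same interpreter that Orange uses, since a separate Python installation may have a different SHAP version.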
Thanks for looking into it, but I still have the issue with shap==0.40.0. To reproduce, use heart_disease.tab with Gradient Boosting.
I see. I only checked it for the Iris dataset.
I am running a churn prediction workflow, as shown in the attached image. My class of interest is YES.
What am I doing wrong?
How do I make the graphics display properly when the class is set to YES?