Closed harell closed 5 years ago
Thank you very much for pointing this out; I will look into it. It looks like a serious bug, but I need to check.
I haven't been able to reproduce this error so far. Two questions:
Did you use the `titanic` dataset from the DALEX package?
I encountered a different problem ('male' and 'female' were merged when they weren't supposed to be), but not this one.
I'll create a reproducible example next week.
Attached are the dataset and the code. Change line 7 of the code to point at the dataset.
Notice the inconsistency between the print output and plot information.
Thanks, I located the source of the problem. Interpretable features were extracted from the decision rules using regular expressions. With the levels "female" and "male", the "male" pattern was found in two rules instead of one, because "male" is also a substring of "female". Once I solve this problem, I will fix this issue.
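For anyone following along, here is a minimal sketch of the substring pitfall described above. It is written in Python rather than R, and the rule strings and the `rules_matching` helper are hypothetical illustrations, not localModel's actual rule format or code. It only shows why an unanchored "male" pattern hits two rules and how word boundaries avoid that.

```python
import re

# Hypothetical rule strings; the real localModel rule format may differ.
rules = ["gender = female", "gender = male", "age <= 16"]
levels = ["female", "male"]


def rules_matching(level, rules, whole_word):
    """Return the rules in which `level` is found (hypothetical helper)."""
    pattern = re.escape(level)
    if whole_word:
        # \b requires a word boundary on both sides, so "male" no longer
        # matches inside "female".
        pattern = r"\b" + pattern + r"\b"
    return [rule for rule in rules if re.search(pattern, rule)]


naive = {lvl: rules_matching(lvl, rules, whole_word=False) for lvl in levels}
anchored = {lvl: rules_matching(lvl, rules, whole_word=True) for lvl in levels}

print(naive["male"])     # ['gender = female', 'gender = male']
print(anchored["male"])  # ['gender = male']
```

The same idea should carry over to R regular expressions, which also support `\b`, though the fix that actually landed in the package may look different.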
I fixed this problem in the refactoring branch. When I finish all the changes, I will merge to master.
I'm using the Titanic dataset to explain a random passenger's survival rate. To do that, I fit a GLM with the following statistics, ordered by p-value:
We can see that `GENDER` is the most important variable. However, plotting the localModel::individual_surrogate_model output doesn't show that this variable is important.
Here are some comparisons with other explainers:
We see that the other two methods capture and report the `GENDER` impact.
I think the following print gives a direction for a potential bug. This is the individual_surrogate_model output print. Notice the NA near `GENDER`.
Does that make any sense?