marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

How to prevent TabularExplainer from accessing impossible values? #621

Closed: elcolie closed this issue 3 years ago

elcolie commented 3 years ago

Here are the valid values for each of my features:

feature_1        [2, 3, 4, 5]
feature_2        [2, 3]
feature_3        [2, 3]
feature_4        [2, 3]
feature_5        [2, 3]
feature_6        [2, 3, 4]
feature_7        [2, 3]

But LimeTabularExplainer passes vectors with impossible values to the classifier:

import lime
import lime.lime_tabular
explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train.to_numpy(), 
    feature_names=tubing_state_features, 
    class_names=tubing_state_classes, 
    discretize_continuous=True
)
exp = explainer.explain_instance(
    X_test.iloc[1].to_numpy(), 
    model.predict_proba,  # assumption: `model` is the trained classifier; the `lime` module itself has no predict_proba
    num_features=2, 
    top_labels=1
)

One of the vectors that LIME generates is:

feature_1        [4.0]
feature_2        [0.0] This is impossible
feature_3        [3.0]
feature_4        [3.0]
feature_5        [0.0] This is impossible
feature_6        [4.0]
feature_7        [3.0]

I am not sure whether this behavior is by design.

If it is, and my classifier cannot handle any value outside the options listed above, how can I use LIME in this case?

elcolie commented 3 years ago

Following this question, I was able to solve this case.