marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Lime is only using the first probability to generate results #628

Closed by osiast 3 years ago

osiast commented 3 years ago

I studied Lime and applied it to my PyTorch model. However, I noticed something strange: as seen in the image, Lime appears to use only the first probability (of the 5,000 samples generated by default) to generate the graph. I'm confused. Why does this happen? Is it correct?

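import numpy as np
from lime.lime_tabular import LimeTabularExplainer

# X_train_fold[0] is a pandas DataFrame; take the column names as feature names
feature_names = list(X_train_fold[0].columns)

# Pass the training data as a NumPy array
explainer = LimeTabularExplainer(X_train_fold[0].values,
                                 feature_names=feature_names,
                                 class_names=['is_recid'],
                                 discretize_continuous=False,
                                 mode='classification')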

# Pick a random row to explain; index the same fold the random index was drawn from
i = np.random.randint(0, X_test_fold[0].shape[0])
data_row = np.array(X_test_fold[0].values[i])

exp = explainer.explain_instance(data_row, my_predict_proba, num_features=5, top_labels=1)

exp.show_in_notebook(show_table=True, show_all=False)

image

marcotcr commented 3 years ago

This is the expected behavior. We are showing the model prediction probabilities on the data_row you want to explain. The explanation is supposed to summarize the behavior on the other 4,999 perturbed samples.
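To make the reply above concrete, here is a minimal NumPy-only sketch of the idea, not Lime's actual implementation. It assumes the first generated sample is the unperturbed row, that the probabilities shown next to the chart are the model's output on that row, and that the feature weights come from a distance-weighted linear fit over all samples. The names `toy_predict_proba` and `explain_sketch` are hypothetical, and the perturbation scheme (Gaussian noise around the row) is a simplification.

```python
import numpy as np


def toy_predict_proba(X):
    """Hypothetical stand-in for a classifier: returns (n_samples, 2) probabilities."""
    logits = 1.5 * X[:, 0] - 0.5 * X[:, 1]
    p1 = 1.0 / (1.0 + np.exp(-logits))
    return np.column_stack([1.0 - p1, p1])


def explain_sketch(data_row, predict_fn, num_samples=5000, seed=0):
    """Simplified local surrogate: assumption-laden sketch, not the library code."""
    rng = np.random.default_rng(seed)
    n_features = data_row.shape[0]

    # Perturb around the instance; row 0 is kept as the unperturbed instance.
    samples = data_row + rng.normal(size=(num_samples, n_features))
    samples[0] = data_row

    # One model call over all samples.
    probs = predict_fn(samples)

    # Exponential kernel: samples closer to the instance get larger weights.
    kernel_width = 0.75 * np.sqrt(n_features)
    dist = np.linalg.norm(samples - data_row, axis=1)
    weights = np.sqrt(np.exp(-(dist ** 2) / kernel_width ** 2))

    # Weighted least squares for class 1, fit on every sample (not only row 0).
    A = np.column_stack([np.ones(num_samples), samples])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, probs[:, 1:2] * sw, rcond=None)

    return {
        "displayed_proba": probs[0],     # what the probability panel would show
        "feature_weights": coef[1:, 0],  # what the bar chart would show
    }


row = np.array([0.2, -1.0])
result = explain_sketch(row, toy_predict_proba)
print(result["displayed_proba"])
print(result["feature_weights"])
```

In this sketch, `displayed_proba` depends only on the row being explained, while `feature_weights` changes if the perturbed samples or their weights change. That matches the distinction drawn in the reply: the probabilities describe one prediction, and the explanation summarizes the model's behavior across the neighborhood.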