marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

KeyError: 4 #635

Open Mjuve360 opened 3 years ago

Mjuve360 commented 3 years ago

Dear @marcotcr, I'm using a two-class dataset with 6 features. Everything works properly except this block of code:

i = np.random.randint(0, X_test.shape[0])
exp = explainer.explain_instance(X_test[i], rf.predict_proba, num_features=6, top_labels=1)

and the error is not understandable:

KeyError                                  Traceback (most recent call last)
in
      1 i = np.random.randint(0, X_test.shape[0])
----> 2 exp = explainer.explain_instance(X_test[i], rf.predict_proba, num_features=6, top_labels=1)

/Volumes/Data/opt/anaconda3/envs/TensorFlow_env/lib/python3.7/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    338             # Preventative code: if sparse, convert to csr format if not in csr format already
    339             data_row = data_row.tocsr()
--> 340         data, inverse = self.__data_inverse(data_row, num_samples)
    341         if sp.sparse.issparse(data):
    342             # Note in sparse case we don't subtract mean since data would become dense

/Volumes/Data/opt/anaconda3/envs/TensorFlow_env/lib/python3.7/site-packages/lime/lime_tabular.py in __data_inverse(self, data_row, num_samples)
    538         inverse = data.copy()
    539         for column in categorical_features:
--> 540             values = self.feature_values[column]
    541             freqs = self.feature_frequencies[column]
    542             inverse_column = self.random_state.choice(values, size=num_samples,

KeyError: 4

Would you please help me?

marcotcr commented 3 years ago

Can you share the lines where you instantiate the explainer? It looks as if X_test has a different shape than whatever you use to start the tabular explainer.

Elektriman commented 7 months ago

Other people have had the same issue (me included). It comes from an earlier line in LimeTabularExplainer.__data_inverse, where categorical_features is overridden like so: categorical_features = range(num_cols) (line 508). This happens even when you have explicitly set categorical_features to an empty list when instantiating the object.
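A toy sketch of that failure mode, which is not lime's actual code: it assumes a hypothetical feature_values dict keyed by the column indices of a 4-column training set, and a row with 5 values. Looping over range(num_cols), with num_cols taken from the row, reaches an index that was never stored. All names and numbers are illustrative.

```python
# Toy stand-in for per-column statistics stored when an explainer is built.
# The data and the helper below are illustrative assumptions, not lime internals.
training_data = [
    [0.1, 1.0, 3.0, 0.0],
    [0.4, 2.0, 1.0, 1.0],
]
feature_values = {
    col: sorted({row[col] for row in training_data})
    for col in range(len(training_data[0]))
}  # keys 0, 1, 2, 3

# One value too many, e.g. a target column that was left in.
data_row = [0.2, 1.5, 2.0, 1.0, 1.0]


def lookup_all_columns(feature_values, data_row):
    """Look up the stored values for every column index of the row."""
    num_cols = len(data_row)
    return [feature_values[column] for column in range(num_cols)]


try:
    lookup_all_columns(feature_values, data_row)
except KeyError as err:
    missing_key = err.args[0]
    print(f"KeyError: {missing_key}")  # prints "KeyError: 4"
```

The missing key is the index of the first column that exists in the row but not in the training data.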

Elektriman commented 7 months ago

This may happen if the training data you give to the LimeTabularExplainer has $n$ columns but the row you want to explain has $n+1$ columns, because you forgot to remove the target column.
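One way to catch this before calling explain_instance is a small width check. The helper check_row_width below is hypothetical (it is not part of lime), and the sample data is made up. It only compares the number of columns in the training data with the length of the row to explain.

```python
def check_row_width(training_data, data_row):
    """Raise a readable error if the row and the training data disagree on width.

    Hypothetical helper, not part of lime. Works for nested lists and 2-D NumPy
    arrays, where training_data[0] is the first row; convert a pandas DataFrame
    with .to_numpy() first.
    """
    n_train = len(training_data[0])
    n_row = len(data_row)
    if n_row != n_train:
        raise ValueError(
            f"row has {n_row} columns but training data has {n_train}; "
            "was the target column left in?"
        )
    return n_row


# Made-up data: three feature columns in the training set.
X_train = [[5.1, 3.5, 1.4], [4.9, 3.0, 1.4]]

width = check_row_width(X_train, [6.2, 2.9, 4.3])  # same width, passes

try:
    check_row_width(X_train, [6.2, 2.9, 4.3, 1])  # extra target value
except ValueError as err:
    message = str(err)
```

Running the check on X_test[i] against the array passed to LimeTabularExplainer would turn the bare KeyError into a message that names both widths.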