Closed: simplezhang57 closed this issue 4 years ago
I have the same issue with the LimeTextExplainer. I'm classifying text embeddings (IMDB dataset) and have written a wrapper around the Keras LSTM's predict function (suggested here). I intend to use LIME in my master's thesis, please help!
```python
import numpy as np
from keras.preprocessing import sequence
from keras.preprocessing.text import text_to_word_sequence

INDEX_FROM = 3  # IMDB word indices are offset by 3 to make room for the special tokens

def get_embedding(word_id_dict, input_data):
    # Shift word ids and reserve the special tokens used by the IMDB dataset
    word_id_dict = {k: (v + INDEX_FROM) for k, v in word_id_dict.items()}
    word_id_dict["<PAD>"] = 0
    word_id_dict["<START>"] = 1
    word_id_dict["<UNK>"] = 2
    tokens = text_to_word_sequence(input_data)
    # Fall back to <UNK> for out-of-vocabulary tokens
    return [word_id_dict.get(token, 2) for token in tokens]

def new_predict(texts):
    embeddings = np.array([get_embedding(word_to_id, text) for text in texts])
    embeddings = sequence.pad_sequences(embeddings, maxlen=maxlen)
    return model.predict_proba(embeddings)
```
```python
explainer = LimeTextExplainer(class_names=["negative", "positive"])
text_instance = get_raw_txt(word_to_id, x_test[1])
exp = explainer.explain_instance(text_instance, new_predict)
```
```
IndexError                                Traceback (most recent call last)
----> exp = explainer.explain_instance(x_cv[1], model.predict, num_features=1, labels=(1,))
```
Your prediction function should output an array of prediction probabilities; I'm guessing it outputs a single number (the label). Use `model.predict_proba`, or set `labels=(0,)`.
@antonkanerva, I'm guessing your predict_proba function is only returning P(1), same problem.
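The advice above can be sketched as a small wrapper. This is only a sketch: `to_two_columns` and `lime_predict` are hypothetical names, and `model` stands in for your trained classifier.

```python
import numpy as np

def to_two_columns(p_pos):
    """Turn a single-column probability output (just P(positive)) into the
    (n_samples, 2) array of class probabilities that LIME expects:
    columns [P(negative), P(positive)]."""
    p_pos = np.asarray(p_pos, dtype=float).reshape(-1, 1)
    return np.hstack([1.0 - p_pos, p_pos])

def lime_predict(texts):
    # `model` is a placeholder for your trained Keras model (an assumption here)
    return to_two_columns(model.predict(texts))
```

Passing `lime_predict` instead of `model.predict` to `explain_instance` gives LIME a column for every class, so `labels=(1,)` no longer indexes past the end of the array.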
Sorry for reopening the discussion here. I have run my LIME code on another model using the following code without any hassle, yet here I am not sure why the error occurs. I have compared the two models but can't find a major difference. Any help in figuring out what this error means in my case is greatly appreciated; sadly, I cannot reverse-engineer it myself.
```
IndexError                                Traceback (most recent call last)
Input In [48], in <cell line: 3>()
      1 # get explanation for specific instance
----> 3 exp = explainer.explain_instance(X_test[2], model.predict, num_features=10)

File D:\Programme\Anaconda\envs\Masterarbeit\lib\site-packages\lime\lime_tabular.py:452, in LimeTabularExplainer.explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    448     labels = [0]
    449 for label in labels:
    450     (ret_exp.intercept[label],
    451      ret_exp.local_exp[label],
--> 452      ret_exp.score, ret_exp.local_pred) = self.base.explain_instance_with_data(
    453         scaled_data,
    454         yss,
    455         distances,
    456         label,
    457         num_features,
    458         model_regressor=model_regressor,
    459         feature_selection=self.feature_selection)
    461 if self.mode == "regression":
    462     ret_exp.intercept[1] = ret_exp.intercept[0]

File D:\Programme\Anaconda\envs\Masterarbeit\lib\site-packages\lime\lime_base.py:182, in LimeBase.explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    145 """Takes perturbed data, labels and distances, returns explanation.
    146
    147 Args:
    (...)
    178     local_pred is the prediction of the explanation model on the original instance
    179 """
    181 weights = self.kernel_fn(distances)
--> 182 labels_column = neighborhood_labels[:, label]
    183 used_features = self.feature_selection(neighborhood_data,
    184                                        labels_column,
    185                                        weights,
    186                                        num_features,
    187                                        feature_selection)
    188 if model_regressor is None:

IndexError: index 1 is out of bounds for axis 1 with size 1
```
For further context, the following error message parts are highlighted:
```
self.base.explain_instance_with_data(
    scaled_data,
    yss,
    distances,
    label,
    num_features,
    model_regressor=model_regressor,
    feature_selection=self.feature_selection)
```
and
```
--> 182 labels_column = neighborhood_labels[:, label]
```
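The highlighted line is where the failure happens: LIME indexes the prediction array by class, so a prediction function that returns only one column cannot be indexed at `label=1`. A minimal plain-numpy reproduction (not LIME internals, just the indexing):

```python
import numpy as np

one_column = np.array([[0.7], [0.4]])                  # shape (n, 1): only P(positive)
two_columns = np.hstack([1 - one_column, one_column])  # shape (n, 2)

try:
    one_column[:, 1]          # what LIME does with label=1
except IndexError as err:
    print(err)                # index 1 is out of bounds for axis 1 with size 1

print(two_columns[:, 1])      # works: [0.7 0.4]
```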
If you are using TensorFlow for binary classification, it won't work if your output layer has a single neuron with a sigmoid activation. You should use two neurons with softmax instead.
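A small numpy sketch of why the two fixes are interchangeable: a two-unit softmax over logits `(0, z)` produces exactly `[1 - p, p]` where `p = sigmoid(z)`. So you can either retrain with a two-neuron softmax output, or stack `1 - p` next to the sigmoid output before handing it to LIME. (`sigmoid` and `softmax2` below are illustrative helpers, not library functions.)

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax2(z):
    # Softmax over the two logits (0, z)
    e = np.exp([0.0, z])
    return e / e.sum()

z = 1.3
p = sigmoid(z)
# The softmax pair equals the stacked sigmoid probabilities
assert np.allclose(softmax2(z), [1.0 - p, p])
```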
```python
explainer = lime_tabular.RecurrentTabularExplainer(
    x_train,
    training_labels=y_train,
    feature_names=['close'],
    discretize_continuous=True,
    class_names=['Falling', 'Rising'],
)
exp = explainer.explain_instance(x_cv[1], model.predict, num_features=1, labels=(1,))
exp.show_in_notebook()
```
```
x_train.shape = (444, 9, 1)
y_train.shape = (444, 1)
x_cv.shape = (151, 9, 1)
y_cv.shape = (151, 1)
```

```
c:\users\psdz\appdata\local\programs\python\python36\lib\site-packages\lime\lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    181
    182 weights = self.kernel_fn(distances)
--> 183 labels_column = neighborhood_labels[:, label]
    184 used_features = self.feature_selection(neighborhood_data,
    185                                        labels_column,

IndexError: index 1 is out of bounds for axis 1 with size 1
```
Can anyone help me out? Thanks!