marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

IndexError: index 1 is out of bounds for axis 1 with size 1 #428

Closed simplezhang57 closed 4 years ago

simplezhang57 commented 4 years ago

explainer = lime_tabular.RecurrentTabularExplainer(
    x_train, training_labels=y_train, feature_names=['close'],
    discretize_continuous=True, class_names=['Falling', 'Rising'])

exp = explainer.explain_instance(x_cv[1], model.predict, num_features=1, labels=(1,))
exp.show_in_notebook()

Shapes:

x_train.shape = (444, 9, 1)
y_train.shape = (444, 1)
x_cv.shape = (151, 9, 1)
y_cv.shape = (151, 1)

Traceback:

c:\users\psdz\appdata\local\programs\python\python36\lib\site-packages\lime\lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    181
    182 weights = self.kernel_fn(distances)
--> 183 labels_column = neighborhood_labels[:, label]
    184 used_features = self.feature_selection(neighborhood_data,
    185     labels_column,

IndexError: index 1 is out of bounds for axis 1 with size 1

Can anyone help me out? Thanks!
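The traceback can be reproduced with plain NumPy: a one-unit sigmoid output layer makes `model.predict` return predictions of shape `(n, 1)`, and `labels=(1,)` asks LIME to index column 1 of that single-column array. The array below is only a stand-in for the model's output:

```python
import numpy as np

# stand-in for what a one-unit sigmoid model.predict returns on the
# perturbed neighborhood: a single probability column, shape (n_samples, 1)
neighborhood_labels = np.full((10, 1), 0.5)

neighborhood_labels[:, 0]  # works: the only column that exists
try:
    neighborhood_labels[:, 1]  # what labels=(1,) asks lime_base to take
except IndexError as err:
    message = str(err)
print(message)  # index 1 is out of bounds for axis 1 with size 1
```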

antonkanerva commented 4 years ago

I have the same issue with the LimeTextExplainer. I'm classifying text embeddings (IMDB dataset) and have written a wrapper around the Keras LSTM's predict function (suggested here). I intend to use LIME in my master's thesis, please help!

def get_embedding(word_id_dict, input_data, maxlen):
    # shift the vocabulary ids to make room for the special tokens
    word_id_dict = {k: (v + INDEX_FROM) for k, v in word_id_dict.items()}
    word_id_dict["<PAD>"] = 0
    word_id_dict["<START>"] = 1
    word_id_dict["<UNK>"] = 2

    tokens = text_to_word_sequence(input_data)
    # fall back to <UNK> for out-of-vocabulary tokens instead of raising KeyError
    # (LIME's perturbed texts may contain words outside the training vocabulary)
    return [word_id_dict.get(token, word_id_dict["<UNK>"]) for token in tokens]

def new_predict(texts):
    embeddings = np.array([get_embedding(word_to_id, text, maxlen) for text in texts])
    embeddings = sequence.pad_sequences(embeddings, maxlen=maxlen)
    return model.predict_proba(embeddings)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
text_instance = get_raw_txt(word_to_id, x_test[1])
exp = explainer.explain_instance(text_instance, new_predict)

IndexError Traceback (most recent call last)

in
      3 text_instance = get_raw_txt(word_to_id, x_test[1])
----> 4 exp = explainer.explain_instance(text_instance, new_predict)

~/.local/lib/python3.6/site-packages/lime/lime_text.py in explain_instance(self, text_instance, classifier_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    432     data, yss, distances, label, num_features,
    433     model_regressor=model_regressor,
--> 434     feature_selection=self.feature_selection)
    435 return ret_exp
    436

~/.local/lib/python3.6/site-packages/lime/lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    181
    182 weights = self.kernel_fn(distances)
--> 183 labels_column = neighborhood_labels[:, label]
    184 used_features = self.feature_selection(neighborhood_data,
    185     labels_column,

IndexError: index 1 is out of bounds for axis 1 with size 1

marcotcr commented 4 years ago

exp = explainer.explain_instance(x_cv[1], model.predict, num_features=1, labels=(1,))

Your prediction function should output an array of prediction probabilities; I'm guessing it outputs a single number (the label). Use model.predict_proba, or set labels=(0,).

@antonkanerva, I'm guessing your predict_proba function is only returning P(1); same problem.
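A minimal sketch of a wrapper that fixes this, in plain NumPy; `fake_predict` below is a stand-in for the thread's Keras model, and `to_two_columns` is an illustrative name, not a LIME helper:

```python
import numpy as np

def to_two_columns(predict_fn):
    """Wrap a predict function that returns only P(class 1) (shape (n,) or
    (n, 1), as a one-unit sigmoid head does) so it returns shape (n, 2)
    with columns [P(class 0), P(class 1)], which LIME expects."""
    def wrapped(inputs):
        p1 = np.asarray(predict_fn(inputs)).reshape(-1)
        return np.column_stack([1.0 - p1, p1])
    return wrapped

# stand-in for a sigmoid model's predict: one probability per sample
fake_predict = lambda texts: np.array([[0.8], [0.3]])
probs = to_two_columns(fake_predict)(["good movie", "bad movie"])
print(probs.shape)  # (2, 2)
```

With the wrapper applied, `explainer.explain_instance(text_instance, to_two_columns(fake_predict))` would receive one probability column per class, so `labels=(1,)` has a column to index.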

ExploitedRoutine commented 2 years ago

Sorry for reopening the discussion here. I have run my LIME code on another model without any hassle using the code below, but here I am not sure why the error occurs. I have compared the two models and can't find a major difference. Any help in figuring out what this error means in my case is greatly appreciated; I sadly can't reverse-engineer it myself.


IndexError Traceback (most recent call last)
Input In [48], in <cell line: 3>()
      1 # get explanation for specific instance
----> 3 exp = explainer.explain_instance(X_test[2], model.predict, num_features=10)

File D:\Programme\Anaconda\envs\Masterarbeit\lib\site-packages\lime\lime_tabular.py:452, in LimeTabularExplainer.explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    448 labels = [0]
    449 for label in labels:
    450     (ret_exp.intercept[label],
    451      ret_exp.local_exp[label],
--> 452      ret_exp.score, ret_exp.local_pred) = self.base.explain_instance_with_data(
    453         scaled_data,
    454         yss,
    455         distances,
    456         label,
    457         num_features,
    458         model_regressor=model_regressor,
    459         feature_selection=self.feature_selection)
    461 if self.mode == "regression":
    462     ret_exp.intercept[1] = ret_exp.intercept[0]

File D:\Programme\Anaconda\envs\Masterarbeit\lib\site-packages\lime\lime_base.py:182, in LimeBase.explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    145 """Takes perturbed data, labels and distances, returns explanation.
    146
    147 Args:
    (...)
    178     local_pred is the prediction of the explanation model on the original instance
    179 """
    181 weights = self.kernel_fn(distances)
--> 182 labels_column = neighborhood_labels[:, label]
    183 used_features = self.feature_selection(neighborhood_data,
    184     labels_column,
    185     weights,
    186     num_features,
    187     feature_selection)
    188 if model_regressor is None:

IndexError: index 1 is out of bounds for axis 1 with size 1


For further context, the following error message parts are highlighted:

self.base.explain_instance_with_data(
    scaled_data,
    yss,
    distances,
    label,
    num_features,
    model_regressor=model_regressor,
    feature_selection=self.feature_selection)

and

--> 182 labels_column = neighborhood_labels[:, label]
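As in the earlier comments, that line shows `neighborhood_labels` has only one column, which usually means the prediction function returns hard labels rather than per-class probabilities. A quick shape comparison with stand-in values (illustrative numbers, not from this model):

```python
import numpy as np

# illustrative binary-classifier outputs for 3 samples
hard_labels = np.array([1, 0, 1])                        # model.predict: shape (3,)
proba = np.array([[0.1, 0.9], [0.7, 0.3], [0.2, 0.8]])   # model.predict_proba: shape (3, 2)

# LIME's classification mode indexes a column per class, so it needs the
# (n, 2) probability array; hard labels (or one sigmoid column) leave no
# column 1 to index, hence the IndexError.
print(hard_labels.ndim, proba.shape)  # 1 (3, 2)
```

So passing `model.predict_proba` (or, for Keras, a wrapper that outputs one column per class) instead of `model.predict` to `explain_instance` should resolve it.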

Nour-Aldein2 commented 1 year ago

If you are using TensorFlow for binary classification, LIME won't work with an output layer of one neuron and a sigmoid activation, because the model then outputs a single probability column of shape (n, 1). Use two neurons with a softmax activation instead, so the model outputs one probability per class.