EoinKenny closed this issue 6 years ago
Nevermind, I fixed it.
The issue is that Keras returns a 2D array and LIME wants a 1D one. Simply write a helper function that flattens the output and use it in place of model.predict in the LIME pipeline.
@EoinKenny Could you share the code?
Sure thing!
You might have to adjust one or two things though, I haven't tested this pipeline specifically.
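import lime
import lime.lime_tabular
import pandas as pd
from keras.models import load_model

model = load_model('keras_model.h5')

# df is a previously loaded pandas DataFrame.
# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it.
data = df.to_numpy()
qc = data[1]                    # the row to explain
qc_reshape = qc.reshape(1, -1)  # the same row as a batch of one

def predict(qc):
    # Keras returns shape (n_samples, 1); LIME regression expects (n_samples,)
    preds = model.predict(qc)
    return preds.reshape(preds.shape[0])

explainer = lime.lime_tabular.LimeTabularExplainer(data,
                                                   feature_names=df.columns,
                                                   class_names=['Price'],
                                                   verbose=True,
                                                   mode='regression')
exp = explainer.explain_instance(qc, predict, num_features=len(df.columns))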
Thank you @EoinKenny
I used the following.
model
def flatten_predict(input):
    # flatten() turns the (n_samples, 1) Keras output into a 1D array
    return model.predict(input).flatten()
I just want to update this for classification problems: there LIME requires a probability for both the positive and the negative class.
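`
import numpy as np

def flatten_predict(i):
    # predict_proba returns one column of probabilities, shape (n_samples, 1)
    predictions = model.predict_proba(i)
    x = np.zeros((predictions.shape[0], 1))
    probability = (x + 1) - predictions  # complement, 1 - p
    # Two columns per row, in the order [p, 1 - p]
    final = np.append(predictions, probability, axis=1)
    return final
`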
No problem! I think your solution looks cleaner. I just wanted to update this for classification since it requires it in sklearn format.
`def flatten_predict(i):
    # nn_bias_clf is read from module scope, so no global statement is needed
    predictions = nn_bias_clf.predict_proba(i)
    x = np.zeros((predictions.shape[0], 1))
    probability = (x + 1) - predictions  # complement, 1 - p
    final = np.append(predictions, probability, axis=1)
    return final`
Hello, I ran into the same issue Eoin ran into (using Keras regression, last layer Dense(1)) and tried using your flatten_predict() function, which solved the original error, but I now get this error:
ValueError: Found input variables with inconsistent numbers of samples: [5000, 180000]
It seems the explainer is truncating samples at 5000. Does anyone have a suggestion on how to solve this issue?
Thanks!
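A note on the numbers in that error: 5000 is the default value of the num_samples argument of explain_instance, so the explainer is probably not truncating anything. It builds 5000 perturbed rows and expects one prediction per row. A flattened length of 180000 works out to 180000 / 5000 = 36 values per row, which suggests the model returns more than one value per sample. The sketch below is a hypothetical guard (make_checked_predict and fake_model_predict are made-up names, not part of LIME) that fails early with a clearer message instead of flattening silently.

```python
import numpy as np

def make_checked_predict(predict_fn):
    """Wrap predict_fn so it returns a 1D array or raises a readable error."""
    def checked(rows):
        preds = np.asarray(predict_fn(rows))
        n = rows.shape[0]
        if preds.size != n:
            raise ValueError(
                f"predict_fn returned shape {preds.shape} for {n} rows; "
                f"that is {preds.size // n} values per row, "
                f"but regression mode needs 1"
            )
        return preds.reshape(n)
    return checked

# Hypothetical stand-in for a model that emits 36 values per row
def fake_model_predict(rows):
    return np.zeros((rows.shape[0], 36))

rows = np.zeros((5000, 10))  # 5000 perturbed rows, as LIME would generate by default
try:
    make_checked_predict(fake_model_predict)(rows)
    message = ""
except ValueError as err:
    message = str(err)
```

If the guard raises, the fix is to pick which single value per row should be explained rather than to flatten everything.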
I just checked my previous implementation for regression and I simply used the function as
def flatten_predict(qc):
    global model
    qc = model.predict(qc)
    return qc.reshape(qc.shape[0])
Can't think why it wouldn't work, sorry.
Thanks Eoin! That helped. I'm now trying to get Lime to work with my Keras RNN model (LSTM --> Dense(3))
I'm using the RecurrentTabularExplainer. When I run the explainer I get this error:
---> 11 return qc.reshape(qc.shape[0])
ValueError: cannot reshape array of size 180000 into shape (5000,)
Here is the code I'm using:
`explainer = lime.lime_tabular.RecurrentTabularExplainer(batched_train_x,
                                                        feature_names=x_features_list,
                                                        class_names=['vsd'],
                                                        categorical_features=None,
                                                        verbose=True,
                                                        mode="regression")
exp = explainer.explain_instance(batched_test_x[i:i+1, :, :], flatten_predict, num_features=len(x_features_list))`
The data is laid out as (batch, time steps, features): batched_train_x.shape = (4, 12, 228) and batched_test_x.shape = (1, 12, 228).
Do you have any suggestions on how to get this issue solved? Thanks again for your help.
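One possible reading of that reshape error: 180000 / 5000 = 36 values per perturbed sample, which would be consistent with the model returning shape (5000, 12, 3), that is, 3 outputs at each of the 12 time steps. That is a guess from the numbers, not something confirmed in this thread. Under that assumption, regression mode needs a single number per sample, so the wrapper has to choose one time step and one output. A minimal sketch with made-up names (make_single_output_predict, fake_rnn_predict) and a stub in place of the real model:

```python
import numpy as np

def make_single_output_predict(predict_fn, target=0, step=-1):
    """Reduce a (samples, time steps, targets) prediction to one value per sample."""
    def predict_one(batch):
        preds = np.asarray(predict_fn(batch))
        if preds.ndim != 3:
            raise ValueError(f"expected a 3D prediction, got shape {preds.shape}")
        # Keep one time step (default: the last) and one target column
        return preds[:, step, target]
    return predict_one

# Hypothetical stand-in for an RNN that returns 3 outputs at each of 12 time steps
def fake_rnn_predict(batch):
    n = batch.shape[0]
    return np.arange(n * 12 * 3, dtype=float).reshape(n, 12, 3)

batch = np.zeros((5000, 12, 228))
out = make_single_output_predict(fake_rnn_predict, target=2)(batch)
```

The resulting function would be passed to explain_instance in place of flatten_predict, once per target of interest.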
Same problem here. I'm trying to do a multi-output regression with XGBRegressor (y1, y2, y3, y4 ... y30) using 531 features, and to explain each Y forecast locally with LIME. I tried
explainer = lime.lime_tabular.LimeTabularExplainer(X_train, training_labels=y_features_list, feature_names=feature_list, verbose=True, mode='regression')
def predict(test):
    global multioutputregressor
    test = multioutputregressor.predict(test)
    return test.reshape(test.shape[0])
exp = explainer.explain_instance(data_row=X_test[0, :], predict_fn=predict, num_features=10)
But I get
ValueError: cannot reshape array of size 150000 into shape (5000,)
Thanks!
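Here 150000 / 5000 = 30, which matches the 30 outputs: the regressor returns shape (5000, 30), and that cannot be reshaped to (5000,). Since regression mode explains one number per row, one approach is to build a separate predict function per output column and run the explainer once for each. The sketch below assumes that approach; column_predict and fake_multioutput_predict are hypothetical names, and the stub only imitates the shape of multioutputregressor.predict.

```python
import numpy as np

def column_predict(predict_fn, column):
    """Return a predict function that keeps a single output column."""
    def predict_column(rows):
        return np.asarray(predict_fn(rows))[:, column]
    return predict_column

# Hypothetical stand-in for multioutputregressor.predict: 30 outputs per row,
# where output j always equals j so the selection is easy to check
def fake_multioutput_predict(rows):
    return np.tile(np.arange(30, dtype=float), (rows.shape[0], 1))

rows = np.zeros((5000, 531))
predict_fns = [column_predict(fake_multioutput_predict, j) for j in range(30)]
y7 = predict_fns[7](rows)  # predictions for the eighth output only
```

With the real model, each explanation would then look like explainer.explain_instance(data_row=X_test[0, :], predict_fn=predict_fns[j], num_features=10) for the output j being explained.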
Hi @EoinKenny, I am linking to LIME issue https://github.com/marcotcr/lime/issues/376. I tried your solution:
Keras_model.predict_proba(testData_for_model)
Out[73]: array([[0.6559619]], dtype=float32)
def predictKeras(testData_for_model):
    prediction_Class_1 = Keras_model.predict_proba(testData_for_model)
    x = numpy.zeros((prediction_Class_1.shape[0], 1))
    probability = (x + 1) - prediction_Class_1
    final = numpy.append(probability, prediction_Class_1, axis=1)
    return final
The output of final is
final
Out[71]: array([[0.3440381, 0.6559619]])
Then I call
keras_explainer = lime.lime_tabular.LimeTabularExplainer(input_x,
                                                         mode='classification',
                                                         feature_names=feature_names,
                                                         kernel_width=5,
                                                         random_state=42,
                                                         discretize_continuous=False)
test_for_explainer = testData_for_model.reshape(testData_for_model.shape[1],)
exp = keras_explainer.explain_instance(test_for_explainer, predictKeras, num_features = 10)
It works well.
One question: Keras returns only the probability for class 1. How should I specify the class names in the explainer? exp.class_names returns ['0']:
exp.class_names
Out[85]: ['0']
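On the class names question, LimeTabularExplainer accepts a class_names argument (it is used in the regression example earlier in this thread), and the names are matched to the columns of the prediction array by position. The sketch below only illustrates that column ordering with NumPy; the labels "negative" and "positive" and the helper two_column_proba are placeholders, not names from the poster's project.

```python
import numpy as np

# Hypothetical labels; index 0 names column 0, index 1 names column 1
CLASS_NAMES = ["negative", "positive"]

def two_column_proba(p_class_1):
    """Expand P(class 1) of shape (n, 1) into [P(class 0), P(class 1)]."""
    p = np.asarray(p_class_1, dtype=float).reshape(-1, 1)
    return np.hstack([1.0 - p, p])

probs = two_column_proba([[0.6559619]])  # the value reported above
predicted = CLASS_NAMES[int(np.argmax(probs[0]))]
```

Passing class_names=CLASS_NAMES when constructing the explainer, together with mode='classification', should make the explanation use those labels instead of the default numeric ones.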
I train a model like so
`model = Sequential()
model.add(Dense(200, input_dim=11, kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(200, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(200, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='relu'))
model.compile(loss='mean_squared_error', optimizer='adam')
# Fit the model
model.fit(X_train, y_train, epochs=200, batch_size=512)`
Then I try to make a LIME prediction...
`import lime
import lime.lime_tabular
import pandas as pd
explainer = lime.lime_tabular.LimeTabularExplainer(df.as_matrix(), feature_names=df.columns, class_names=['Price'], verbose=True, mode='regression')
exp = explainer.explain_instance(qc_reshape[0], model.predict, num_features=len(df.columns))
exp.show_in_notebook(show_table=True)
exp.as_list()`
After that I get the error...
`AssertionError                            Traceback (most recent call last)
/anaconda3/envs/LIME/lib/python3.6/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    300         try:
--> 301             assert isinstance(yss, np.ndarray) and len(yss.shape) == 1
    302         except AssertionError:
AssertionError:
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)