ModelOriented / DALEX

moDel Agnostic Language for Exploration and eXplanation
https://dalex.drwhy.ai
GNU General Public License v3.0

FNN : ValueError: Wrong new_observation dimension #435

Closed Shafi2016 closed 3 years ago

Shafi2016 commented 3 years ago

Hello, I want to construct a break-down explanation for a feedforward neural network (FNN). However, predict_parts raises the following error, and I do not know how to fix it: elif new_observation.shape[0] != 1: ---> 20 raise ValueError("Wrong new_observation dimension"). The shape of x_train[:, 0, :] is (9, 75).

`x_train.shape
x_test.shape  # (9, 1, 75)
import dalex as dx
exp_FNN = dx.Explainer(model,  x_train[:, 0, :], y_train.ravel(), 
                  label = "FNN")

Preparation of a new explainer is initiated

-> data : numpy.ndarray converted to pandas.DataFrame. Columns are set as string numbers.
-> data : 224 rows 75 cols
-> target variable : 224 values
-> model_class : keras.engine.sequential.Sequential (default)
-> label : FNN
-> predict function : <function yhat_tf_regression at 0x7efd69d59c20> will be used (default)
-> predict function : Accepts pandas.DataFrame and numpy.ndarray.
-> predicted values : min = 0.22, mean = 0.836, max = 0.961
-> model type : regression will be used (default)
-> residual function : difference between y and yhat (default)
-> residuals : min = -0.459, mean = -0.245, max = 0.368
-> model_info : package keras

A new explainer has been created!

ffn = exp_FNN.predict_parts(x_test[:,0,:], 
             type = 'break_down')` 


hbaniecki commented 3 years ago

Hi, predict_parts takes only one observation as an input, so in your case it should probably be shaped (1, 75)
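To make the shape requirement concrete, here is a minimal sketch using only NumPy. The zero-filled array is a stand-in for the real x_test (an assumption for illustration); only the shapes match the ones reported in the question.

```python
import numpy as np

# Stand-in for the test set from the question:
# 9 samples, 1 time step, 75 features (values are placeholders).
x_test = np.zeros((9, 1, 75))

# Dropping the middle axis keeps all nine rows, which is what
# triggered "Wrong new_observation dimension".
all_rows = x_test[:, 0, :]

# Slicing the first axis with :1 keeps it as a length-1 axis,
# so the result is a single observation shaped (1, 75).
one_row = x_test[:1, 0, :]

# Any other single row can be selected by index and reshaped to 2-D.
fourth_row = x_test[3, 0, :].reshape(1, -1)

print(all_rows.shape, one_row.shape, fourth_row.shape)
```

Either of the last two arrays has exactly one row, which is the form described above for predict_parts.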

Shafi2016 commented 3 years ago

Thanks a lot @hbaniecki, it works perfectly now. I need one more suggestion: x_test is an array, and I would like the result to show the feature names instead of their column numbers.

ffn = exp_FNN.predict_parts(x_test[:1,0,:], type = 'break_down')           
ffn.result


hbaniecki commented 3 years ago

In dalex, the cost of a unified abstraction over several ML frameworks is that it works with pandas, which comes with some performance overhead. The data is converted to a pd.DataFrame when the explainer is initialized.

If you use Keras, it works with pandas like in https://dalex.drwhy.ai/python-dalex-tensorflow.html; moreover, if you transform the variables (e.g. categorical ones) the recommended approach would be to use Pipelines https://dalex.drwhy.ai/python-dalex-titanic.html (or create a corresponding model wrapper).

Hope this helps. For more examples, look at https://dalex.drwhy.ai/python/

Shafi2016 commented 3 years ago

Thanks for sharing more detailed sources. Let me go over them.

Shafi2016 commented 3 years ago

Hello again @hbaniecki, in shap we can add feature names through the feature_names argument, as in shap.summary_plot(shap_vals[0][:, 0, :], x_test[:, 0, :], feature_names=scaled_features, plot_type="bar")

Can we do the same with dalex and predict_parts? If not, what is the way to add feature names?

ffn = exp_FNN.predict_parts(x_test[:1,0,:], type = 'break_down')

hbaniecki commented 3 years ago

@Shafi2016 dalex and predict_parts take feature names from the column names in data and new_observation, which is the simplest way possible. We also created predict_parts(..., type="shap_wrapper"), which uses the shap package and hides the feature_names parameter. I believe that an example is available at https://dalex.drwhy.ai/python-dalex-new.html.
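As a rough sketch of the column-name approach: wrap the arrays in pandas DataFrames that carry the feature names before passing them to dalex. The random arrays and the generated names in scaled_features are placeholders (assumptions for illustration), and the dalex calls are left as comments because they need the fitted model and y_train from the question.

```python
import numpy as np
import pandas as pd

# Placeholder data with the shapes reported in the thread.
rng = np.random.default_rng(0)
x_train = rng.random((224, 1, 75))
x_test = rng.random((9, 1, 75))

# Hypothetical feature names; in the thread this list is `scaled_features`.
scaled_features = [f"feature_{i}" for i in range(75)]

# Name the columns on both the explainer data and the new observation.
train_df = pd.DataFrame(x_train[:, 0, :], columns=scaled_features)
new_obs = pd.DataFrame(x_test[:1, 0, :], columns=scaled_features)

# With dalex installed and a fitted `model`, the calls from the thread
# would then read (not executed here):
# exp_FNN = dx.Explainer(model, train_df, y_train.ravel(), label="FNN")
# ffn = exp_FNN.predict_parts(new_obs, type="break_down")

print(list(new_obs.columns[:3]), new_obs.shape)
```

Using the same column names in both frames keeps data and new_observation consistent, so the variable names in ffn.result should then come from these columns rather than from string numbers.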