PAIR-code / what-if-tool

Source code/webpage/demos for the What-If Tool
https://pair-code.github.io/what-if-tool
Apache License 2.0

'gcloud beta ai-platform explain' giving errors with 3d input array for LSTM model #65

Open tanmayg opened 4 years ago

tanmayg commented 4 years ago

The following post is not exactly about an error in WIT, but I'm having issues with the output from Google Cloud's explain service, which acts as input for the WIT tool. Please help if possible.

I have a Keras model with a 3D input which trains successfully.

The model summary:

Model: "model" Layer (type) Output Shape Param #

input_1 (InputLayer) [(None, 5, 1815)] 0
bidirectional (Bidirectional (None, 5, 64) 473088
bidirectional_1 (Bidirection (None, 5, 64) 24832
output (TimeDistributed) (None, 5, 25) 1625
Total params: 499,545 Trainable params: 499,545 Non-trainable params: 0
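
For reference, a minimal sketch that reproduces this summary (the LSTM widths are inferred from the parameter counts above, so treat this as a reconstruction rather than the exact training code):

    import tensorflow as tf

    # Two stacked bidirectional LSTMs (32 units per direction) over a
    # (5, 1815) sequence, followed by a time-distributed dense output.
    inputs = tf.keras.Input(shape=(5, 1815), name='input_1')
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(32, return_sequences=True))(inputs)  # (None, 5, 64)
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(32, return_sequences=True))(x)       # (None, 5, 64)
    outputs = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(25), name='output')(x)              # (None, 5, 25)
    model = tf.keras.Model(inputs, outputs)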

After that, the estimator is defined and the serving input function is created as follows:

    # Convert our Keras model to an estimator
    keras_estimator = tf.keras.estimator.model_to_estimator(
        keras_model=model, model_dir='export')

    # We need this serving input function to export our model in the next cell
    serving_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
        {'input_1': model.input})

    # Export the model to the bucket
    export_path = keras_estimator.export_saved_model(
        'gs://' + BUCKET_NAME + '/explanations',
        serving_input_receiver_fn=serving_fn).decode('utf-8')
    print(export_path)

The explanation metadata is defined and copied to the required destination as below:

    explanation_metadata = {
        "inputs": {
            "data": {
                "input_tensor_name": "input_1:0",
                "input_baselines": [np.mean(data_X, axis=0).tolist()],
                "encoding": "bag_of_features",
                "index_feature_mapping": feature_X.tolist()
            }
        },
        "outputs": {
            "duration": {
                "output_tensor_name": "output/Reshape_1:0"
            }
        },
        "framework": "tensorflow"
    }

    # Write the json to a local file
    with open('explanation_metadata.json', 'w') as output_file:
        json.dump(explanation_metadata, output_file)

    !gsutil cp explanation_metadata.json $export_path
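
As a quick sanity check (assuming data_X has shape (N, 5, 1815); this snippet is just illustrative), the baseline written above should match a single instance's shape:

    import numpy as np

    # The first (and only) baseline entry; expect shape (5, 1815),
    # i.e. one instance of input_1 without the batch dimension.
    baseline = np.asarray(
        explanation_metadata['inputs']['data']['input_baselines'][0])
    print(baseline.shape)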

After that, the model is created and the version is defined as:

    # Create the model if it doesn't exist yet (you only need to run this once)
    !gcloud ai-platform models create $MODEL --enable-logging --regions=us-central1

    # Create the version with gcloud
    explain_method = 'integrated-gradients'

    !gcloud beta ai-platform versions create $VERSION \
      --model $MODEL \
      --origin $export_path \
      --runtime-version 1.15 \
      --framework TENSORFLOW \
      --python-version 3.7 \
      --machine-type n1-standard-4 \
      --explanation-method $explain_method \
      --num-integral-steps 25
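
As a sanity check on the deployment (illustrative, not from my original notebook), the version's settings can be inspected with:

    !gcloud ai-platform versions describe $VERSION --model $MODEL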

Everything works fine up to this step, but when I create and send the explain request as:

    prediction_json = {'input_1': data_X[:5].tolist()}
    with open('diag-data.json', 'w') as outfile:
        json.dump(prediction_json, outfile)

    # Send the request to Google Cloud
    !gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json'

I get the following error:

    {
      "error": "Explainability failed with exception: <_InactiveRpcError of RPC that terminated with:
          status = StatusCode.INVALID_ARGUMENT
          details = \"transpose expects a vector of size 4. But input(1) is a vector of size 3
              [[{{node bidirectional/forward_lstm_1/transpose}}]]\"
          debug_error_string = \"{\"created\":\"@1586068796.692241013\",\"description\":\"Error received from peer ipv4:10.7.252.78:8500\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":1056,\"grpc_message\":\"transpose expects a vector of size 4. But input(1) is a vector of size 3 [[{{node bidirectional/forward_lstm_1/transpose}}]]\",\"grpc_status\":3}\"
      >"
    }

I tried altering the input shape, but nothing worked. Then, to verify the format, I tried the gcloud predict command, which initially did not work but succeeded after reshaping the input as:

    prediction_json = {'input_1': data_X[:5].reshape(-1, 1815).tolist()}
    with open('diag-data.json', 'w') as outfile:
        json.dump(prediction_json, outfile)

    # Send the predict request
    !gcloud beta ai-platform predict --model $MODEL --json-instances='diag-data.json'
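
One thing worth noting (an assumption on my part, not verified in this notebook): --json-instances expects newline-delimited JSON, one instance per line, so a batch of five (5, 1815) sequences may need to be written as five separate lines rather than one object:

    # Sketch: write one (5, 1815) instance per line, assuming data_X
    # has shape (N, 5, 1815).
    with open('diag-data.json', 'w') as outfile:
        for instance in data_X[:5]:
            json.dump({'input_1': instance.tolist()}, outfile)
            outfile.write('\n')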

I'm now at a dead end with the explain command (!gcloud beta ai-platform explain --model $MODEL --json-instances='diag-data.json') and am looking for much-needed help from the community.

Also, for ease of experimenting, the notebook can be accessed at google_explain_notebook.

jameswex commented 4 years ago

@sararob any pointers?

sararob commented 4 years ago

Thanks for the details @tanmayg. Could you tell us what your model's SigDef looks like? You can get that by running:

    !saved_model_cli show --dir gs://path/to/savedmodel --all

tanmayg commented 4 years ago

Thanks for replying, @sararob. Please find the output below:

    !saved_model_cli show --dir $export_path --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 5, 1815)
        name: input_1:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['output'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 5, 25)
        name: output/Reshape_1:0
  Method name is: tensorflow/serving/predict

sararob commented 4 years ago

It looks like you need to make sure your baseline (defined in explanation_metadata) is the same shape as the input your model is expecting, which you can see in the shape key for your inputs in the SigDef.

Another note: the explanation service currently doesn't explain sequence output; it'll just return explanations for the argmax value from your sequence output. I'll pass this feedback along to the team.
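
For illustration, a baseline matched to that (-1, 5, 1815) input could be built as below. This is a sketch assuming data_X has shape (N, 5, 1815); it also drops the bag_of_features encoding and index_feature_mapping keys from the original metadata, since they map a flat feature list and may not apply to a 3D input:

    import numpy as np

    # Average over the batch axis only, so the baseline keeps the
    # per-instance (5, 1815) shape the SignatureDef expects.
    baseline = np.mean(data_X, axis=0)

    explanation_metadata = {
        "inputs": {
            "data": {
                "input_tensor_name": "input_1:0",
                "input_baselines": [baseline.tolist()],
            }
        },
        "outputs": {
            "duration": {"output_tensor_name": "output/Reshape_1:0"}
        },
        "framework": "tensorflow",
    }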