GoogleCloudPlatform / cloudml-samples

Cloud ML Engine repo. Please visit the new Vertex AI samples repo at https://github.com/GoogleCloudPlatform/vertex-ai-samples
https://cloud.google.com/ai-platform/docs/
Apache License 2.0
1.52k stars 857 forks source link

Failed to submit the online prediction request #468

Closed DerekGrant closed 4 years ago

DerekGrant commented 4 years ago

Describe the bug When I running the "getting-started-keras.ipynb", I met this problem that couldn't submit to prediction.

What sample is this bug related to? Getting started: training and prediction with Keras

Source code / logs When I was training, there were some warnings, I think they may lead to the problem.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/signature_def_utils_impl.py:253: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: None
INFO:tensorflow:Signatures INCLUDED in export for Train: ['train']
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
WARNING:tensorflow:Export includes no default signature!
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: None
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']
WARNING:tensorflow:Export includes no default signature!
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.momentum
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.rho
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-2.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-2.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-3.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-3.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-4.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: local-training-output/keras_export/saved_model.pb
Model exported to: local-training-output/keras_export/
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.momentum
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.rho
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-2.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-2.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-3.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-3.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-4.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'rms' for (root).layer_with_weights-4.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.

The original codes has a problem that it won't save a "saved_model.pb" to "/variable", so I mannually copy it.

But when I was submitting to predict and running following codes:

! gcloud ai-platform predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances prediction_input.json

There is an error:

  "error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.FAILED_PRECONDITION, details=\"Error while reading resource variable dense/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/dense/bias)\n\t [[{{node dense/BiasAdd/ReadVariableOp}}]]\")"

To Reproduce Steps to reproduce the behavior: Just run all cells, and manually copy "saved_model.pb" to "/variable" Continue run

Expected behavior Getting the prediction result

System Information Colab default configuration

DerekGrant commented 4 years ago

I found the reason for this problem that dues to the mistake of the model's export path. When exploring the model, it should be saved to:

Model exported to:  local-training-output/keras_export/1553709223

but in fact, may due to the tf version reason, it just exported to

Model exported to:  local-training-output/keras_export/

So when you run

SAVED_MODEL_PATH = KERAS_EXPORT_DIRS[-1]

SAVED_MODEL_PATH is expected as "local-training-output/keras_export/1553709223"; but in fact, you will get "local-training-output/keras_export/variables"

Hence, when you save your model, it will tell you there isn't a "saved_model.pb" in "variables". I just copy a pb file to "variables" to solve it.

But later, when you want to predict, it will load model from "variables", where doesn't have any parameters of the model, so you will fail.