tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Error when deploying tensor2tensor model after training #1906

Closed vinh869163 closed 2 years ago

vinh869163 commented 2 years ago

Description

After training with:

t2t-trainer --data_dir=t2t_data --problem=translate_envi_iwslt32k \
--model=transformer --hparams_set=transformer_base --output_dir=t2t_output

and exporting the model with:

t2t-exporter \
  --model=transformer \
  --hparams_set=transformer_base \
  --problem=translate_envi_iwslt32k \
  --data_dir=t2t_data \
  --output_dir=t2t_output
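
The inputs the exported signature actually expects can be inspected with TensorFlow's saved_model_cli (the timestamped export path below is an assumption; t2t-exporter writes the SavedModel into a subdirectory under --output_dir):

saved_model_cli show \
  --dir t2t_output/export/<timestamp> \
  --tag_set serve \
  --signature_def serving_default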

I put "saved_model.pbtxt" and folder "variables" on Tensorflow/serving docker to run and I got an error after sending this command:

curl -d '{"signature_name": "serving_default","instances": ["Hello World"]}' -X POST http://localhost:8501/v1/models/transformer:predict

The error:

2021-12-08 10:40:39.298807: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2021-12-08 10:40:39.475061: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:206] Restoring SavedModel bundle.
2021-12-08 10:40:39.520565: I external/org_tensorflow/tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3399905000 Hz
2021-12-08 10:40:39.743046: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:190] Running initialization op on SavedModel bundle at path: /models/transformer/1
2021-12-08 10:40:39.793342: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:277] SavedModel load for tags { serve }; Status: success: OK. Took 819914 microseconds.
2021-12-08 10:40:39.805996: I tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:59] No warmup data file found at /models/transformer/1/assets.extra/tf_serving_warmup_requests
2021-12-08 10:40:39.820258: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: transformer version: 1}
2021-12-08 10:40:39.821706: I tensorflow_serving/model_servers/server_core.cc:486] Finished adding/updating models
2021-12-08 10:40:39.821821: I tensorflow_serving/model_servers/server.cc:367] Profiler service is enabled
2021-12-08 10:40:39.826404: I tensorflow_serving/model_servers/server.cc:393] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2021-12-08 10:40:39.828296: I tensorflow_serving/model_servers/server.cc:414] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 245] NET_LOG: Entering the event loop ...
2021-12-08 10:41:03.497143: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:367 : Invalid argument: Could not parse example input, value: 'Xin chao'
2021-12-08 10:50:09.698212: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at example_parsing_ops.cc:367 : Invalid argument: Could not parse example input, value: 'Hello World'
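
The "Could not parse example input" failure suggests the exported serving signature parses serialized tf.train.Example protos (with subword token ids in an "inputs" feature) rather than raw strings, so plain text in "instances" cannot be parsed. Below is a minimal sketch of building such a request against the REST API, assuming the vocab file name that datagen for translate_envi_iwslt32k places in t2t_data; tensor2tensor also ships a t2t-query-server script for querying a served model over gRPC.

# Sketch: encode the sentence with the problem's subword vocab, wrap the ids
# in a tf.train.Example, and send it base64-encoded (TF Serving's REST API
# accepts binary strings as {"b64": ...}).
import base64
import json

import requests
import tensorflow as tf
from tensor2tensor.data_generators import text_encoder

# Assumed vocab file name; use whatever t2t-datagen actually produced in t2t_data.
encoder = text_encoder.SubwordTextEncoder(
    "t2t_data/vocab.translate_envi_iwslt32k.32768.subwords")

ids = encoder.encode("Hello World") + [text_encoder.EOS_ID]
example = tf.train.Example(features=tf.train.Features(feature={
    "inputs": tf.train.Feature(int64_list=tf.train.Int64List(value=ids))}))

payload = {
    "signature_name": "serving_default",
    "instances": [{"b64": base64.b64encode(
        example.SerializeToString()).decode("utf-8")}],
}
resp = requests.post("http://localhost:8501/v1/models/transformer:predict",
                     data=json.dumps(payload))
print(resp.json())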

Environment information

OS: Ubuntu 20.04
Docker image for training: bitspeech/tensor2tensor:1.9.0-gpu
Docker image for serving: tensorflow/serving:latest-gpu

$ pip freeze | grep tensor
tensor2tensor==1.9.0
tensorboard==1.11.0
tensorflow==1.11.0

$ python -V
Python 2.7.12
