triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Auto-Generated Model Configuration for tensorflow_savedmodel issue #2998

Closed scamianbas closed 3 years ago

scamianbas commented 3 years ago

Description The doc here https://github.com/triton-inference-server/server/blob/r21.05/docs/model_configuration.md claims that "... if Triton is started with the --strict-model-config=false option, then in some cases the required portions of the model configuration file can be generated automatically by Triton." and "Specifically, TensorRT, TensorFlow saved-model, and ONNX models do not require a model configuration file because Triton can derive all the required settings automatically. All other model types must provide a model configuration file."

In model_repository I have a model named "tf_serving":

server/docs/examples/model_repository/tf_serving/
└── 1
    └── model.savedmodel
        ├── saved_model.json
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index

There is no config.pbtxt file because, according to the doc, I don't need one.

Then I run a triton server container 21.05 with --strict-model-config=false

docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /home/mirko/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:21.05-py3 tritonserver --model-repository=/models --strict-model-config=false

and the container server exits with this log information

| Model                           | Version | Status |
| densenet_onnx                   | 1       | READY |
| fasterrcnn_resnet50_fpn_pytorch | 2       | READY |
| inception_graphdef              | 1       | READY |
| simple                          | 1       | READY |
| simple_dyna_sequence            | 1       | READY |
| simple_identity                 | 1       | READY |
| simple_int8                     | 1       | READY |
| simple_sequence                 | 1       | READY |
| simple_string                   | 1       | READY |
| tf_serving                      | 1       | UNAVAILABLE: Invalid argument: model input cannot have empty reshape for non-batching model for tf_serving |

Triton Information 21.05 container

To Reproduce See above

Expected behavior According to the documentation, when using --strict-model-config=false a model configuration should have been generated for the model and the container server should not exit.

tanmayv25 commented 3 years ago

@scamianbas If you run the server in verbose mode, you should be able to see the model config generated. Can you share the entire server logs with --log-verbose=1 option set?
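
As a side note, for a model that does load, the configuration Triton generated can also be fetched over HTTP. A minimal sketch, assuming the tritonclient Python package and the default HTTP port (a model stuck in UNAVAILABLE cannot be queried this way):

import json
import tritonclient.http as httpclient

# Assumes Triton's HTTP endpoint is reachable on localhost:8000.
client = httpclient.InferenceServerClient(url="localhost:8000")
# Returns the (auto-completed) model configuration as a dict.
config = client.get_model_config("tf_serving")
print(json.dumps(config, indent=2))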

scamianbas commented 3 years ago

@tanmayv25 Looks like it only created this:

I0610 11:49:47.743335 1 autofill.cc:138] TensorFlow SavedModel autofill: OK:
I0610 11:49:47.743349 1 model_config_utils.cc:637] autofilled config: name: "tf_serving" platform: "tensorflow_savedmodel" backend: "tensorflow"

Full log is attached: log.txt

tanmayv25 commented 3 years ago
└── 1
    └── model.savedmodel
        ├── saved_model.json
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index

Just to verify: the model files of the SavedModel are within the version directory, i.e. 1, right? Can you also share the output of saved_model_cli for your model?

saved_model_cli show --dir /tmp/saved_model_dir --all

More details here.

scamianbas commented 3 years ago

@tanmayv25 This tree is more accurate:

[screenshot of the model directory tree]

Here's the saved_model_cli output:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['predict']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_STRING
        shape: ()
        name: input_bytes:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['boxes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300, 4)
        name: retinanet-bbox/filtered_detections/map/TensorArrayStack/TensorArrayGatherV3:0
    outputs['labels'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 300)
        name: retinanet-bbox/filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3:0
    outputs['scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300)
        name: retinanet-bbox/filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3:0
  Method name is: tensorflow/serving/predict

tanmayv25 commented 3 years ago

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_image'] tensor_info:
      dtype: DT_STRING
      shape: ()
      name: input_bytes:0

Unfortunately, Triton's autofiller does not handle models that have an empty-shaped tensor very well. We have reshape functionality that allows users to run such models, as described here: https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#reshape.

However, the autofiller cannot be made intelligent enough to handle such models and generate a config for them. With an empty shape, it is difficult to derive whether or not the model supports batching.

If you can modify the model, my advice would be to use tensors with explicit shapes; input_bytes:0 could be (-1) or (-1, 1). Otherwise, I would suggest writing your own model config.

We can improve the error message when auto-filling such models, and also our documentation, to avoid confusion. I will file an issue with our dev team.

scamianbas commented 3 years ago

@tanmayv25 Can you suggest how to write this kind of input? I think I'm going to try to write and use my own model config file...

tanmayv25 commented 3 years ago

I can only make a guess, but this should probably work:

name: "tf_serving"
platform: "tensorflow_graphdef"
max_batch_size: 0
input {
  name: "input_image"
  data_type: TYPE_STRING
  dims: [ 1 ]
  reshape: { shape: [ ] }
}
output {
  name: "boxes"
  data_type: TYPE_FP32
  dims: [-1, 300, 4]
}
output {
  name: "labels"
  data_type: TYPE_INT32
  dims: [-1, 300]
}
output {
  name: "scores"
  data_type: TYPE_FP32
  dims: [-1, 300 ]
}
backend: "tensorflow"
scamianbas commented 3 years ago

@tanmayv25 I just changed "tensorflow_graphdef" to "tensorflow_savedmodel" in your config, made a first try, and got this error:

E0610 15:57:13.145650 1 model_repository_manager.cc:1916] Poll failed for model directory 'tf_serving': model input cannot have empty reshape for non-batching model for tf_serving

Then I changed the input "dims: [ 1 ]" to "dims: [ -1 ]" and "reshape: { shape: [ ] }" to "reshape: { shape: [ -1 ] }", and the model loaded! (see below)

[screenshot of the model loading successfully]

Here's the full config file:

name: "tf_serving" platform: "tensorflow_savedmodel" max_batch_size: 0 input { name: "input_image" data_type: TYPE_STRING dims: [ -1 ] reshape: { shape: [ -1 ] } } output { name: "boxes" data_type: TYPE_FP32 dims: [-1, 300, 4] } output { name: "labels" data_type: TYPE_INT32 dims: [-1, 300] } output { name: "scores" data_type: TYPE_FP32 dims: [-1, 300 ] } backend: "tensorflow"

Now I still have to check if it talks correctly with a client ...
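
For reference, a minimal sketch of such a client, assuming the tritonclient Python package and a placeholder image file name:

import numpy as np
import tritonclient.http as httpclient

# Sketch of an HTTP client for the config above; "test.jpg" is a placeholder.
client = httpclient.InferenceServerClient(url="localhost:8000")

with open("test.jpg", "rb") as f:
    image_bytes = f.read()

# A single BYTES element, matching dims: [ -1 ] in the model config.
data = np.array([image_bytes], dtype=np.object_)
inp = httpclient.InferInput("input_image", list(data.shape), "BYTES")
inp.set_data_from_numpy(data)

outputs = [httpclient.InferRequestedOutput(name) for name in ("boxes", "labels", "scores")]
result = client.infer("tf_serving", inputs=[inp], outputs=outputs)
print(result.as_numpy("boxes").shape)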

scamianbas commented 3 years ago

Unfortunately this tensorflow_savedmodel model expects scalars and Triton cannot send them ...

Here's the output of my Python client script:

inference failed: contents must be scalar, got shape [1] [[{{node DecodeJpeg}}]]

I think you can close this issue. Thanks a lot !

tanmayv25 commented 3 years ago

Looks like you would need to modify the model to take input_image as [1] (or [-1, 1] if you want to use dynamic batching). See this related issue.
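
One possible way to do that, sketched here as an untested guess (it assumes the model can be loaded with TF2's tf.saved_model.load, and the paths are placeholders), is to wrap the existing signature so it accepts a [1]-shaped string tensor and squeezes it back to the scalar the graph expects:

import tensorflow as tf

# Rough, untested sketch; SRC and DST are placeholder paths.
SRC = "model.savedmodel"        # original model with a scalar string input
DST = "model_fixed.savedmodel"  # re-export with an explicit [1] input shape

loaded = tf.saved_model.load(SRC, tags=["serve"])
predict = loaded.signatures["predict"]

@tf.function(input_signature=[tf.TensorSpec(shape=[1], dtype=tf.string, name="input_image")])
def serving_fn(input_image):
    # Squeeze the [1]-shaped tensor down to the scalar the original graph expects.
    return predict(input_image=tf.squeeze(input_image, axis=0))

tf.saved_model.save(loaded, DST, signatures={"predict": serving_fn})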

We will leave this issue open until we add scalar support in Triton or better error logging when scalars are detected.

kommons commented 3 years ago

Closing the issue. Please let us know if the workaround does not work, or reopen the issue.