triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

CPU triton server #3223

Closed maksna closed 3 years ago

maksna commented 3 years ago

Description I use Triton Server to serve a TensorFlow 2 ALBERT model. The server starts successfully, but when I post a gRPC request it reports a bug:

[StatusCode.INTERNAL] [Derived]{{function_node __inference_albert_fine_tune_layer_call_and_return_conditional_losses_9401}} {{function_node __inference_albert_fine_tune_layer_call_and_return_conditional_losses_9401}} The CPU implementation of FusedBatchNorm only supports NHWC tensor format for now. [[{{node layer_normalization/FusedBatchNormV3}}]] [[StatefulPartitionedCall/StatefulPartitionedCall/albert_fine_tune/StatefulPartitionedCall/StatefulPartitionedCall]]
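The error itself comes from TensorFlow, not Triton: the CPU kernel of FusedBatchNorm only accepts NHWC (channels-last) tensors, while the GPU kernels also accept NCHW (channels-first). A minimal NumPy sketch of the two layouts and the transpose between them (the shapes here are illustrative, not taken from the model):

```python
import numpy as np

# NCHW (channels-first): batch, channels, height, width -- GPU-friendly layout
x_nchw = np.zeros((1, 3, 4, 5), dtype=np.float32)

# NHWC (channels-last): batch, height, width, channels -- the only layout the
# CPU FusedBatchNorm kernel supports
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))

print(x_nchw.shape)  # (1, 3, 4, 5)
print(x_nhwc.shape)  # (1, 4, 5, 3)
```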

Are you using the Triton container or did you build it yourself? container: nvcr.io/nvidia/tritonserver 21.07-py3

To Reproduce Steps to reproduce the behavior:
1. docker pull nvcr.io/nvidia/tritonserver:21.07-py3
2. Produce config.pbtxt and tf.savedmodel files
3. docker run -it -v /path/to/models:/models -p 8000:8000 -p 8001:8001 -p 8002:8002 nvcr.io/nvidia/tritonserver:21.07-py3 bash
4. tritonserver --model-repository=/models
5. Post a gRPC request

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well). Models: TensorFlow 2.3 albert_tiny.

My config.pbtxt:

platform: "tensorflow"
max_batch_size: 8
input [
  {
    name: "inputs"
    data_type: DT_INT8
    dims: [ 2, 256 ]
  }
]
output [
  {
    name: "outputs"
    data_type: DT_FLOAT
    dims: [ 256, 205 ]
  }
]

Expected behavior I want to know how to fix this bug. My model runs normally on Triton Server with GPUs, but when I use CPU, posting a request returns the error above.

CoderHam commented 3 years ago

@maksna looks like the limitation is the lack of support for a specific op/format in TensorFlow itself. Can you try running the same model in CPU-only TensorFlow? You should see the same error.
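The suggestion above can be checked without Triton by invoking the op directly in CPU-only TensorFlow. A hedged sketch (the channel count and tensor sizes are arbitrary; on CPU builds of TF from that era the NCHW call raises the same "only supports NHWC" error, while the NHWC call succeeds):

```python
import numpy as np
import tensorflow as tf

c = 4
scale = offset = mean = var = np.ones(c, dtype=np.float32)
x_nhwc = np.random.rand(1, 8, 8, c).astype(np.float32)

with tf.device("/CPU:0"):
    # channels-last (NHWC) works on CPU
    out = tf.raw_ops.FusedBatchNormV3(
        x=x_nhwc, scale=scale, offset=offset, mean=mean, variance=var,
        is_training=False, data_format="NHWC")
    print(tuple(out.y.shape))  # (1, 8, 8, 4)

    # channels-first (NCHW) is what the saved graph uses; on CPU it can raise
    # "The CPU implementation of FusedBatchNorm only supports NHWC ..."
    try:
        tf.raw_ops.FusedBatchNormV3(
            x=np.transpose(x_nhwc, (0, 3, 1, 2)), scale=scale, offset=offset,
            mean=mean, variance=var, is_training=False, data_format="NCHW")
    except tf.errors.UnimplementedError as exc:
        print(exc)
```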

dyastremsky commented 3 years ago

Closed due to inactivity. Will reopen if you continue to experience the error.