triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

How to handle "output config" when "empty tensor" is the output of the detection model ? #5351

Closed seoungbae-park closed 1 year ago

seoungbae-park commented 1 year ago

Description I'm using a detection model that returns boxes, scores, classes, and masks. If the model finds nothing to detect and returns an empty tensor, this error occurs:

tritonclient.utils.InferenceServerException: [request id: <id_unknown>] failed to split the output tensor 'boxes' in responses: expected batch size of atleast 1 in model output, got 0
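To see why this fails, here is a toy illustration (not Triton's actual code; the helper name is made up): when max_batch_size is 1 or greater, Triton treats dim 0 of every output as the batch dimension and splits responses along it, so a zero-length first dimension cannot be split.

```python
import numpy as np

def split_batched_output(tensor, expected_batch_size):
    # Hypothetical stand-in for Triton's response splitting: dim 0 is
    # assumed to be the batch axis and must hold at least one entry
    # per request in the batch.
    if tensor.shape[0] < expected_batch_size:
        raise ValueError(
            f"expected batch size of at least {expected_batch_size} "
            f"in model output, got {tensor.shape[0]}"
        )
    return np.split(tensor, expected_batch_size, axis=0)

# A detector that found nothing returns a tensor with 0 rows.
boxes = np.empty((0, 4), dtype=np.float32)
try:
    split_batched_output(boxes, 1)
except ValueError as e:
    print(e)  # mirrors the InferenceServerException above
```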

Triton Information What version of Triton are you using? "latest"

Are you using the Triton container or did you build it yourself? Using the Triton container.

To Reproduce

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

name: "boxinst"
platform: "pytorch_libtorch"
max_batch_size : 1
input [
  {
    name: "images"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [3, 512, 512]
  },
  {
    name: "pre_nms_thresh"
    data_type: TYPE_FP32
    dims: [1]
  },
{
    name: "mask_threshold"
    data_type: TYPE_FP32
    dims: [1]
  }
]
output [
  {
    name: "boxes"
    data_type: TYPE_FP32
    dims: [-1, -1]
  },
  {
    name: "scores"
    data_type: TYPE_FP32
    dims: [-1, -1]
  },
  {
    name: "class"
    data_type: TYPE_INT64                                  
    dims: [-1, -1]
    reshape: { shape: [ -1, -1] }
  },
  {
    name: "mask"
    data_type: TYPE_INT64
    dims: [-1, -1, -1]
    reshape: { shape: [ -1, -1, -1] }
  }
]


dyastremsky commented 1 year ago

It looks like this is because your model does not support batching. Triton is expecting the first dimension to be the batch size, since max_batch_size was set to 1.

Try setting your max_batch_size to 0 and see if it works.
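Concretely, that suggestion could look like the sketch below (based on the config posted earlier; only the `images` input and `boxes` output are shown for brevity). With max_batch_size set to 0, Triton no longer treats dim 0 as an implicit batch axis, so the model's full shapes must appear in dims, and a zero-length first output dimension becomes legal:

```
name: "boxinst"
platform: "pytorch_libtorch"
max_batch_size: 0
input [
  {
    name: "images"
    data_type: TYPE_FP32
    dims: [1, 3, 512, 512]   # batch axis is now explicit in dims
  }
]
output [
  {
    name: "boxes"
    data_type: TYPE_FP32
    dims: [-1, -1]           # 0 detections -> shape [0, 4], which is allowed
  }
]
```

The remaining inputs and outputs would follow the same pattern: tensors that previously had an implicit batch dimension gain an explicit one, while variable-size output dims stay as -1.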

seoungbae-park commented 1 year ago

In detection or segmentation, the usual outputs are boxes, classes, scores, and masks, returned as a Tuple[Tensor]. Does Triton not allow these output types? @dyastremsky

dyastremsky commented 1 year ago

The datatypes are backend-specific. The supported datatypes are listed here, and you can write backends to support other datatypes.

As far as what datatypes are needed to deploy a model, it really depends on the model. We don't have a "box" type or a "mask" type, but many users have detection and segmentation models. There is an example walkthrough for those types of models in this blog post. For most use cases, you can try searching; there will likely be official and unofficial blog posts, as well as examples and issues, that can be good references. Depending on what you're trying to do, the qa directory in the server repository holds our tests and could also be a good reference.
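If changing max_batch_size is not an option, one workaround sketched below (the helper names and the sentinel values are assumptions, not Triton API) is to pad empty detector outputs with a single dummy row inside the model or a postprocessing step, so dim 0 is never zero, and then filter the sentinel out on the client:

```python
import numpy as np

def pad_if_empty(boxes, scores, classes):
    # Server side: if the detector found nothing, emit one sentinel
    # row (score -1.0, class -1) so the batch dimension is >= 1 and
    # Triton can split the response.
    if boxes.shape[0] == 0:
        boxes = np.zeros((1, 4), dtype=np.float32)
        scores = np.full((1,), -1.0, dtype=np.float32)
        classes = np.full((1,), -1, dtype=np.int64)
    return boxes, scores, classes

def drop_sentinels(boxes, scores, classes):
    # Client side: real scores are non-negative, so dropping rows with
    # score < 0 removes only the sentinel padding.
    keep = scores >= 0.0
    return boxes[keep], scores[keep], classes[keep]
```

The same idea extends to the mask output; the key point is that every output keeps a first dimension of at least 1.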