The described limitation is stalling our team's migration from tf-serving to tensorrt-inference-server. Glad to hear the need is understood and we will be following the progress eagerly!
Thanks for taking this into consideration! This has blocked me from switching to the TensorRT Inference Server for quite a while, and it makes deploying detection models a pain. Hope it gets supported soon!
Hi, can TRTIS support models with multiple outputs?
Hi @tilaba. Yes, it supports multiple outputs. For example, for a TensorFlow Object Detection API model, you set the outputs in the model configuration file as follows:
output [
{
name: "detection_boxes"
data_type: TYPE_FP32
dims: [ 100, 4 ]
},
{
name: "detection_scores"
data_type: TYPE_FP32
dims: [ 100 ]
},
{
name: "detection_classes"
data_type: TYPE_FP32
dims: [ 100 ]
}
]
It is an array of outputs. Later, when sending a request, you can use:
results = ctx.run({input_name: input_batch}, {output: InferContext.ResultFormat.RAW for output in output_names}, batch_size)
where output_names = ["detection_boxes", "detection_scores", "detection_classes"] and input_batch is the list of input arrays for the batch.
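For completeness, here is a self-contained sketch of that call using the tensorrtserver Python client from that era; the server URL, model name, input name, and image shape are assumptions for illustration, not taken from this thread:

import numpy as np
from tensorrtserver.api import InferContext, ProtocolType

# Assumed server address and model name; adjust to your deployment.
ctx = InferContext("localhost:8000", ProtocolType.HTTP, "detection_model")

# One HWC uint8 image per batch element (hypothetical shape; must match the model config).
image = np.zeros((300, 300, 3), dtype=np.uint8)

output_names = ["detection_boxes", "detection_scores", "detection_classes"]

# Inputs map each input name to a list of per-batch-element arrays;
# outputs map each output name to a result format.
results = ctx.run(
    {"image_tensor": [image]},
    {output: InferContext.ResultFormat.RAW for output in output_names},
    1,  # batch_size
)

results is then a dict mapping each output name to its per-batch-element results.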
it works, thanks @bezero
Has this issue been fixed?
This will be really useful for CTPN and CRNN models for OCR.
The inference server now supports variable-size input and output tensor dimensions for backends that support them. As of now those are TensorFlow, Caffe2, and custom backends (assuming your custom backend handles them correctly). You specify such a dimension by using -1 in the model configuration for the appropriate dimension.
This support is on the master branch and will be in the 19.02 release. Please give it a try and report any issues.
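For example, an input with variable height and width could be declared like this (the input name and channel layout here are hypothetical):

input [
  {
    name: "image_input"
    data_type: TYPE_FP32
    dims: [ -1, -1, 3 ]
  }
]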
@deadeyegoodwin With this feature, our team is excited to explore a migration from tf-serving to TRTIS. Thank you for responding to the community feedback. It is much appreciated.
@deadeyegoodwin When will the TRTIS container for the 19.02 release be available?
The monthly container releases are typically available around the 25th. So, following typical practice, 19.02 would be available around Monday 2/25. But this month I think it may be delayed till the end of that week.
How come TRTIS supports dynamic input size while TensorRT itself doesn't?
@ziyuang TRTIS does not only support TensorRT models; it supports other frameworks as well (tensorrt_plan, tensorflow_graphdef, tensorflow_savedmodel, caffe2_netdef, or custom), and those platforms do support dynamic input sizes. In short, TRTIS lets you specify a dynamic input size for models that are able to handle such inputs.
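The framework is selected via the platform field of the model configuration, for example (the model name is hypothetical):

name: "my_detector"
platform: "tensorflow_graphdef"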
Good. Would the computation graph still be optimized if I use a model format other than TensorRT PLAN?
Each backend/framework (TensorRT, TensorFlow, Caffe2) has its own optimization techniques that it applies to the model before execution. Typically, the optimizations performed by TensorRT provide significant speedups relative to the other frameworks, but TensorFlow does perform some optimization as well. There is also the TensorRT-TensorFlow integration, which gives you many of the benefits of TensorRT while still using TensorFlow, and TRTIS fully supports TensorFlow models that have been optimized with TensorRT: https://github.com/tensorflow/tensorrt
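As a rough sketch of that integration path under TF 1.x (the file names, output names, and parameters below are illustrative assumptions, not from this thread):

import tensorflow as tf
from tensorflow.contrib import tensorrt as trt  # TF-TRT contrib API in TF 1.x

# Load a frozen GraphDef (path is hypothetical).
with tf.gfile.GFile("frozen_model.pb", "rb") as f:
    frozen_graph_def = tf.GraphDef()
    frozen_graph_def.ParseFromString(f.read())

# Replace TensorRT-compatible subgraphs with TensorRT engine ops.
trt_graph_def = trt.create_inference_graph(
    input_graph_def=frozen_graph_def,
    outputs=["detection_boxes", "detection_scores", "detection_classes"],
    max_batch_size=8,
    max_workspace_size_bytes=1 << 30,
    precision_mode="FP16",
)

# Serialize the optimized graph for the TRTIS model repository.
with tf.gfile.GFile("model.graphdef", "wb") as f:
    f.write(trt_graph_def.SerializeToString())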
Currently TRTIS only allows the first dimension of an input/output tensor to be variable sized, and only when that dimension represents batching. TRTIS should allow variable-sized dimensions in other cases, since these are supported by some of the frameworks (e.g. TensorFlow), and not having this limits which models can easily run on TRTIS.
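For reference, a sketch of the kind of configuration this request would enable, reusing the detection output from above but with a variable number of boxes in a non-batch dimension:

output [
  {
    name: "detection_boxes"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]  # variable number of detections, distinct from batching
  }
]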