Closed: entn-at closed this issue 1 year ago
The warning you are receiving is because your config has a max_batch_size of 0, so it is saying that batching is unavailable.
What are the verbose logs (`--log-verbose 1`) when you try setting max_batch_size to 16? You'd need to remove the first variable dimension from all your inputs/outputs, since the batch dimension becomes the implicit first dimension. And for your [-1] input "length", you'll need to use the reshape field.
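A minimal config.pbtxt sketch of that suggestion (the tensor names, data types, and dims below are assumptions based on TitaNet's usual ONNX signature — match them to what polygraphy reports for your engine):

```
name: "titanet_large"
platform: "tensorrt_plan"
max_batch_size: 16
input [
  {
    name: "audio_signal"          # assumed name; check polygraphy output
    data_type: TYPE_FP32
    dims: [ 80, -1 ]              # batch dim is implicit once max_batch_size > 0
  },
  {
    name: "length"                # the engine sees a 1-D [batch] tensor
    data_type: TYPE_INT32
    dims: [ 1 ]
    reshape: { shape: [ ] }       # strip the dummy dim before it reaches the engine
  }
]
# outputs follow the same rule: drop the leading -1 batch dimension
```

The key point is that with max_batch_size > 0, Triton prepends the batch dimension itself, so none of the listed dims should include it; reshape bridges tensors whose only dimension is the batch.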
Many thanks! It was indeed a configuration issue, specifically the superfluous first variable dimension (-1). Together with reshape, it's working as expected.
Description
I'm trying to run a model (TitaNet-Large from NeMo) converted to TensorRT in Triton. It has dynamic shapes and was converted with a max. batch size of 16; however, Triton first instructed me to set `max_batch_size: 0`, and now I get a warning that "The specified dimensions in model MODEL_NAME config hints that batching is unavailable".

Below is the output of `polygraphy inspect model titanet_large.plan`:

Below is the model config:
I exported TitaNet-Large (in NeMo) to ONNX via `model.export()` and then converted it to a TRT engine (see logs below).

Triton Information
I'm using Triton 22.12, with the official Triton container.
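For reference, the ONNX-to-TRT conversion step above can be sketched with trtexec; the flag names are real trtexec options, but the profile shape values here are assumptions, not taken from my actual logs:

```shell
# Build a dynamic-shape engine with an optimization profile covering
# batch sizes 1..16 (audio_signal is [batch, 80, time], length is [batch])
trtexec --onnx=titanet_large.onnx \
        --saveEngine=titanet_large.plan \
        --minShapes=audio_signal:1x80x160,length:1 \
        --optShapes=audio_signal:8x80x800,length:8 \
        --maxShapes=audio_signal:16x80x3200,length:16
```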
Expected behavior
I expected to be able to set `max_batch_size` to 16 and use batching.