Closed banasraf closed 1 year ago
This PR enables support for unbatched models in the DALI backend and creates a dedicated execution path for such models.
An unbatched model is a model with max_batch_size set to 0. In that case Triton does not interpret the first dimension of tensors as the batch dimension, which, among other things, disables dynamic batching.
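To make the difference concrete, here is a minimal sketch of how the `max_batch_size` setting changes the way a tensor shape from the model config is interpreted. The function name `expected_request_shape` is hypothetical and exists only for illustration; it is not part of Triton or the DALI backend.

```python
from typing import List


def expected_request_shape(config_dims: List[int], max_batch_size: int) -> List[int]:
    """Return the full tensor shape expected in a request (-1 means variable size).

    Hypothetical helper for illustration; not a Triton or DALI backend API.
    """
    if max_batch_size > 0:
        # Batched model: the dims in the config omit the batch dimension,
        # so a variable-size batch dimension is implicitly prepended.
        return [-1] + list(config_dims)
    # Unbatched model (max_batch_size == 0): the dims in the config already
    # describe the full shape, and no dimension is treated as the batch.
    return list(config_dims)


batched = expected_request_shape([3, 224, 224], max_batch_size=8)
unbatched = expected_request_shape([3, 224, 224], max_batch_size=0)
```

With the same configured dims, the batched model expects a 4-dimensional tensor with a leading batch dimension, while the unbatched model expects exactly the 3 configured dimensions.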
This mode is the best fit for our streamed video use case, because there we always want to handle requests one by one.
This required changes in the config validation/autofill (config_tools/*) and a separate, simplified ExecuteUnbatched method in DaliModelInstance.
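As a rough illustration of why the unbatched path can be simpler, the sketch below contrasts the two execution styles in Python. It is not the actual C++ implementation of ExecuteUnbatched: the names `execute_batched`, `execute_unbatched`, and the `run` callable are assumptions made for this example, and the batched path is a simplified guess at the general merge-and-split pattern.

```python
from typing import Callable, List

# Hypothetical stand-in for running a pipeline on a list of samples;
# it must return one output per input sample.
RunFn = Callable[[List[int]], List[int]]


def execute_batched(requests: List[List[int]], run: RunFn) -> List[List[int]]:
    """Merge samples from all requests into one batch, run once, split back."""
    merged = [sample for request in requests for sample in request]
    outputs = run(merged)
    results, start = [], 0
    for request in requests:
        # Slice out the outputs that belong to this request.
        results.append(outputs[start:start + len(request)])
        start += len(request)
    return results


def execute_unbatched(requests: List[int], run: RunFn) -> List[int]:
    """Handle each request on its own; no merging or splitting is needed."""
    return [run([request])[0] for request in requests]


def double(batch: List[int]) -> List[int]:
    return [x * 2 for x in batch]


batched_out = execute_batched([[1, 2], [3]], double)
unbatched_out = execute_unbatched([1, 2, 3], double)
```

In the unbatched sketch every request maps to exactly one run, which matches the one-by-one handling wanted for streamed video.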
Signed-off-by: Rafal <rbanas@nvidia.com>