You need to provide an optimization profile for a dynamic-shape engine (min, opt, and max shapes); see https://github.com/lix19937/trt-samples-for-hackathon-cn/blob/master/cookbook/02-API/OptimizationProfile/main-TensorInput.py#L29-L31
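For example, a minimal sketch with the TensorRT 10 Python API; the ONNX path and the input tensor name "input" are assumptions, so use the names from your own model:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(0)          # explicit batch is the default in TensorRT 10
parser = trt.OnnxParser(network, logger)

with open("vit.onnx", "rb") as f:            # assumed model path
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# min / opt / max shapes for the dynamic batch axis ("input" is an assumed name)
profile.set_shape("input", (1, 3, 224, 224), (8, 3, 224, 224), (32, 3, 224, 224))
config.add_optimization_profile(profile)

with open("vit.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```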
Hi, I did that.
I created a Colab notebook to reproduce my steps. Any help is appreciated. Thank you.
https://colab.research.google.com/drive/1G-l-THRzCCqS41A5OrIEc4X7x1oOu7eB?usp=sharing
The conclusion is that, for tensors of different shapes, you still have to deallocate the old device buffer and allocate a new one with the new shape.
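A minimal sketch of that approach, assuming pycuda, TensorRT 10, and tensor names "input"/"output" (both names are assumptions):

```python
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401

def run(context, batch_size, host_input):
    # host_input is assumed to be a contiguous float32 array of shape (batch_size, 3, 224, 224)
    context.set_input_shape("input", (batch_size, 3, 224, 224))
    out_shape = tuple(context.get_tensor_shape("output"))  # e.g. (batch_size, 1000)

    # allocate device buffers sized for the current shape
    d_input = cuda.mem_alloc(host_input.nbytes)
    host_output = np.empty(out_shape, dtype=np.float32)
    d_output = cuda.mem_alloc(host_output.nbytes)

    stream = cuda.Stream()
    cuda.memcpy_htod_async(d_input, host_input, stream)
    context.set_tensor_address("input", int(d_input))
    context.set_tensor_address("output", int(d_output))
    context.execute_async_v3(stream.handle)
    cuda.memcpy_dtoh_async(host_output, d_output, stream)
    stream.synchronize()

    # free the buffers before the next, differently shaped call
    d_input.free()
    d_output.free()
    return host_output
```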
The notebook was clear and helped me. Thanks, lix19937.
Description
I am working with TensorRT v10 to run inference with dynamic batch sizes.
My model is a ViT-Base obtained from the TensorRT Model Optimizer ONNX PTQ example (https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/onnx_ptq).
The model was exported with a dynamic axis on the batch dimension, and an optimization profile was set up in the build step.
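For reference, a hedged sketch of that export step; the torchvision model is only a stand-in for the actual PTQ model, and the tensor names are assumptions:

```python
import torch
import torchvision

# Assumed stand-in for the actual ViT-Base coming out of the Model Optimizer PTQ flow
model = torchvision.models.vit_b_16(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "vit.onnx",
    input_names=["input"],
    output_names=["output"],
    # mark the batch dimension as dynamic on both tensors
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```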
To run inference, I first allocate device memory using the max shape: (32, 3, 224, 224) for the input and (32, 1000) for the output.
Then I copy the input data to device memory through device_ptr.
Next I set the input shape on the execution context.
Finally I call the do_inference function, but the output always has 32000 elements (32 × 1000) regardless of the batch size.
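If the do_inference helper copies back the entire max-shape output buffer, the flat result will always hold 32 × 1000 = 32000 elements; a hedged sketch of trimming it to the effective shape (the output tensor name is an assumption):

```python
import numpy as np

def trim_output(context, flat_output, output_name="output"):
    # After set_input_shape, the context reports the effective output shape,
    # e.g. (4, 1000) for a batch of 4.
    out_shape = tuple(context.get_tensor_shape(output_name))
    n = int(np.prod(out_shape))
    return flat_output[:n].reshape(out_shape)
```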
Environment
TensorRT Version: 10.0.1
NVIDIA GPU: Tesla T4
NVIDIA Driver Version: 530
CUDA Version: 12.2
CUDNN Version: 9.2.0
Operating System:
Python Version (if applicable): 3.10
PyTorch Version (if applicable): 2.3.1
Relevant Files
Model link: https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/onnx_ptq
Steps To Reproduce
Commands or scripts: