Closed: munhou closed this issue 1 year ago
@ttyio ^ ^
@ttyio Do you know what could be the problem and what should I do?
@munhou , have you tried newer TRT, like 8.5 EA in https://docs.nvidia.com/deeplearning/tensorrt/container-release-notes/rel-22-10.html#rel-22-10 ? thanks!
Closing due to no response for more than 3 weeks; please reopen if you still have questions, thanks!
Description
I used pytorch-quantization to QAT-train a YOLOv5 model and then converted it to a TensorRT engine. The model works fine when the batch size is 1, but with multi-batch inference the accuracy drops sharply, to close to 0.
When I use PTQ quantization instead of QAT, multi-batch inference works fine!
Environment
NGC container: nvcr.io/nvidia/tensorrt:21.08-py3
TensorRT Version: 8.0.1.6
NVIDIA GPU: GTX 1070
NVIDIA Driver Version: 470.63.01
CUDA Version: 11.4
cuDNN Version: 8.2.2.26
Operating System: Ubuntu 20.04
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
qat_model
Steps To Reproduce
I used the command below to analyze the outputs.
With a batch size of 4, the outputs differ greatly between ONNX and TensorRT.
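The original comparison command was not captured in the thread, but the per-sample comparison it describes can be sketched as below. This is an illustration, not the reporter's actual script: the arrays are synthetic stand-ins for outputs that would in practice come from onnxruntime and a TensorRT execution context, and the injected 0.5 offset mimics the batch-size-dependent mismatch described above.

```python
import numpy as np

def per_sample_max_diff(onnx_out: np.ndarray, trt_out: np.ndarray) -> np.ndarray:
    """Return the max absolute difference per sample in the batch,
    so a batch-position-dependent mismatch is visible at a glance."""
    assert onnx_out.shape == trt_out.shape
    flat_onnx = onnx_out.reshape(onnx_out.shape[0], -1)
    flat_trt = trt_out.reshape(trt_out.shape[0], -1)
    return np.abs(flat_onnx - flat_trt).max(axis=1)

# Toy data with a YOLOv5-like output shape (batch, boxes, attrs).
rng = np.random.default_rng(0)
onnx_out = rng.standard_normal((4, 25200, 85)).astype(np.float32)
trt_out = onnx_out.copy()
trt_out[1:] += 0.5  # inject a mismatch on samples 1..3 only

diffs = per_sample_max_diff(onnx_out, trt_out)
print(diffs)  # sample 0 matches exactly; samples 1..3 differ by 0.5
```

If only the samples beyond index 0 diverge, that points at batch handling (e.g. a quantization scale applied per-batch) rather than at the weights themselves.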
I also tried the following steps and found the layer where the difference begins, but that layer looks so ordinary that I can't figure out how to fix it.
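The "find where the difference began" step is typically done by marking intermediate tensors as extra network outputs, dumping them from both runtimes, and walking the layers in execution order until the first mismatch. A minimal sketch of that search, with hypothetical layer names and per-layer dumps (the helper and tolerance are illustrative, not from the original report):

```python
import numpy as np

def first_divergent_layer(ref_outputs: dict, test_outputs: dict,
                          layer_order: list, atol: float = 1e-2):
    """Walk layers in execution order; return the name of the first layer
    whose dumped outputs differ beyond `atol`, or None if all match."""
    for name in layer_order:
        if not np.allclose(ref_outputs[name], test_outputs[name], atol=atol):
            return name
    return None

# Toy per-layer dumps: divergence is injected starting at "conv2",
# and it propagates to every layer after it, as it would in a real net.
layer_order = ["conv1", "conv2", "conv3"]
ref = {n: np.ones((4, 8), dtype=np.float32) for n in layer_order}
test = {n: v.copy() for n, v in ref.items()}
test["conv2"] += 1.0
test["conv3"] += 1.0

print(first_divergent_layer(ref, test, layer_order))  # prints "conv2"
```

Reporting the first divergent layer (its type, and the Q/DQ nodes around it in the ONNX graph) usually narrows the issue down much faster than comparing final outputs alone.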