NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Cuda Runtime (an illegal memory access was encountered) on calibrating Yolo_v5 model for int8 precision #4071

Open NannilaJagadees opened 1 month ago

NannilaJagadees commented 1 month ago

Description

I am using this calibration script to generate the calibration cache file for a YOLOv5 ONNX model with the EntropyCalibration2 method. I hit this error as soon as it starts calibrating on the first image.

The YOLOv5 model has TopK and NMS nodes, which have dynamic shapes. How can we calibrate such models?

Environment

TensorRT Version: 8.6.2.3

NVIDIA GPU: Orin Nano 8 GB

CUDA Version: 12.2

CUDNN Version: 8.9.4

Operating System: Jetpack 6.0

Python Version: 3.10

PyTorch Version: 2.3.0

lix19937 commented 3 weeks ago

Note: with dynamic shapes you need to set a calibration profile. Also check the batch size and the CUDA host-to-device (H2D) copy pointers, and make sure no shape contains -1.

ref https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#int8-calib-dynamic-shapes
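
For anyone landing here, a minimal sketch of what "set a calibration profile" can look like with the TensorRT Python API. It assumes `builder` is a `tensorrt.Builder` and `config` is the `IBuilderConfig` created from it. The helper name `add_calibration_profile` and the input name `images` are hypothetical, so substitute your network's real input tensor name. Treat it as an illustration under those assumptions, not as the fix for this issue.

```python
def add_calibration_profile(builder, config, input_name, shape):
    """Register one fixed-shape profile for building and INT8 calibration.

    Assumes `builder` is a tensorrt.Builder and `config` its IBuilderConfig.
    This helper is illustrative and not part of TensorRT itself.
    """
    shape = tuple(int(d) for d in shape)
    # A calibration shape must be concrete: no -1 (dynamic) or zero dims.
    if any(d <= 0 for d in shape):
        raise ValueError(f"calibration shape must be fully specified, got {shape}")

    profile = builder.create_optimization_profile()
    # min == opt == max pins the input to a single concrete shape.
    profile.set_shape(input_name, shape, shape, shape)
    config.add_optimization_profile(profile)
    # set_calibration_profile returns False when the profile is not accepted.
    if not config.set_calibration_profile(profile):
        raise RuntimeError("TensorRT did not accept the calibration profile")
    return profile
```

Usage would be along the lines of `add_calibration_profile(builder, config, "images", (1, 3, 640, 640))`, called before assigning the calibrator and building the engine.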

NannilaJagadees commented 2 weeks ago

Hi @lix19937

I set the calibration profile. Values with batch size 1: Min (1, 3, 640, 640), Opt (1, 3, 640, 640), Max (1, 3, 640, 640).

Note: Model input size is fixed, not dynamic. Only the NMS and following nodes are dynamic.
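
The remaining checks suggested above (batch size, no -1 shape, H2D copy size) can be sketched in plain Python. This is a rough sanity check, not a diagnosis of the crash. The function names are hypothetical, and the 4-byte item size assumes the input is float32, which this thread does not state. The opt-shape comparison reflects the developer guide's description of calibration running at the profile's opt values.

```python
from math import prod

def check_batch_against_profile(batch_shape, min_shape, opt_shape, max_shape):
    """Return a list of human-readable problems (empty list means OK)."""
    problems = []
    if any(d < 0 for d in batch_shape):
        problems.append(f"batch shape {tuple(batch_shape)} contains a dynamic (-1) dimension")
    if not (len(batch_shape) == len(min_shape) == len(opt_shape) == len(max_shape)):
        problems.append("rank mismatch between batch and profile shapes")
        return problems
    for i, (d, lo, hi) in enumerate(zip(batch_shape, min_shape, max_shape)):
        if not lo <= d <= hi:
            problems.append(f"dim {i}: {d} is outside the profile range [{lo}, {hi}]")
    # Calibration data is expected to match the profile's opt shape.
    if tuple(batch_shape) != tuple(opt_shape):
        problems.append("batch shape differs from the profile's opt shape")
    return problems

def batch_nbytes(shape, itemsize=4):
    """Bytes one batch occupies; itemsize=4 assumes float32 (an assumption)."""
    return prod(shape) * itemsize

shape = (1, 3, 640, 640)
print(check_batch_against_profile(shape, shape, shape, shape))  # []
print(batch_nbytes(shape))  # 4915200
```

If the device buffer allocated in the calibrator is smaller than `batch_nbytes(shape)`, or the host array being copied is a different size or dtype, the H2D copy is one place worth checking for an out-of-bounds access.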