NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

[TRT] 1: [calibrator.cpp::add::793] Error Code 1: Cuda Runtime (an illegal memory access was encountered) #4053

Status: Closed (Egorundel closed this issue 3 months ago)

Egorundel commented 3 months ago

Description

Hello!

I have written code for INT8 calibration of an ONNX model and subsequent creation of a TensorRT engine.

My repo: https://github.com/Egorundel/int8_calibrator_cpp

However, when I start calibration, I get errors:

[08/06/2024-10:15:50] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/06/2024-10:15:50] [W] [TRT] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[08/06/2024-10:15:50] [W] [TRT] builtin_op_importers.cpp:5221: Attribute class_agnostic not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[08/06/2024-10:15:51] [W] [TRT] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.0.4
[08/06/2024-10:15:51] [W] [TRT] Calibration Profile is not defined. Calibrating with Profile 0
[08/06/2024-10:15:55] [W] [TRT] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.0.4
[08/06/2024-10:15:55] [W] [TRT] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.0.4
[08/06/2024-10:15:55] [E] [TRT] 1: [calibrator.cpp::add::793] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [executionContext.cpp::commonEmitDebugTensor::1855] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[... the same resizingAllocator.cpp::deallocate::105 error is repeated 37 more times ...]
[08/06/2024-10:15:55] [E] [TRT] 3: [engine.cpp::~Engine::298] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/engine.cpp::~Engine::298, condition: mExecutionContextCounter.use_count() == 1. Destroying an engine object before destroying the IExecutionContext objects it created leads to undefined behavior.
)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 1: [cudaResources.cpp::~ScopedCudaStream::47] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/06/2024-10:15:55] [E] [TRT] 2: [calibrator.cpp::calibrateEngine::1181] Error Code 2: Internal Error (Assertion context->executeV2(&bindings[0]) failed. )
Segmentation fault


Please help me solve this problem. What am I doing wrong in the code?

Let's make a practically universal TensorRT Engine calibration and creation tool and help other people together!

Environment

TensorRT Version: 8.6.1.6
NVIDIA GPU: RTX 3060
NVIDIA Driver Version: 555.42.02
CUDA Version: 11.1
CUDNN Version: 8.0.6

Operating System:

Python Version (if applicable): 3.8
PyTorch Version (if applicable): 1.10.1

Steps To Reproduce

Configure the project with CMake, build it, and run it from a C++ IDE.

lix19937 commented 3 months ago

Please check your calibration code: the batch size, the host-to-device memory copies, whether you pass device-side (CUDA) pointers, the calibration profile, etc.
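One way to act on the batch-size and host-to-device items of this checklist is to verify buffer sizes before every copy. The following is a minimal sketch, not code from the repository: `batchBytes` and `hostBatchFitsDevice` are hypothetical helper names, and float32 input is an assumption. It uses only the C++ standard library, so the actual `cudaMemcpy` and the TensorRT calibrator are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes one calibration batch occupies, given its full shape
// (including the batch dimension, e.g. {N, C, H, W}) and the element size.
// Returns 0 if the shape is empty or any dimension is non-positive
// (for example an unresolved dynamic dimension stored as -1).
std::size_t batchBytes(const std::vector<std::int64_t>& shape, std::size_t elemSize) {
    assert(elemSize > 0);
    if (shape.empty()) return 0;
    std::size_t count = 1;
    for (std::int64_t d : shape) {
        if (d <= 0) return 0;  // dynamic or invalid dimension: size is unknown
        count *= static_cast<std::size_t>(d);
    }
    return count * elemSize;
}

// True when the host-side batch holds exactly the bytes implied by `shape`
// and that amount fits into a device allocation of `deviceBytes` bytes.
// If this holds, copying the batch neither reads past the host vector nor
// writes past the device buffer.
bool hostBatchFitsDevice(const std::vector<float>& hostBatch,
                         const std::vector<std::int64_t>& shape,
                         std::size_t deviceBytes) {
    const std::size_t expected = batchBytes(shape, sizeof(float));
    return expected != 0 &&
           hostBatch.size() * sizeof(float) == expected &&
           expected <= deviceBytes;
}
```

In a real calibrator, a check like this would run just before the host-to-device copy inside `getBatch`. The pointers written into `bindings` there must refer to device memory, since TensorRT reads them on the GPU.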

ttyio commented 3 months ago

@Egorundel have you tried calling setCalibrationProfile before calibration, since your model has dynamic shapes?
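To illustrate what such a profile has to satisfy, here is a minimal standard-library sketch of the shape bookkeeping. `ShapeRange` and the three functions are hypothetical stand-ins, not TensorRT types. In TensorRT itself, the shapes would be set with `IOptimizationProfile::setDimensions` for `kMIN`, `kOPT`, and `kMAX`, and the profile registered with `IBuilderConfig::setCalibrationProfile`. To my understanding, calibration runs with the `kOPT` shape, so the data fed by the calibrator should match it; treat that as an assumption to check against the documentation for your version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<std::int64_t>;

// Hypothetical stand-in for the min/opt/max shapes of one network input.
struct ShapeRange {
    Shape min, opt, max;
};

// A range is consistent when all three shapes have the same rank, every
// extent is positive, and min <= opt <= max holds for each dimension.
bool isConsistent(const ShapeRange& r) {
    if (r.min.size() != r.opt.size() || r.opt.size() != r.max.size()) return false;
    for (std::size_t i = 0; i < r.opt.size(); ++i) {
        if (r.min[i] <= 0) return false;
        if (r.min[i] > r.opt[i] || r.opt[i] > r.max[i]) return false;
    }
    return true;
}

// Network dimensions use -1 for a dynamic axis. A dynamic axis accepts any
// consistent range; a fixed axis must have min == max == the network value.
bool coversNetworkInput(const Shape& networkDims, const ShapeRange& r) {
    if (!isConsistent(r) || networkDims.size() != r.opt.size()) return false;
    for (std::size_t i = 0; i < networkDims.size(); ++i) {
        if (networkDims[i] == -1) continue;
        if (r.min[i] != networkDims[i] || r.max[i] != networkDims[i]) return false;
    }
    return true;
}

// The shape of each calibration batch should equal the profile's opt shape
// (assumption stated in the text above).
bool calibratorShapeMatches(const Shape& batchShape, const ShapeRange& r) {
    assert(isConsistent(r));
    return batchShape == r.opt;
}
```

For example, a network input of {-1, 3, 640, 640} with a range of {1, 3, 640, 640} to {16, 3, 640, 640} and an opt shape of {8, 3, 640, 640} passes these checks only if the calibrator delivers batches of 8 images.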

Egorundel commented 3 months ago

I reworked my code (C++) and now it works correctly. I used nvinfer1::IInt8EntropyCalibrator2.
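For readers who want the general idea without opening the repository: a calibrator derived from nvinfer1::IInt8EntropyCalibrator2 has to hand TensorRT one fixed-size batch per `getBatch` call and return false once the data is exhausted. The sketch below shows only that host-side batching logic with the standard library. `CalibrationBatcher` is a hypothetical helper, not the class from the repository and not a TensorRT type, and dropping an incomplete final batch is a design choice made here for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side helper: hands out fixed-size batches from a flat
// array of already preprocessed samples.
class CalibrationBatcher {
public:
    CalibrationBatcher(std::vector<float> samples, std::size_t sampleSize, std::size_t batchSize)
        : samples_(std::move(samples)), sampleSize_(sampleSize), batchSize_(batchSize) {
        assert(sampleSize_ > 0 && batchSize_ > 0);
        assert(samples_.size() % sampleSize_ == 0);  // whole samples only
    }

    // Plays the role of getBatchSize(): constant for the whole calibration run.
    std::size_t batchSize() const { return batchSize_; }

    // Copies the next full batch into `out` and returns true, or returns
    // false once fewer than batchSize samples remain (a partial tail is dropped).
    bool next(std::vector<float>& out) {
        const std::size_t batchElems = sampleSize_ * batchSize_;
        if (samples_.size() - offset_ < batchElems) return false;
        const auto first = samples_.begin() + static_cast<std::ptrdiff_t>(offset_);
        out.assign(first, first + static_cast<std::ptrdiff_t>(batchElems));
        offset_ += batchElems;
        return true;
    }

private:
    std::vector<float> samples_;
    std::size_t sampleSize_;
    std::size_t batchSize_;
    std::size_t offset_ = 0;
};
```

Inside a real `getBatch`, the batch returned by `next` would be copied to a device buffer and that device pointer stored in `bindings`; when `next` returns false, `getBatch` would return false as well.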

https://github.com/Egorundel/int8_calibrator_cpp

You are welcome to use it and to integrate my solution into your own C++ projects.