NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Fail when I try to export onnx model to int8 engine #2397

Closed Meize0729 closed 1 year ago

Meize0729 commented 2 years ago

```
[10/17/2022-21:53:09] [I] TensorRT version: 8.4.1
[10/17/2022-21:53:10] [I] [TRT] [MemUsageChange] Init CUDA: CPU +330, GPU +0, now: CPU 338, GPU 443 (MiB)
[10/17/2022-21:53:20] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +79, GPU +0, now: CPU 434, GPU 443 (MiB)
[10/17/2022-21:53:20] [I] Start parsing network model
[10/17/2022-21:53:20] [I] [TRT] ----------------------------------------------------------------
[10/17/2022-21:53:20] [I] [TRT] Input filename:   yolov5s.onnx
[10/17/2022-21:53:20] [I] [TRT] ONNX IR version:  0.0.7
[10/17/2022-21:53:20] [I] [TRT] Opset version:    13
[10/17/2022-21:53:20] [I] [TRT] Producer name:    pytorch
[10/17/2022-21:53:20] [I] [TRT] Producer version: 1.12.1
[10/17/2022-21:53:20] [I] [TRT] Domain:
[10/17/2022-21:53:20] [I] [TRT] Model version:    0
[10/17/2022-21:53:20] [I] [TRT] Doc string:
[10/17/2022-21:53:20] [I] [TRT] ----------------------------------------------------------------
[10/17/2022-21:53:20] [W] [TRT] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[10/17/2022-21:53:20] [I] Finish parsing network model
[10/17/2022-21:53:20] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[10/17/2022-21:53:20] [W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32 or Bool.
[10/17/2022-21:53:23] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +648, GPU +272, now: CPU 1112, GPU 715 (MiB)
[10/17/2022-21:53:25] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +175, GPU +266, now: CPU 1287, GPU 981 (MiB)
[10/17/2022-21:53:25] [W] [TRT] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.2.0
[10/17/2022-21:53:25] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[ ERROR: CUDA Runtime ] invalid configuration argument
[10/17/2022-21:53:41] [E] Error[1]: [caskBuilderUtils.h::transform::204] Error Code 1: Cask (CASK Transform Weights Failed)
[10/17/2022-21:53:41] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[10/17/2022-21:53:41] [E] Engine could not be created from network
[10/17/2022-21:53:41] [E] Building engine failed
[10/17/2022-21:53:41] [E] Failed to create engine from model or file.
[10/17/2022-21:53:41] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8401] # /home/lh/APP/TensorRT-8.4.1.5/bin/trtexec --onnx=yolov5s.onnx --saveEngine=yolov5s.engine --device=3 --int8
```

Using the same command on Windows, the INT8 engine exports successfully. However, when I run inference with that engine, the results are wrong, while inference with the ONNX model gives correct results. I think this is a TensorRT problem. How can I solve it? Thank you.
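One hint in the log is the warning "Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32 or Bool": with `--int8` but no calibration data, TensorRT has no per-tensor dynamic ranges to derive INT8 scales from, which is a plausible cause of wrong outputs even when the build succeeds. The sketch below (plain NumPy, not TensorRT API; the function names are illustrative) shows what a calibrator effectively supplies — a symmetric `[-amax, amax]` range per tensor, mapped onto `[-127, 127]`:

```python
import numpy as np

def int8_dynamic_range(activations: np.ndarray) -> tuple[float, float]:
    """Symmetric per-tensor dynamic range from calibration samples.

    TensorRT maps [-amax, amax] onto the INT8 range [-127, 127].
    Without a calibrator (or explicit set_dynamic_range calls) it has
    no amax for activation tensors, so INT8 results are unreliable.
    """
    amax = float(np.abs(activations).max())
    return -amax, amax

def fake_quantize_int8(x: np.ndarray, amax: float) -> np.ndarray:
    """Quantize x with the scale implied by the dynamic range."""
    scale = amax / 127.0
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

# Hypothetical calibration batch for one activation tensor.
calib = np.array([-2.0, -0.5, 0.0, 1.0, 4.0])
lo, hi = int8_dynamic_range(calib)   # (-4.0, 4.0)
q = fake_quantize_int8(calib, hi)
```

In practice this means feeding `trtexec` a calibration cache (it accepts a `--calib=<file>` option) or implementing a calibrator such as `IInt8EntropyCalibrator2` in the builder API, rather than passing `--int8` alone.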

Meanwhile, I see another problem:

```
[10/17/2022-23:44:21] [TRT] [W] Unable to determine GPU memory usage
[10/17/2022-23:44:21] [TRT] [W] Unable to determine GPU memory usage
[10/17/2022-23:44:21] [TRT] [W] CUDA initialization failure with error: 2. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Traceback (most recent call last):
  File "/home/lh/APP/TensorRT-8.4.1.5/samples/python/network_api_pytorch_mnist/sample.py", line 153, in <module>
    main()
  File "/home/lh/APP/TensorRT-8.4.1.5/samples/python/network_api_pytorch_mnist/sample.py", line 136, in main
    engine = build_engine(weights)
  File "/home/lh/APP/TensorRT-8.4.1.5/samples/python/network_api_pytorch_mnist/sample.py", line 107, in build_engine
    builder = trt.Builder(TRT_LOGGER)
TypeError: pybind11::init(): factory function returned nullptr
```
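The `pybind11::init(): factory function returned nullptr` from `trt.Builder` is a downstream symptom of the earlier line: CUDA itself failed to initialize with error code 2. If memory serves, in the CUDA Runtime API's `cudaError_t` enum that code is `cudaErrorMemoryAllocation` (reported by `cudaGetErrorString` as "out of memory"), which can happen on a shared machine when the default device 0 is already full. A tiny lookup sketch of the low-numbered codes (hand-copied illustrative subset, not generated from `cuda.h`):

```python
# Illustrative subset of cudaError_t codes from the CUDA Runtime API.
CUDA_ERRORS = {
    0: "cudaSuccess",
    1: "cudaErrorInvalidValue",
    2: "cudaErrorMemoryAllocation",    # cudaGetErrorString: "out of memory"
    3: "cudaErrorInitializationError",
    100: "cudaErrorNoDevice",
    101: "cudaErrorInvalidDevice",
}

def describe(code: int) -> str:
    """Map a numeric CUDA runtime error code to its enum name."""
    return CUDA_ERRORS.get(code, f"unknown cudaError_t ({code})")
```

If that is the cause here, restricting the sample to an idle GPU (e.g. `CUDA_VISIBLE_DEVICES=3 python sample.py`) may be worth trying before anything else.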

zerollzeng commented 2 years ago
```
[ ERROR: CUDA Runtime ] invalid configuration argument
[10/17/2022-21:53:41] [E] Error[1]: [caskBuilderUtils.h::transform::204] Error Code 1: Cask (CASK Transform Weights Failed)
```

Can you try our official docker images?
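For reference, the official images live in the NGC catalog under `nvcr.io/nvidia/tensorrt`. A sketch of trying the repro inside one (the `22.08-py3` tag is only an example — pick the tag whose TensorRT/CUDA versions match your setup from the NGC release notes, and note that `--gpus all` requires the NVIDIA Container Toolkit):

```shell
docker pull nvcr.io/nvidia/tensorrt:22.08-py3
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace \
    nvcr.io/nvidia/tensorrt:22.08-py3 \
    trtexec --onnx=/workspace/yolov5s.onnx \
            --saveEngine=/workspace/yolov5s.engine --int8
```

If the build succeeds in the container, the local cuDNN mismatch from the log (linked against 8.4.1 but loaded 8.2.0) is a likely suspect.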

ttyio commented 1 year ago

Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions, thanks!