Yutong-gannis / ETSAuto

🚚 ETSAuto is an advanced driver-assistance system (ADAS) for Euro Truck Simulator 2, providing Lane Centering Control (LCC) and Auto Lane Change (ALC).
MIT License

Error when building the TensorRT engine file for CLRNet #35

Closed CrazyMustard-404 closed 1 year ago

CrazyMustard-404 commented 1 year ago

The `llamas_dla34_tmp.onnx` file was generated successfully, but building the engine file fails with the following error:

```
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:180 In function parseGraph:
[6] Invalid Node - Pad_237
[shuffleNode.cpp::nvinfer1::builder::ShuffleNode::symbolicExecute::392] Error Code 4: Internal Error (Reshape_226: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[03/29/2023-16:55:56] [E] Failed to parse onnx file
[03/29/2023-16:55:56] [I] Finish parsing network model
[03/29/2023-16:55:56] [E] Parsing model failed
[03/29/2023-16:55:56] [E] Failed to create engine from model or file.
[03/29/2023-16:55:56] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
```

Yutong-gannis commented 1 year ago

Is this all of the error output?

CrazyMustard-404 commented 1 year ago

> Is this all of the error output?

Sorry, I only pasted part of it. Here is the complete output:

```
&&&& RUNNING TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
[03/29/2023-16:55:55] [I] === Model Options ===
[03/29/2023-16:55:55] [I] Format: ONNX
[03/29/2023-16:55:55] [I] Model: ./engines/llamas_dla34_tmp.onnx
[03/29/2023-16:55:55] [I] Output:
[03/29/2023-16:55:55] [I] === Build Options ===
[03/29/2023-16:55:55] [I] Max batch: explicit batch
[03/29/2023-16:55:55] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[03/29/2023-16:55:55] [I] minTiming: 1
[03/29/2023-16:55:55] [I] avgTiming: 8
[03/29/2023-16:55:55] [I] Precision: FP32
[03/29/2023-16:55:55] [I] LayerPrecisions:
[03/29/2023-16:55:55] [I] Calibration:
[03/29/2023-16:55:55] [I] Refit: Disabled
[03/29/2023-16:55:55] [I] Sparsity: Disabled
[03/29/2023-16:55:55] [I] Safe mode: Disabled
[03/29/2023-16:55:55] [I] DirectIO mode: Disabled
[03/29/2023-16:55:55] [I] Restricted mode: Disabled
[03/29/2023-16:55:55] [I] Build only: Disabled
[03/29/2023-16:55:55] [I] Save engine: ./engines/llamas_dla34.engine
[03/29/2023-16:55:55] [I] Load engine:
[03/29/2023-16:55:55] [I] Profiling verbosity: 0
[03/29/2023-16:55:55] [I] Tactic sources: Using default tactic sources
[03/29/2023-16:55:55] [I] timingCacheMode: local
[03/29/2023-16:55:55] [I] timingCacheFile:
[03/29/2023-16:55:55] [I] Input(s)s format: fp32:CHW
[03/29/2023-16:55:55] [I] Output(s)s format: fp32:CHW
[03/29/2023-16:55:55] [I] Input build shapes: model
[03/29/2023-16:55:55] [I] Input calibration shapes: model
[03/29/2023-16:55:55] [I] === System Options ===
[03/29/2023-16:55:55] [I] Device: 0
[03/29/2023-16:55:55] [I] DLACore:
[03/29/2023-16:55:55] [I] Plugins:
[03/29/2023-16:55:55] [I] === Inference Options ===
[03/29/2023-16:55:55] [I] Batch: Explicit
[03/29/2023-16:55:55] [I] Input inference shapes: model
[03/29/2023-16:55:55] [I] Iterations: 10
[03/29/2023-16:55:55] [I] Duration: 3s (+ 200ms warm up)
[03/29/2023-16:55:55] [I] Sleep time: 0ms
[03/29/2023-16:55:55] [I] Idle time: 0ms
[03/29/2023-16:55:55] [I] Streams: 1
[03/29/2023-16:55:55] [I] ExposeDMA: Disabled
[03/29/2023-16:55:55] [I] Data transfers: Enabled
[03/29/2023-16:55:55] [I] Spin-wait: Disabled
[03/29/2023-16:55:55] [I] Multithreading: Disabled
[03/29/2023-16:55:55] [I] CUDA Graph: Disabled
[03/29/2023-16:55:55] [I] Separate profiling: Disabled
[03/29/2023-16:55:55] [I] Time Deserialize: Disabled
[03/29/2023-16:55:55] [I] Time Refit: Disabled
[03/29/2023-16:55:55] [I] Inputs:
[03/29/2023-16:55:55] [I] === Reporting Options ===
[03/29/2023-16:55:55] [I] Verbose: Disabled
[03/29/2023-16:55:55] [I] Averages: 10 inferences
[03/29/2023-16:55:55] [I] Percentile: 99
[03/29/2023-16:55:55] [I] Dump refittable layers: Disabled
[03/29/2023-16:55:55] [I] Dump output: Disabled
[03/29/2023-16:55:55] [I] Profile: Disabled
[03/29/2023-16:55:55] [I] Export timing to JSON file:
[03/29/2023-16:55:55] [I] Export output to JSON file:
[03/29/2023-16:55:55] [I] Export profile to JSON file:
[03/29/2023-16:55:55] [I]
[03/29/2023-16:55:55] [I] === Device Information ===
[03/29/2023-16:55:55] [I] Selected Device: NVIDIA GeForce RTX 3090
[03/29/2023-16:55:55] [I] Compute Capability: 8.6
[03/29/2023-16:55:55] [I] SMs: 82
[03/29/2023-16:55:55] [I] Compute Clock Rate: 1.785 GHz
[03/29/2023-16:55:55] [I] Device Global Memory: 24575 MiB
[03/29/2023-16:55:55] [I] Shared Memory per SM: 100 KiB
[03/29/2023-16:55:55] [I] Memory Bus Width: 384 bits (ECC disabled)
[03/29/2023-16:55:55] [I] Memory Clock Rate: 9.751 GHz
[03/29/2023-16:55:55] [I]
[03/29/2023-16:55:55] [I] TensorRT version: 8.4.2
[03/29/2023-16:55:55] [I] [TRT] [MemUsageChange] Init CUDA: CPU +492, GPU +0, now: CPU 7429, GPU 1441 (MiB)
[03/29/2023-16:55:56] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +365, GPU +104, now: CPU 7984, GPU 1545 (MiB)
[03/29/2023-16:55:56] [I] Start parsing network model
[03/29/2023-16:55:56] [I] [TRT] ----------------------------------------------------------------
[03/29/2023-16:55:56] [I] [TRT] Input filename:   ./engines/llamas_dla34_tmp.onnx
[03/29/2023-16:55:56] [I] [TRT] ONNX IR version:  0.0.6
[03/29/2023-16:55:56] [I] [TRT] Opset version:    11
[03/29/2023-16:55:56] [I] [TRT] Producer name:    pytorch
[03/29/2023-16:55:56] [I] [TRT] Producer version: 1.9
[03/29/2023-16:55:56] [I] [TRT] Domain:
[03/29/2023-16:55:56] [I] [TRT] Model version:    0
[03/29/2023-16:55:56] [I] [TRT] Doc string:
[03/29/2023-16:55:56] [I] [TRT] ----------------------------------------------------------------
[03/29/2023-16:55:56] [W] [TRT] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/29/2023-16:55:56] [W] [TRT] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[03/29/2023-16:55:56] [W] [TRT] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[03/29/2023-16:55:56] [E] Error[4]: [shuffleNode.cpp::nvinfer1::builder::ShuffleNode::symbolicExecute::392] Error Code 4: Internal Error (Reshape_226: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:773: While parsing node number 237 [Pad -> "496"]:
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:775: input: "313" input: "494" input: "495" output: "496" name: "Pad_237" op_type: "Pad" attribute { name: "mode" s: "constant" type: STRING }

[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[03/29/2023-16:55:56] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:180 In function parseGraph:
[6] Invalid Node - Pad_237
[shuffleNode.cpp::nvinfer1::builder::ShuffleNode::symbolicExecute::392] Error Code 4: Internal Error (Reshape_226: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[03/29/2023-16:55:56] [E] Failed to parse onnx file
[03/29/2023-16:55:56] [I] Finish parsing network model
[03/29/2023-16:55:56] [E] Parsing model failed
[03/29/2023-16:55:56] [E] Failed to create engine from model or file.
[03/29/2023-16:55:56] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
```
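In logs this long, the two names that matter are the node trtexec rejects (`Pad_237`) and the node named in the internal-error detail (`Reshape_226`). A small helper (purely illustrative, not part of ETSAuto or TensorRT) can pull them out of a pasted log:

```python
import re

def failing_nodes(trtexec_log: str) -> list[str]:
    """Extract the ONNX node names that trtexec reports as failing.

    Matches the 'Invalid Node - <name>' line and the node named in the
    'Internal Error (<name>: ...' detail, de-duplicated in order.
    """
    names = re.findall(r"Invalid Node - (\w+)", trtexec_log)
    names += re.findall(r"Internal Error \((\w+):", trtexec_log)
    return list(dict.fromkeys(names))  # de-duplicate, preserve order

# The error lines from the log above:
log = ("[6] Invalid Node - Pad_237 ... Error Code 4: Internal Error "
       "(Reshape_226: IShuffleLayer applied to shape tensor must have "
       "0 or 1 reshape dimensions: dimensions were [-1,2])")
print(failing_nodes(log))  # → ['Pad_237', 'Reshape_226']
```

Those names can then be looked up in the ONNX graph (e.g. in Netron) to see which part of the CLRNet head produced them.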

```
(ADAS) D:\Project\Self-driving-Truck-in-Euro-Truck-Simulator2-main>trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
&&&& RUNNING TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
[03/29/2023-17:17:43] [I] === Model Options ===
[03/29/2023-17:17:43] [I] Format: ONNX
[03/29/2023-17:17:43] [I] Model: ./engines/llamas_dla34_tmp.onnx
[03/29/2023-17:17:43] [I] Output:
[03/29/2023-17:17:43] [I] === Build Options ===
[03/29/2023-17:17:43] [I] Max batch: explicit batch
[03/29/2023-17:17:43] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[03/29/2023-17:17:43] [I] minTiming: 1
[03/29/2023-17:17:43] [I] avgTiming: 8
[03/29/2023-17:17:43] [I] Precision: FP32
[03/29/2023-17:17:43] [I] LayerPrecisions:
[03/29/2023-17:17:43] [I] Calibration:
[03/29/2023-17:17:43] [I] Refit: Disabled
[03/29/2023-17:17:43] [I] Sparsity: Disabled
[03/29/2023-17:17:43] [I] Safe mode: Disabled
[03/29/2023-17:17:43] [I] DirectIO mode: Disabled
[03/29/2023-17:17:43] [I] Restricted mode: Disabled
[03/29/2023-17:17:43] [I] Build only: Disabled
[03/29/2023-17:17:43] [I] Save engine: ./engines/llamas_dla34.engine
[03/29/2023-17:17:43] [I] Load engine:
[03/29/2023-17:17:43] [I] Profiling verbosity: 0
[03/29/2023-17:17:43] [I] Tactic sources: Using default tactic sources
[03/29/2023-17:17:43] [I] timingCacheMode: local
[03/29/2023-17:17:43] [I] timingCacheFile:
[03/29/2023-17:17:43] [I] Input(s)s format: fp32:CHW
[03/29/2023-17:17:43] [I] Output(s)s format: fp32:CHW
[03/29/2023-17:17:43] [I] Input build shapes: model
[03/29/2023-17:17:43] [I] Input calibration shapes: model
[03/29/2023-17:17:43] [I] === System Options ===
[03/29/2023-17:17:43] [I] Device: 0
[03/29/2023-17:17:43] [I] DLACore:
[03/29/2023-17:17:43] [I] Plugins:
[03/29/2023-17:17:43] [I] === Inference Options ===
[03/29/2023-17:17:43] [I] Batch: Explicit
[03/29/2023-17:17:43] [I] Input inference shapes: model
[03/29/2023-17:17:43] [I] Iterations: 10
[03/29/2023-17:17:43] [I] Duration: 3s (+ 200ms warm up)
[03/29/2023-17:17:43] [I] Sleep time: 0ms
[03/29/2023-17:17:43] [I] Idle time: 0ms
[03/29/2023-17:17:43] [I] Streams: 1
[03/29/2023-17:17:43] [I] ExposeDMA: Disabled
[03/29/2023-17:17:43] [I] Data transfers: Enabled
[03/29/2023-17:17:43] [I] Spin-wait: Disabled
[03/29/2023-17:17:43] [I] Multithreading: Disabled
[03/29/2023-17:17:43] [I] CUDA Graph: Disabled
[03/29/2023-17:17:43] [I] Separate profiling: Disabled
[03/29/2023-17:17:43] [I] Time Deserialize: Disabled
[03/29/2023-17:17:43] [I] Time Refit: Disabled
[03/29/2023-17:17:43] [I] Inputs:
[03/29/2023-17:17:43] [I] === Reporting Options ===
[03/29/2023-17:17:43] [I] Verbose: Disabled
[03/29/2023-17:17:43] [I] Averages: 10 inferences
[03/29/2023-17:17:43] [I] Percentile: 99
[03/29/2023-17:17:43] [I] Dump refittable layers: Disabled
[03/29/2023-17:17:43] [I] Dump output: Disabled
[03/29/2023-17:17:43] [I] Profile: Disabled
[03/29/2023-17:17:43] [I] Export timing to JSON file:
[03/29/2023-17:17:43] [I] Export output to JSON file:
[03/29/2023-17:17:43] [I] Export profile to JSON file:
[03/29/2023-17:17:43] [I]
[03/29/2023-17:17:43] [I] === Device Information ===
[03/29/2023-17:17:43] [I] Selected Device: NVIDIA GeForce RTX 3090
[03/29/2023-17:17:43] [I] Compute Capability: 8.6
[03/29/2023-17:17:43] [I] SMs: 82
[03/29/2023-17:17:43] [I] Compute Clock Rate: 1.785 GHz
[03/29/2023-17:17:43] [I] Device Global Memory: 24575 MiB
[03/29/2023-17:17:43] [I] Shared Memory per SM: 100 KiB
[03/29/2023-17:17:43] [I] Memory Bus Width: 384 bits (ECC disabled)
[03/29/2023-17:17:43] [I] Memory Clock Rate: 9.751 GHz
[03/29/2023-17:17:43] [I]
[03/29/2023-17:17:43] [I] TensorRT version: 8.4.2
[03/29/2023-17:17:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +494, GPU +0, now: CPU 7658, GPU 1441 (MiB)
[03/29/2023-17:17:44] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +364, GPU +104, now: CPU 8216, GPU 1545 (MiB)
[03/29/2023-17:17:44] [I] Start parsing network model
[03/29/2023-17:17:44] [I] [TRT] ----------------------------------------------------------------
[03/29/2023-17:17:44] [I] [TRT] Input filename:   ./engines/llamas_dla34_tmp.onnx
[03/29/2023-17:17:44] [I] [TRT] ONNX IR version:  0.0.6
[03/29/2023-17:17:44] [I] [TRT] Opset version:    11
[03/29/2023-17:17:44] [I] [TRT] Producer name:    pytorch
[03/29/2023-17:17:44] [I] [TRT] Producer version: 1.9
[03/29/2023-17:17:44] [I] [TRT] Domain:
[03/29/2023-17:17:44] [I] [TRT] Model version:    0
[03/29/2023-17:17:44] [I] [TRT] Doc string:
[03/29/2023-17:17:44] [I] [TRT] ----------------------------------------------------------------
[03/29/2023-17:17:44] [W] [TRT] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/29/2023-17:17:44] [W] [TRT] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[03/29/2023-17:17:44] [W] [TRT] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[03/29/2023-17:17:44] [E] Error[4]: [shuffleNode.cpp::nvinfer1::builder::ShuffleNode::symbolicExecute::392] Error Code 4: Internal Error (Reshape_226: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[03/29/2023-17:17:44] [E] [TRT] ModelImporter.cpp:773: While parsing node number 237 [Pad -> "496"]:
[03/29/2023-17:17:44] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[03/29/2023-17:17:44] [E] [TRT] ModelImporter.cpp:775: input: "313" input: "494" input: "495" output: "496" name: "Pad_237" op_type: "Pad" attribute { name: "mode" s: "constant" type: STRING }

[03/29/2023-17:17:44] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[03/29/2023-17:17:44] [E] [TRT] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:180 In function parseGraph:
[6] Invalid Node - Pad_237
[shuffleNode.cpp::nvinfer1::builder::ShuffleNode::symbolicExecute::392] Error Code 4: Internal Error (Reshape_226: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])
[03/29/2023-17:17:44] [E] Failed to parse onnx file
[03/29/2023-17:17:44] [I] Finish parsing network model
[03/29/2023-17:17:44] [E] Parsing model failed
[03/29/2023-17:17:44] [E] Failed to create engine from model or file.
[03/29/2023-17:17:44] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8402] # trtexec --onnx=./engines/llamas_dla34_tmp.onnx --saveEngine=./engines/llamas_dla34.engine
```

Yutong-gannis commented 1 year ago

@CrazyMustard-404 First run a diagnostic on the onnx file:

```
polygraphy surgeon sanitize your_path/tusimple_r18.onnx --fold-constants --output your_path/tusimple_r18.onnx
```

CrazyMustard-404 commented 1 year ago

It looks normal after running the diagnostic:

```
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] RUNNING | Command: D:\anaconda\envs\ADAS\Scripts\polygraphy surgeon sanitize engines/llamas_dla34.onnx --fold-constants --output output/34.onnx
[I] Inferring shapes in the model with onnxruntime.tools.symbolic_shape_infer.
    Note: To force Polygraphy to use onnx.shape_inference instead, set allow_onnxruntime=False or use the --no-onnxruntime-shape-inference command-line option.
[I] Loading model: D:\Project\Self-driving-Truck-in-Euro-Truck-Simulator2-main\engines\llamas_dla34.onnx
[I] Original Model:
    Name: torch-jit-export | ONNX Opset: 11

    ---- 1 Graph Input(s) ----
    {input [dtype=float32, shape=(1, 3, 320, 800)]}

    ---- 1 Graph Output(s) ----
    {3076 [dtype=float32, shape=(1, 192, 78)]}

    ---- 222 Initializer(s) ----

    ---- 2603 Node(s) ----

[I] Folding Constants | Pass 1
[E] Module: 'onnx_graphsurgeon' version '0.3.12' is installed, but version '>=0.3.21' is required.
    Please install the required version or set POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to do so automatically. Attempting to continue with the currently installed version of this module, but note that this may cause errors!
[W] Constant folding pass failed. Skipping subsequent passes.
    Note: Error was: fold_constants() got an unexpected keyword argument 'size_threshold'
[I] Saving ONNX model to: output/34.onnx
[I] New Model:
    Name: torch-jit-export | ONNX Opset: 11

    ---- 1 Graph Input(s) ----
    {input [dtype=float32, shape=(1, 3, 320, 800)]}

    ---- 1 Graph Output(s) ----
    {3076 [dtype=float32, shape=(1, 192, 78)]}

    ---- 222 Initializer(s) ----

    ---- 2603 Node(s) ----

[I] PASSED | Runtime: 1.856s | Command: D:\anaconda\envs\ADAS\Scripts\polygraphy surgeon sanitize engines/llamas_dla34.onnx --fold-constants --output output/34.onnx
```
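Note that although the run reports `PASSED`, the constant-folding pass itself failed: the installed `onnx_graphsurgeon` 0.3.12 is older than the required `>=0.3.21`, so `34.onnx` came out unfolded. The log itself suggests the two fixes: `pip install "onnx_graphsurgeon>=0.3.21"` or setting `POLYGRAPHY_AUTOINSTALL_DEPS=1`. The version check being performed amounts to a numeric (not lexicographic) comparison, roughly like this sketch (helper names are illustrative, not Polygraphy's own API):

```python
def parse_version(ver: str) -> tuple[int, ...]:
    """Turn a dotted version string like '0.3.12' into a comparable tuple."""
    return tuple(int(part) for part in ver.split("."))

def meets_requirement(installed: str, required: str) -> bool:
    """True if installed >= required, comparing components numerically."""
    return parse_version(installed) >= parse_version(required)

# The situation in the log above: 0.3.12 installed, >=0.3.21 required.
print(meets_requirement("0.3.12", "0.3.21"))  # → False
```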

Yutong-gannis commented 1 year ago

@CrazyMustard-404 Try converting this 34.onnx to a TensorRT engine.

CrazyMustard-404 commented 1 year ago

> @CrazyMustard-404 Try converting this 34.onnx to a TensorRT engine.

I tried; it still fails with the same error.

Yutong-gannis commented 1 year ago

@CrazyMustard-404 You can try disabling lane detection first: https://github.com/Yutong-gannis/ETSAuto/blob/8f8e367b9949cbb9475063b1992a9ba7e401f0b7/script/main.py#L78-L80

CrazyMustard-404 commented 1 year ago

> @CrazyMustard-404 You can try disabling lane detection first:

https://github.com/Yutong-gannis/ETSAuto/blob/8f8e367b9949cbb9475063b1992a9ba7e401f0b7/script/main.py#L78-L80

Thanks! The problem is solved: it was a CUDA and TensorRT version issue. The working combination in the end: CUDA 11.8, cuDNN 8.8.0.121_cuda11, TensorRT-8.5.3.1, torch==1.13.1+cu117.