huihui308 closed this issue 11 months ago.
When I use trtexec (TensorRT 8.5.3.1) to generate an engine file, there is an error (full trtexec log below):
```
$ /usr/local/TensorRT-8.6.1.6/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
$ /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
&&&& RUNNING TensorRT.trtexec [TensorRT v8503] # /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
[12/04/2023-08:50:03] [W] --explicitBatch flag has been deprecated and has no effect!
[12/04/2023-08:50:03] [W] Explicit batch dim is automatically enabled if input model is ONNX or if dynamic shapes are provided when the engine is built.
[12/04/2023-08:50:03] [W] --workspace flag has been deprecated by --memPoolSize flag.
[12/04/2023-08:50:03] [I] === Model Options ===
[12/04/2023-08:50:03] [I] Format: ONNX
[12/04/2023-08:50:03] [I] Model: best.onnx
[12/04/2023-08:50:03] [I] Output:
[12/04/2023-08:50:03] [I] === Build Options ===
[12/04/2023-08:50:03] [I] Max batch: explicit batch
[12/04/2023-08:50:03] [I] Memory Pools: workspace: 1024 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[12/04/2023-08:50:03] [I] minTiming: 1
[12/04/2023-08:50:03] [I] avgTiming: 8
[12/04/2023-08:50:03] [I] Precision: FP32+FP16
[12/04/2023-08:50:03] [I] LayerPrecisions:
[12/04/2023-08:50:03] [I] Calibration:
[12/04/2023-08:50:03] [I] Refit: Disabled
[12/04/2023-08:50:03] [I] Sparsity: Disabled
[12/04/2023-08:50:03] [I] Safe mode: Disabled
[12/04/2023-08:50:03] [I] DirectIO mode: Disabled
[12/04/2023-08:50:03] [I] Restricted mode: Disabled
[12/04/2023-08:50:03] [I] Build only: Enabled
[12/04/2023-08:50:03] [I] Save engine: best_4.engine
[12/04/2023-08:50:03] [I] Load engine:
[12/04/2023-08:50:03] [I] Profiling verbosity: 0
[12/04/2023-08:50:03] [I] Tactic sources: Using default tactic sources
[12/04/2023-08:50:03] [I] timingCacheMode: local
[12/04/2023-08:50:03] [I] timingCacheFile:
[12/04/2023-08:50:03] [I] Heuristic: Disabled
[12/04/2023-08:50:03] [I] Preview Features: Use default preview flags.
[12/04/2023-08:50:03] [I] Input(s)s format: fp32:CHW
[12/04/2023-08:50:03] [I] Output(s)s format: fp32:CHW
[12/04/2023-08:50:03] [I] Input build shapes: model
[12/04/2023-08:50:03] [I] Input calibration shapes: model
[12/04/2023-08:50:03] [I] === System Options ===
[12/04/2023-08:50:03] [I] Device: 0
[12/04/2023-08:50:03] [I] DLACore:
[12/04/2023-08:50:03] [I] Plugins:
[12/04/2023-08:50:03] [I] === Inference Options ===
[12/04/2023-08:50:03] [I] Batch: Explicit
[12/04/2023-08:50:03] [I] Input inference shapes: model
[12/04/2023-08:50:03] [I] Iterations: 10
[12/04/2023-08:50:03] [I] Duration: 3s (+ 200ms warm up)
[12/04/2023-08:50:03] [I] Sleep time: 0ms
[12/04/2023-08:50:03] [I] Idle time: 0ms
[12/04/2023-08:50:03] [I] Streams: 1
[12/04/2023-08:50:03] [I] ExposeDMA: Disabled
[12/04/2023-08:50:03] [I] Data transfers: Enabled
[12/04/2023-08:50:03] [I] Spin-wait: Disabled
[12/04/2023-08:50:03] [I] Multithreading: Enabled
[12/04/2023-08:50:03] [I] CUDA Graph: Disabled
[12/04/2023-08:50:03] [I] Separate profiling: Disabled
[12/04/2023-08:50:03] [I] Time Deserialize: Disabled
[12/04/2023-08:50:03] [I] Time Refit: Disabled
[12/04/2023-08:50:03] [I] NVTX verbosity: 0
[12/04/2023-08:50:03] [I] Persistent Cache Ratio: 0
[12/04/2023-08:50:03] [I] Inputs:
[12/04/2023-08:50:03] [I] === Reporting Options ===
[12/04/2023-08:50:03] [I] Verbose: Disabled
[12/04/2023-08:50:03] [I] Averages: 10 inferences
[12/04/2023-08:50:03] [I] Percentiles: 90,95,99
[12/04/2023-08:50:03] [I] Dump refittable layers: Disabled
[12/04/2023-08:50:03] [I] Dump output: Disabled
[12/04/2023-08:50:03] [I] Profile: Disabled
[12/04/2023-08:50:03] [I] Export timing to JSON file:
[12/04/2023-08:50:03] [I] Export output to JSON file:
[12/04/2023-08:50:03] [I] Export profile to JSON file:
[12/04/2023-08:50:03] [I]
[12/04/2023-08:50:04] [I] === Device Information ===
[12/04/2023-08:50:04] [I] Selected Device: NVIDIA GeForce RTX 3080 Ti
[12/04/2023-08:50:04] [I] Compute Capability: 8.6
[12/04/2023-08:50:04] [I] SMs: 80
[12/04/2023-08:50:04] [I] Compute Clock Rate: 1.665 GHz
[12/04/2023-08:50:04] [I] Device Global Memory: 12042 MiB
[12/04/2023-08:50:04] [I] Shared Memory per SM: 100 KiB
[12/04/2023-08:50:04] [I] Memory Bus Width: 384 bits (ECC disabled)
[12/04/2023-08:50:04] [I] Memory Clock Rate: 9.501 GHz
[12/04/2023-08:50:04] [I]
[12/04/2023-08:50:04] [I] TensorRT version: 8.5.3
[12/04/2023-08:50:04] [I] [TRT] [MemUsageChange] Init CUDA: CPU +446, GPU +0, now: CPU 459, GPU 486 (MiB)
[12/04/2023-08:50:05] [I] Start parsing network model
[12/04/2023-08:50:05] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-08:50:05] [I] [TRT] Input filename:   best.onnx
[12/04/2023-08:50:05] [I] [TRT] ONNX IR version:  0.0.8
[12/04/2023-08:50:05] [I] [TRT] Opset version:    17
[12/04/2023-08:50:05] [I] [TRT] Producer name:    pytorch
[12/04/2023-08:50:05] [I] [TRT] Producer version: 2.1.0
[12/04/2023-08:50:05] [I] [TRT] Domain:
[12/04/2023-08:50:05] [I] [TRT] Model version:    0
[12/04/2023-08:50:05] [I] [TRT] Doc string:
[12/04/2023-08:50:05] [I] [TRT] ----------------------------------------------------------------
[12/04/2023-08:50:05] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/04/2023-08:50:05] [I] [TRT] No importer registered for op: Mod. Attempting to import as plugin.
[12/04/2023-08:50:05] [I] [TRT] Searching for plugin: Mod, plugin_version: 1, plugin_namespace:
[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:769: While parsing node number 160 [Mod -> "/model.12/Mod_output_0"]:
[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:770: --- Begin node ---
[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:771: input: "/model.12/Constant_output_0"
input: "/model.12/Constant_1_output_0"
output: "/model.12/Mod_output_0"
name: "/model.12/Mod"
op_type: "Mod"
attribute {
  name: "fmod"
  i: 0
  type: INT
}

[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:772: --- End node ---
[12/04/2023-08:50:05] [E] [TRT] ModelImporter.cpp:775: ERROR: builtin_op_importers.cpp:4870 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[12/04/2023-08:50:05] [E] Failed to parse onnx file
[12/04/2023-08:50:05] [I] Finish parsing network model
[12/04/2023-08:50:05] [E] Parsing model failed
[12/04/2023-08:50:05] [E] Failed to create engine from model or file.
[12/04/2023-08:50:05] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8503] # /usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best.onnx --saveEngine=best_4.engine --explicitBatch --fp16 --workspace=1024 --buildOnly --threads=8
```
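(Aside: as the warnings at the top of the log note, `--explicitBatch` is a no-op for ONNX models on this version and `--workspace` has been superseded by `--memPoolSize`. An equivalent invocation without the deprecated flags, as a sketch with the same paths and sizes as above, would be:)

```sh
# Same build without the deprecated flags: explicit batch is implied
# for ONNX inputs, and the workspace limit (in MiB) moves to --memPoolSize.
/usr/local/TensorRT-8.5.3.1/bin/trtexec \
    --onnx=best.onnx \
    --saveEngine=best_4.engine \
    --fp16 \
    --memPoolSize=workspace:1024 \
    --buildOnly \
    --threads=8
```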
Could you give me a hint on how to solve this issue without changing the TensorRT version? I ask because there is no error when I use TensorRT-8.6.1.6.
I'm not an expert on TensorRT, so I'm not able to answer your question.
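For anyone hitting the same error: the log shows that this parser version has no importer for the ONNX `Mod` op and falls back to a (missing) plugin, and that the failing `/model.12/Mod` node takes two constant inputs. One workaround that keeps TensorRT at 8.5.3.1 is to constant-fold the ONNX graph before building, so the Mod is evaluated offline and removed from the graph. A minimal, untested sketch (`best_folded.onnx` is a hypothetical output name):

```sh
# Fold constant subgraphs so the constant-input Mod node is computed
# offline and dropped from the graph (pip install polygraphy onnxruntime)
polygraphy surgeon sanitize best.onnx --fold-constants -o best_folded.onnx

# Alternative: onnx-simplifier performs the same folding
# (pip install onnxsim)
python -m onnxsim best.onnx best_folded.onnx

# Then rebuild the engine from the folded model
/usr/local/TensorRT-8.5.3.1/bin/trtexec --onnx=best_folded.onnx \
    --saveEngine=best_4.engine --fp16 --buildOnly
```

If folding does not remove the node (e.g. its inputs turn out not to be truly constant), the remaining options are changing the PyTorch export so it avoids `Mod`, or registering a `Mod` plugin, which is exactly what the parser was searching for.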