triple-Mu / YOLOv8-TensorRT

YOLOv8 accelerated with TensorRT!

Cannot detect any object on Jetson NX #117

Closed. codedao360 closed this issue 1 year ago.

codedao360 commented 1 year ago

I followed the steps, but no objects were detected when running inference on images, video, or camera input.

[Screenshot from 2023-06-19 09-24-59]

System: Jetson NX 8GB

~/temp/YOLOv8-TensorRT$ ./yolov8 yolov8s.engine data/bus.jpg
model warmup 10 times
cost 40.8140 ms
triple-Mu commented 1 year ago

How did you get the engine? Did you use export-det.py?

codedao360 commented 1 year ago

Yes. I followed the guideline here: https://github.com/triple-Mu/YOLOv8-TensorRT/blob/main/docs/Jetson.md I tried two methods to get the ONNX model:

  • from the Jetson NX
  • from an Ubuntu PC, using export-det.py

but neither method works.

Of course, I used trtexec on the Jetson to generate the .engine file, as in the guideline. The commands I ran are roughly the ones below.
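These match the guide as I recall it (the exact flags may differ from the doc; paths are mine):

python3 export-det.py --weights yolov8s.pt --sim --input-shape 1 3 640 640 --device cuda:0
/usr/src/tensorrt/bin/trtexec --onnx=yolov8s.onnx --saveEngine=yolov8s.engine --fp16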

triple-Mu commented 1 year ago

Have you tested the ONNX-to-engine conversion on your PC? I suggest checking detection accuracy on the PC first and then verifying it on the Jetson, for example with the command below.
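(a sketch; check infer-det.py's argparse for the exact flags)

python3 infer-det.py --engine yolov8s.engine --imgs data/bus.jpg --device cuda:0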

codedao360 commented 1 year ago

I will do it and come back to inform you.

codedao360 commented 1 year ago

I get an error when running inference with infer-det.py on my PC (GPU: RTX 2060 Super).

When converting to ONNX:

YOLOv8s summary (fused): 168 layers, 11156544 parameters, 0 gradients, 28.6 GFLOPs
[W shape_type_inference.cpp:1920] Warning: The shape inference of TRT::EfficientNMS_TRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
(this warning is printed four times)
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 4 WARNING 0 ERROR ========================
4 WARNING were not printed due to the log level.
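The export still completes despite those warnings. As a sanity check that the plugin node made it into the graph (a minimal sketch; yolov8s.onnx is my export output):

import onnx

model = onnx.load("yolov8s.onnx")
# the exported graph should contain one EfficientNMS_TRT node (the TensorRT plugin)
print([n.op_type for n in model.graph.node if n.op_type == "EfficientNMS_TRT"])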

Building the .engine with trtexec works fine:

[06/19/2023-20:27:16] [W] [TRT] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[06/19/2023-20:27:16] [I] [TRT] No importer registered for op: EfficientNMS_TRT. Attempting to import as plugin.
[06/19/2023-20:27:16] [I] [TRT] Searching for plugin: EfficientNMS_TRT, plugin_version: 1, plugin_namespace:
[06/19/2023-20:27:16] [W] [TRT] builtin_op_importers.cpp:5221: Attribute class_agnostic not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[06/19/2023-20:27:16] [I] [TRT] Successfully created plugin: EfficientNMS_TRT
[06/19/2023-20:27:16] [I] Finished parsing network model. Parse time: 0.141049
[06/19/2023-20:27:16] [I] [TRT] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[06/19/2023-20:27:16] [I] [TRT] Graph optimization time: 0.043449 seconds.
[06/19/2023-20:27:18] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +1165, GPU +202, now: CPU 2698, GPU 985 (MiB)
[06/19/2023-20:27:19] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +310, GPU +48, now: CPU 3008, GPU 1033 (MiB)
[06/19/2023-20:27:19] [I] [TRT] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[06/19/2023-20:27:19] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[06/19/2023-20:30:13] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[06/19/2023-20:30:14] [I] [TRT] Total Host Persistent Memory: 259936
[06/19/2023-20:30:14] [I] [TRT] Total Device Persistent Memory: 1420288
[06/19/2023-20:30:14] [I] [TRT] Total Scratch Memory: 16128768
[06/19/2023-20:30:14] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 12 MiB, GPU 261 MiB
[06/19/2023-20:30:14] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 200 steps to complete.
[06/19/2023-20:30:14] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 17.9408ms to assign 8 blocks to 200 nodes requiring 38248448 bytes.
[06/19/2023-20:30:14] [I] [TRT] Total Activation Memory: 38247424
[06/19/2023-20:30:14] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3059, GPU 1102 (MiB)
[06/19/2023-20:30:14] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +10, now: CPU 3060, GPU 1112 (MiB)
[06/19/2023-20:30:14] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +6, GPU +52, now: CPU 6, GPU 52 (MiB)
[06/19/2023-20:30:14] [I] Engine built in 192.483 sec.
[06/19/2023-20:30:14] [I] [TRT] Loaded engine size: 51 MiB
[06/19/2023-20:30:15] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 2178, GPU 918 (MiB)
[06/19/2023-20:30:15] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 2178, GPU 926 (MiB)
[06/19/2023-20:30:15] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +51, now: CPU 0, GPU 51 (MiB)
[06/19/2023-20:30:15] [I] Engine deserialized in 0.246446 sec.

But when I run infer-det.py:

[06/19/2023-20:33:53] [TRT] [E] 1: [stdArchiveReader.cpp::StdArchiveReader::30] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
[06/19/2023-20:33:53] [TRT] [E] 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)

Traceback (most recent call last):
  File "/home/tungpx2/temp/YOLOv8-TensorRT/infer-det.py", line 82, in <module>
    main(args)
  File "/home/tungpx2/temp/YOLOv8-TensorRT/infer-det.py", line 15, in main
    Engine = TRTModule(args.engine, device)
  File "/home/tungpx2/temp/YOLOv8-TensorRT/models/engine.py", line 218, in __init__
    self.__init_engine()
  File "/home/tungpx2/temp/YOLOv8-TensorRT/models/engine.py", line 227, in __init_engine
    context = model.create_execution_context()
AttributeError: 'NoneType' object has no attribute 'create_execution_context'

triple-Mu commented 1 year ago

trtexec's TensorRT version does not match your Python tensorrt package, so the engine serialized by trtexec cannot be deserialized by the Python runtime. I suggest using build.py to build the engine, as sketched below.
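A quick way to confirm the mismatch (a minimal sketch; compare this with the TensorRT version banner trtexec prints when it runs):

import tensorrt as trt
# infer-det.py deserializes with this version; it must match the TensorRT that built the engine
print(trt.__version__)

Rebuilding with build.py serializes the engine with this same Python tensorrt. Roughly (check build.py's argparse for the exact flags):

python3 build.py --weights yolov8s.onnx --fp16 --device cuda:0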

triple-Mu commented 1 year ago

Do not install tensorrt with pip install tensorrt if you want to use the C++ API. You can download TensorRT.xxxx.tar.gz and find the matching Python package in TensorRT.xxxx/python, for example as below.
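Assuming an x86_64 Linux tarball (the version numbers and Python tag are placeholders; use your own):

tar -xzf TensorRT-x.x.x.x.Linux.x86_64-gnu.cuda-xx.x.tar.gz
python3 -m pip install TensorRT-x.x.x.x/python/tensorrt-x.x.x.x-cpXX-none-linux_x86_64.whl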

triple-Mu commented 1 year ago

Closed. Feel free to reopen it if you have further questions. Thanks!