marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

Slow Model Loading for Compiled Models with YOLO layers on Orin NX #479

Closed g0lemXIV closed 5 months ago

g0lemXIV commented 8 months ago

DeepStream Version: 6.2-triton
DeepStream-Yolo Version: Latest
JetPack Version: 5.1 (L4T 35.2.1)
ONNX Opset: 12

I encountered a problem with model loading time when using compiled models with YOLO layers. When I initialize my pipeline, there's a significant delay (~1 minute) before it starts processing the data. This latency is only observable on the Orin NX architecture. When tested on the Xavier NX, the models load almost instantly.

Key Observations:

- Models tested: YOLOX, YOLOv8, and NVIDIA's default TrafficCamNet model.
- Consistent behavior: the default model starts instantly, while the YOLO models exhibit the delay.
- Docker container: running the model inside a container shows interesting behaviour. The first initialization takes a significant amount of time, but if I kill and restart the container, the models load quickly, similar to the first start on Xavier NX.
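The first-start-slow, restart-fast pattern inside a container is consistent with something being built or cached on the first run; in this repo, nvinfer serializes a TensorRT engine file next to the model. A hedged sketch to check whether a cached engine survives container restarts (the engine filename below is an assumption; DeepStream typically derives it from batch size, GPU index, and precision):

```shell
# Sketch under assumptions: the engine path is illustrative; DeepStream
# usually writes something like model_b1_gpu0_fp16.engine next to the model.
ENGINE="model_b1_gpu0_fp16.engine"
if [ -f "$ENGINE" ]; then
    echo "cached engine present: $ENGINE"
else
    echo "no cached engine: $ENGINE will be built on first start (slow)"
fi
```

If the engine file lands in the container's writable layer, it is lost when the container is recreated; mounting the model directory as a volume (`docker run -v ...`) keeps it across restarts.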

While the overall performance and accuracy are satisfactory, I'm keen to understand the root cause of this initial delay and would welcome suggestions on how to debug and pinpoint which element/component is causing it.

Are there any known optimizations or settings specific to the Orin NX that I might need to include?

Are there tools or logs that can be used to debug this delay?
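A generic way to pinpoint which step eats the minute is to bracket each initialization call with a timer. A minimal sketch in plain Python (no DeepStream APIs assumed; the commented pipeline calls are placeholders):

```python
import time

def timed(label, fn, *args, **kwargs):
    """Call fn(*args, **kwargs), print how long it took, and return the result.

    Wrap each pipeline init step (element creation, linking, set_state)
    to see which one accounts for the startup delay.
    """
    t0 = time.monotonic()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.monotonic() - t0:.2f}s")
    return result

# Illustrative usage (placeholders, not real calls from this repo):
# pipeline = timed("parse_launch", Gst.parse_launch, PIPELINE_STR)
# timed("set_state PLAYING", pipeline.set_state, Gst.State.PLAYING)
```

On the GStreamer side, setting the `GST_DEBUG` environment variable (e.g. `GST_DEBUG=3`, or a higher per-category level) also produces timestamped log lines that show where plugin initialization time goes.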

g0lemXIV commented 5 months ago

We can close the issue; we found the problem.

We had set the scaling-compute-hw parameter to 1, which caused the slow loading on Orin NX.
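For reference, `scaling-compute-hw` lives in the `[property]` group of the nvinfer config file; per the Gst-nvinfer documentation, 0 selects the platform default, 1 the GPU, and 2 the VIC (Jetson only). Verify the enum against your DeepStream release. A minimal fragment with the default restored:

```ini
[property]
# Scaling hardware for the preprocessing stage:
#   0 = platform default, 1 = GPU, 2 = VIC (Jetson only)
# Values per Gst-nvinfer docs; check them for your DeepStream version.
scaling-compute-hw=0
```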