marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

Slow Model Loading for Compiled Models with YOLO layers on Orin NX #479

Closed g0lemXIV closed 5 months ago

g0lemXIV commented 8 months ago

DeepStream Version: 6.2-triton
DeepStream-Yolo Version: Latest
JetPack Version: 5.1 (L4T 35.2.1)
ONNX Opset: 12

I encountered a problem with model loading time when using compiled models with YOLO layers. When I initialize my pipeline, there's a significant delay (~1 minute) before it starts processing the data. This latency is only observable on the Orin NX architecture. When tested on the Xavier NX, the models load almost instantly.

Key Observations:

- Models tested: YOLOX, YOLOv8, and NVIDIA's default TrafficCamNet model.
- Consistent behavior: the default model starts instantly, while the YOLO models exhibit the delay.
- Docker container: running the model inside a container shows interesting behaviour. The first initialization takes a significant amount of time, but if I kill and restart the container, the models load quickly, similar to the first start on Xavier NX.
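The first-start-slow, restart-fast pattern inside a container is consistent with something being built or cached on the first run; in this repo, nvinfer serializes a TensorRT engine file next to the model. A hedged sketch to check whether a cached engine survives container restarts (the engine filename below is an assumption; DeepStream typically derives it from batch size, GPU index, and precision):

```shell
# Sketch under assumptions: the engine path is illustrative; DeepStream
# usually writes something like model_b1_gpu0_fp16.engine next to the model.
ENGINE="model_b1_gpu0_fp16.engine"
if [ -f "$ENGINE" ]; then
    echo "cached engine present: $ENGINE"
else
    echo "no cached engine: $ENGINE will be built on first start (slow)"
fi
```

If the engine file lands in the container's writable layer, it is lost when the container is recreated; mounting the model directory as a volume (`docker run -v ...`) keeps it across restarts.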

While the overall performance and accuracy are satisfactory, I'm keen to understand the root cause of this initial delay and would welcome suggestions on how to debug and pinpoint which element/component is causing it.

Are there any known optimizations or settings specific to the Orin NX that I might need to include?

Are there tools or logs that can be used to debug this delay?
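A generic way to pinpoint which step eats the minute is to bracket each initialization call with a timer. A minimal sketch in plain Python (no DeepStream APIs assumed; the commented pipeline calls are placeholders):

```python
import time

def timed(label, fn, *args, **kwargs):
    """Call fn(*args, **kwargs), print how long it took, and return the result.

    Wrap each pipeline init step (element creation, linking, set_state)
    to see which one accounts for the startup delay.
    """
    t0 = time.monotonic()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.monotonic() - t0:.2f}s")
    return result

# Illustrative usage (placeholders, not real calls from this repo):
# pipeline = timed("parse_launch", Gst.parse_launch, PIPELINE_STR)
# timed("set_state PLAYING", pipeline.set_state, Gst.State.PLAYING)
```

On the GStreamer side, setting the `GST_DEBUG` environment variable (e.g. `GST_DEBUG=3`, or a higher per-category level) also produces timestamped log lines that show where plugin initialization time goes.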

g0lemXIV commented 5 months ago

We can close the issue; we found the problem.

We had set the scaling-compute-hw parameter to 1, which caused the slow loading on Orin NX.
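For reference, `scaling-compute-hw` lives in the `[property]` group of the nvinfer config file; per the Gst-nvinfer documentation, 0 selects the platform default, 1 the GPU, and 2 the VIC (Jetson only). Verify the enum against your DeepStream release. A minimal fragment with the default restored:

```ini
[property]
# Scaling hardware for the preprocessing stage:
#   0 = platform default, 1 = GPU, 2 = VIC (Jetson only)
# Values per Gst-nvinfer docs; check them for your DeepStream version.
scaling-compute-hw=0
```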