marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

Performance drop when using multiple sources #530

Closed flmello closed 2 months ago

flmello commented 2 months ago

• Hardware Platform (Jetson / GPU): Jetson Xavier AGX
• DeepStream Version: 6.3.0
• JetPack Version (valid for Jetson only): 5.1
• TensorRT Version: 8.5.2.2
• Model: yolov5s.pt exported with --dynamic to onnx

I know AGX Xavier supports 52x 1080p30 (H.265), and deepstream-test3-app can run several sources with hardly any FPS drop (I stopped testing at 22 sources). I have tried many changes to my script to reach 24.xxx fps, without success. I am now using deepstream_test_3.py from the Python bindings repository. The pipeline is minimal, streammux -> queue -> pgie; see deepstream_test_3.py.txt. I run it with: python3 deepstream_test_3.py --no-display -i rtsp://admin:pass@10.21.45.19:554 rtsp://admin:pass@10.21.45.19:554 rtsp://admin:pass@10.21.45.19:554

My source is an RTSP stream at 1920x1080@25fps. When I run it with the original config file dstest3_pgie_config.txt, which loads a ResNet10 model, I get what I expected, about 24.xxx fps on all 3 sources: **PERF: {'stream0': 24.97, 'stream1': 24.97, 'stream2': 24.97}

But with config_infer_primary_yoloV5.txt I get: **PERF: {'stream0': 15.99, 'stream1': 15.99, 'stream2': 15.99}

I know YOLOv5 is much more complex and has many more layers than ResNet10, so it is slower. However, as far as I know, my hardware can easily sustain 24.xxx fps with YOLO, and I have not been able to find the cause of this performance drop. I don't know whether something is wrong with the config file or with the export. Does anyone have a tip?

marcoslucianops commented 2 months ago

Try setting the board to MAXN mode:

sudo nvpmodel -m 0

marcoslucianops commented 2 months ago

https://github.com/marcoslucianops/DeepStream-Yolo/issues/450#issuecomment-1712267149

Remember to set it in the config_infer_primary_yoloV5.txt file.

flmello commented 2 months ago

My board was already in MAXN mode, and my sink's sync property was also set to 0.

I changed FP32 to FP16 by setting network-mode=2. However, I see only a marginal change: it was running at 15.99 fps and is now at 16.18 fps. I had understood from #450 (comment) that this should have boosted the processing, but it did not. Also, the engine that was created is model_b4_gpu0_fp32.engine, and I believe it should be model_b4_gpu0_fp16.engine. I checked whether my script was changing network-mode to 0, but there is no set_property like that.
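The mismatch described above (network-mode=2 in the config, but an engine file ending in fp32) can be spotted with a small check. This is only a sketch: the file-name pattern is inferred from the two engine names quoted in this comment, and the helper name `engine_matches_mode` is hypothetical, not part of DeepStream.

```python
import re

# Assumed naming pattern, inferred from the file names quoted in this thread:
# model_b<batch>_gpu<id>_<precision>.engine
ENGINE_NAME = re.compile(r"model_b(\d+)_gpu(\d+)_(fp32|fp16|int8)\.engine$")

# network-mode values as listed in the config comment: 0=FP32, 1=INT8, 2=FP16
NETWORK_MODE = {0: "fp32", 1: "int8", 2: "fp16"}


def engine_matches_mode(engine_path, network_mode):
    """Return True if the engine file name carries the precision for network_mode."""
    match = ENGINE_NAME.search(engine_path)
    if match is None:
        raise ValueError(f"unrecognized engine file name: {engine_path}")
    return match.group(3) == NETWORK_MODE[network_mode]


print(engine_matches_mode("model_b4_gpu0_fp32.engine", 2))  # False
print(engine_matches_mode("model_b4_gpu0_fp16.engine", 2))  # True
```

A False result for the engine that was actually built suggests that something other than the config file decided the precision, which is worth ruling out before looking at the model itself.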

marcoslucianops commented 2 months ago

The network-mode should be set on the config_infer_primary file (according to your yolo model).

flmello commented 2 months ago

> The network-mode should be set on the config_infer_primary file (according to your yolo model).

Yes, I am setting it to 2. Are there more properties to be set? I am using the config_infer_primary_yoloV5.txt, just like this:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR, 2=Gray
model-color-format=0

onnx-file=/home/ubuntu/EdgeServer/model/yolov5s.onnx
labelfile-path=/home/ubuntu/EdgeServer/model/labels_yolov5s.txt
num-detected-classes=80

batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2

interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
parse-bbox-func-name=NvDsInferParseYolo
#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=/home/ubuntu/EdgeServer/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300

flmello commented 2 months ago

There was a set_property call overriding network-mode back to 0 instead of 2. With that fixed, I now get 25.3 fps with 4 cameras and 21.3 fps with 5 cameras.
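For comparison, the per-stream figures in this thread can be turned into aggregate frames per second. The sketch below only multiplies numbers already reported here; the labels and the helper name `aggregate_fps` are my own, and reading the 5-camera result as a throughput ceiling is an assumption, not a measurement.

```python
# Per-stream FPS figures reported in this thread: (label, streams, fps per stream)
runs = [
    ("YOLOv5s, network-mode overridden to 0 (FP32)", 3, 15.99),
    ("YOLOv5s, network-mode=2 (FP16)", 4, 25.3),
    ("YOLOv5s, network-mode=2 (FP16)", 5, 21.3),
]


def aggregate_fps(streams, fps_per_stream):
    """Total frames per second processed across all streams."""
    return streams * fps_per_stream


for label, streams, fps in runs:
    total = aggregate_fps(streams, fps)
    print(f"{label}: {streams} x {fps} = {total:.1f} frames/s")
```

That is roughly 48 frames/s before the fix against about 101 and 106 frames/s after it. With 25 fps sources, the 4-camera run is limited by the cameras, while the 5-camera run appears to be limited by the pipeline.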

As @marcoslucianops said before, and just to summarize, the solution is:
1) Set the board to MAXN: sudo nvpmodel -m 0
2) In the script: sink.set_property("sync", 0)
3) In the config file: network-mode=2 (network-mode: 0=FP32, 1=INT8, 2=FP16)

Of these three actions, number 3) had by far the largest effect.
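Steps 2) and 3) can be checked without running the pipeline. This is a minimal sketch using only the Python standard library. The helper names and both text excerpts are hypothetical, and the `pgie.set_property("network-mode", 0)` line merely stands in for the kind of override found in my script.

```python
import configparser
import re

# Matches calls such as element.set_property("name", value)
SET_PROPERTY = re.compile(
    r"""(\w+)\.set_property\(\s*["']([\w-]+)["']\s*,\s*([^)]+)\)"""
)


def config_network_mode(config_text):
    """Return the network-mode value from the [property] section as an int."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getint("property", "network-mode")


def set_property_calls(script_text, name):
    """Return (element, value) pairs for every set_property call on `name`."""
    return [
        (m.group(1), m.group(3).strip())
        for m in SET_PROPERTY.finditer(script_text)
        if m.group(2) == name
    ]


# Hypothetical excerpts, for illustration only.
config_text = """
[property]
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
"""
script_text = '''
sink.set_property("sync", 0)
pgie.set_property("network-mode", 0)
'''

print(config_network_mode(config_text))                 # 2
print(set_property_calls(script_text, "sync"))          # [('sink', '0')]
print(set_property_calls(script_text, "network-mode"))  # [('pgie', '0')]
```

Any hit for network-mode in the script whose value differs from the config file is a candidate for the kind of override that caused the problem here.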