marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

Issues running yoloV8 with multiple streams #450

Closed mgabell closed 10 months ago

mgabell commented 10 months ago

Hi,

I tried running two streams with yoloV8 using the deepstream-test3 example. It can start one stream (an mp4 video), but when I add a second source it can't. Are there any restrictions on the config file or model that I need to consider? Batch-size is managed automatically by the program.

marcoslucianops commented 10 months ago

What's the error?

mgabell commented 10 months ago

Hi, this is the output once the pipeline is generated:

Now playing...
0 : file:///home/xxx/Development/output_video.mp4
1 : file:///home/xxx/Development/output_video_1.mp4
Starting pipeline

WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
Deserialize yoloLayer plugin: yolo
0:00:04.822380066 14554 0x14c62490 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() [UID = 1]: deserialized trt engine from :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT data 3x640x640
1 OUTPUT kFLOAT num_detections 1
2 OUTPUT kFLOAT detection_boxes 8400x4
3 OUTPUT kFLOAT detection_scores 8400
4 OUTPUT kFLOAT detection_classes 8400

0:00:05.033968190 14554 0x14c62490 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() [UID = 1]: Backend has maxBatchSize 1 whereas 2 has been requested
0:00:05.034041983 14554 0x14c62490 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() [UID = 1]: deserialized backend context :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine failed to match config params, trying rebuild
0:00:05.052300070 14554 0x14c62490 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() [UID = 1]: Trying to create engine from model files
YOLO config file or weights file is not specified

ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:07.391628367 14554 0x14c62490 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() [UID = 1]: build engine file failed
0:00:07.606088256 14554 0x14c62490 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() [UID = 1]: build backend context failed
0:00:07.607372073 14554 0x14c62490 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() [UID = 1]: generate backend failed, check config file settings
0:00:07.609294423 14554 0x14c62490 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Failed to create NvDsInferContext instance
0:00:07.609316952 14554 0x14c62490 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

**PERF: {'stream0': 0.0, 'stream1': 0.0}

Error: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-inference: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

marcoslucianops commented 10 months ago

YOLO config file or weights file is not specified

This is the error

marcoslucianops commented 10 months ago

Please check the model path in the config_infer_primary_yoloV8.txt file.
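For reference, these are the keys involved (values here are examples; as far as I know, nvinfer resolves relative paths against the directory containing the config file, so a config in a different directory than the model will fail this way):

```ini
[property]
# Paths are resolved relative to this config file's location
onnx-file=yolov8s.onnx
model-engine-file=model_b1_gpu0_fp32.engine
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
```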

mgabell commented 10 months ago

What should I check for? The model yolov8s.onnx is located in the same folder as the script I run. This file works with one media file.

This is the [property] section of the config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=yolov8s.onnx
model-engine-file=model_b1_gpu0_fp32.engine
int8-calib-file=calib.table
labelfile-path=labels.txt
batch-size=1
network-mode=0
num-detected-classes=6
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
force-implicit-batch-dim=1
workspace-size=1000
parse-bbox-func-name=NvDsInferParseYolo
parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

The command I run: python3 deepstream_test_3.py -i file:///home/aiadmin/Development/output_video.mp4 file:///home/aiadmin/Development/output_video_1.mp4

marcoslucianops commented 10 months ago

Is the yolov8s.onnx located in the same directory as the config_infer_primary_yoloV8.txt file?

mgabell commented 10 months ago

Is the yolov8s.onnx located in the same directory as the config_infer_primary_yoloV8.txt file?

Yes

mgabell commented 10 months ago

This works: python3 deepstream_test_3.py -i file:///home/xxx/Development/output_video.mp4

This does not: python3 deepstream_test_3.py -i file:///home/xxx/Development/output_video.mp4 file:///home/aiadmin/Development/output_video_1.mp4

marcoslucianops commented 10 months ago

You are using an old nvdsinfer_custom_impl_Yolo that doesn't support ONNX models. Please re-export the model with the updated exporter, update DeepStream-Yolo to the latest version, and try again.

mgabell commented 10 months ago

You are using an old nvdsinfer_custom_impl_Yolo that doesn't support ONNX models. Please re-export the model with the updated exporter, update DeepStream-Yolo to the latest version, and try again.

How can I update that? I git cloned DeepStream-Yolo today, and I git cloned ultralytics today.

I copied the export_yoloV8.py file from the DeepStream-Yolo/utils directory to the ultralytics folder, then downloaded this: wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt

I exported to ONNX, then copied nvdsinfer_custom_impl_Yolo from your DeepStream-Yolo folder to my copy of deepstream-test3. So in this folder (deepstream-test3) I have ./nvdsinfer_custom_impl_Yolo
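A sketch of that export workflow, for anyone following along (the `--dynamic` flag is assumed from the DeepStream-Yolo utils — it enables dynamic batch size so one engine can serve multiple streams; check the repo README for the exact flags in your version):

```shell
git clone https://github.com/marcoslucianops/DeepStream-Yolo.git
git clone https://github.com/ultralytics/ultralytics.git
cp DeepStream-Yolo/utils/export_yoloV8.py ultralytics/
cd ultralytics
wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt
# Export to ONNX with dynamic batch; the ONNX file is written next to the weights
python3 export_yoloV8.py -w yolov8s.pt --dynamic
```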

ERROR: Could not open lib: /home/aiadmin/Development/deepstream-test3/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so, error string: /home/aiadmin/Development/deepstream-test3/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so: cannot open shared object file: No such file or directory
0:00:00.297142754 18759 0x413c6c90 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() [UID = 1]: Could not open custom lib: (null)
0:00:00.297205282 18759 0x413c6c90 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Failed to create NvDsInferContext instance
0:00:00.297220194 18759 0x413c6c90 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED
Error: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-inference: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED
Exiting app

mgabell commented 10 months ago


Ahh I must build that first...
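For anyone else hitting the same "cannot open shared object file": the parser lib has to be compiled before custom-lib-path can load it — a sketch, assuming JetPack's CUDA 11.4 (set CUDA_VER to whatever nvcc --version reports on your platform):

```shell
cd /home/aiadmin/Development/deepstream-test3
export CUDA_VER=11.4   # must match your installed CUDA version
make -C nvdsinfer_custom_impl_Yolo
ls nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so  # should exist now
```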

mgabell commented 10 months ago

Closer, but still an error:

{'input': ['file:///home/aiadmin/Development/output_video.mp4', 'file:///home/aiadmin/Development/output_video_1.mp4'], 'configfile': None, 'pgie': None, 'no_display': False, 'file_loop': False, 'disable_probe': False, 'silent': False}
Creating Pipeline
Creating streamux
Creating source_bin 0
Creating source bin source-bin-00
Creating source_bin 1
Creating source bin source-bin-01
Creating Pgie
Creating tiler
Creating nvvidconv
Creating nvosd
Creating nv3dsink
WARNING: Overriding infer-config batch-size 1 with number of sources 2
Adding elements to Pipeline
Linking elements in the Pipeline
Now playing...
0 : file:///home/aiadmin/Development/output_video.mp4
1 : file:///home/aiadmin/Development/output_video_1.mp4
Starting pipeline
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
Deserialize yoloLayer plugin: yolo
Segmentation fault (core dumped)

marcoslucianops commented 10 months ago

Did you delete the old engine file?

mgabell commented 10 months ago

I erased model_b1_gpu0_fp32.engine and model_b2_gpu0_fp32.engine from the folder where I run this. Then I am back to:

Now playing...
0 : file:///home/aiadmin/Development/output_video.mp4
Starting pipeline

WARNING: Deserialize engine failed because file path: /home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine open error
0:00:05.075367678 20755 0x26213d50 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() [UID = 1]: deserialize engine from file :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine failed
0:00:05.291829861 20755 0x26213d50 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() [UID = 1]: deserialize backend context from engine from file :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine failed, try rebuild
0:00:05.291889797 20755 0x26213d50 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.

Building the TensorRT Engine

ERROR: [TRT]: 4: [network.cpp::validate::3062] Error Code 4: Internal Error (Network has dynamic or shape inputs, but no optimization profile has been defined.)
Building engine failed

Failed to build CUDA engine
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:08.689616826 20755 0x26213d50 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() [UID = 1]: build engine file failed
0:00:08.904002283 20755 0x26213d50 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() [UID = 1]: build backend context failed
0:00:08.904065132 20755 0x26213d50 ERROR nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() [UID = 1]: generate backend failed, check config file settings
0:00:08.904111884 20755 0x26213d50 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Failed to create NvDsInferContext instance
0:00:08.904125196 20755 0x26213d50 WARN nvinfer gstnvinfer.cpp:888:gst_nvinfer_start: error: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

**PERF: {'stream0': 0.0}

Error: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-inference: Config file path: config_infer_primary_yoloV8.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Exiting app

marcoslucianops commented 10 months ago

Is the force-implicit-batch-dim=1 line commented out (#) in your config_infer_primary_yoloV8.txt file?

mgabell commented 10 months ago

No, it's active.

marcoslucianops commented 10 months ago

Change to

#force-implicit-batch-dim=1

mgabell commented 10 months ago

Now waiting for the TensorRT engine to build. I guess it takes a while... I'll get back soon with results.

mgabell commented 10 months ago

MAGIC :+1: It works now with two files and inference. The only thing you might help me with now is why it takes about 5-10 minutes to build the TensorRT engine every time? I thought it builds once and then reuses the engine if it exists?

Now playing...
0 : file:///home/aiadmin/Development/output_video.mp4
1 : file:///home/aiadmin/Development/output_video.mp4
Starting pipeline

0:00:05.700819765 22695 0x2f940e90 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() [UID = 1]: deserialized trt engine from :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: [Implicit Engine Info]: layers num: 4
0 INPUT kFLOAT input 3x640x640
1 OUTPUT kFLOAT boxes 8400x4
2 OUTPUT kFLOAT scores 8400x1
3 OUTPUT kFLOAT classes 8400x1

0:00:05.912797080 22695 0x2f940e90 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::checkBackendParams() [UID = 1]: Backend has maxBatchSize 1 whereas 2 has been requested
0:00:05.912869497 22695 0x2f940e90 WARN nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() [UID = 1]: deserialized backend context :/home/aiadmin/Development/deepstream-test3/model_b1_gpu0_fp32.engine failed to match config params, trying rebuild
0:00:05.933970299 22695 0x2f940e90 INFO nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Tensor DataType is determined at build time for tensors not marked as input or output.

Building the TensorRT Engine

WARNING: [TRT]: DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU

marcoslucianops commented 10 months ago

The build should happen only once, and it can take a long time. Just put the correct engine name in model-engine-file=model_b1_gpu0_fp32.engine and it will not rebuild when you run the code.

Note: The deepstream_test_3.py file changes the pgie batch-size according to the number of streams, which forces a new build whenever the engine wasn't built with the same batch-size. I recommend removing the https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/blob/master/apps/deepstream-test3/deepstream_test_3.py#L332-L335 lines and configuring the batch-size and engine name directly in the config_infer_primary_yoloV8.txt file, with batch-size set to the maximum number of sources, so it will not need to rebuild again.
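Concretely, for a fixed two-stream setup that amounts to pinning both values in the config (the engine file name here follows the model_b&lt;batch&gt;_gpu&lt;id&gt;_&lt;precision&gt;.engine naming that DeepStream generates, assuming FP32; adjust for your precision):

```ini
[property]
batch-size=2
model-engine-file=model_b2_gpu0_fp32.engine
```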

mgabell commented 10 months ago

I will try that. Thank you very much for your help! I have worked on this for two weeks and turned myself inside out... I can't tell you how good this is and how much I appreciate your support.

I will try to fix the config file as you say. If not, you'll hear from me, but that shouldn't be too hard. I will do this for three v4l2 cameras, so you might see me again!

mgabell commented 10 months ago

It works. I am overexcited. Performance is now the issue. I will experiment a bit, but it is probably more of an NVIDIA Jetson problem than your YOLOv8 implementation. If anyone has succeeded in running three streams at 1280x720 with YOLOv8 inference on a Jetson AGX Orin, please let me know. I get frame drops...

marcoslucianops commented 10 months ago

Use FP16 by setting network-mode=2 in the config_infer_primary_yoloV8.txt file and sink.set_property('sync', 0) in the deepstream_test_3.py file.
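For future readers, the config side of that looks like this (value meanings per the nvinfer docs: 0=FP32, 1=INT8, 2=FP16; delete the old .engine file so it is rebuilt in FP16):

```ini
[property]
# 0=FP32, 1=INT8, 2=FP16
network-mode=2
```

And in deepstream_test_3.py, `sink.set_property('sync', 0)` disables clock synchronization on the sink, so rendering never throttles the inference pipeline.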

marcoslucianops commented 10 months ago

And run the command sudo nvpmodel -m 0 to enable MAXN mode.

mgabell commented 10 months ago

And run the command sudo nvpmodel -m 0 to enable MAXN mode.

Use FP16 by setting network-mode=2 in the config_infer_primary_yoloV8.txt file and sink.set_property('sync', 0) in the deepstream_test_3.py file.

Super! I did both. I think I already had nvpmodel at max, but recreating the engine file for FP16 really made a difference. I run 3x 60 FPS at 1280x720 and could not see any frame drops. I will try converting to INT8 too, but I think I have some issues with that. Thank you again for the support, and especially for the FP16 setting; it really made a difference. I will now benchmark other YOLO models, medium and large, as well as fewer classes. I don't need 80 classes, and perhaps that has a positive impact on performance.

mgabell commented 9 months ago

A question. I know this is not the correct forum, but I'll give it a try. I managed to save the metadata from the inference into an XML file in Pascal/VOC format using Python. But I also want to save the frame itself. How can I do that? Is that done in the config file?

I think I found it. For future reference and others: https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/blob/master/apps/deepstream-imagedata-multistream/deepstream_imagedata-multistream.py
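For future readers: the Pascal/VOC metadata part mentioned above can be sketched in plain Python (a hypothetical helper, not DeepStream API — in the real app the labels and boxes come from NvDsObjectMeta inside a pad probe, and the frame itself is grabbed with pyds.get_nvds_buf_surface as in the linked imagedata example):

```python
import xml.etree.ElementTree as ET

def voc_xml(filename, width, height, objects):
    """Build a minimal Pascal/VOC annotation.

    objects: list of (label, xmin, ymin, xmax, ymax) tuples,
    e.g. collected from detection metadata per frame.
    """
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = filename
    size = ET.SubElement(ann, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for label, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(ann, "object")
        ET.SubElement(obj, "name").text = label
        box = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(int(val))
    return ET.tostring(ann, encoding="unicode")
```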