marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

CUDA failure: too many resources requested for launch in file yoloPlugins.cpp at line 255 #418

Open sperezs95 opened 11 months ago

sperezs95 commented 11 months ago

Hello dear @marcoslucianops, first of all I would like to thank you for the excellent work on this repository; it has been very helpful.

My environment is the following:

(environment screenshots)

I'm working inside NVIDIA's DeepStream Docker container: nvcr.io/nvidia/deepstream-l4t:6.0.1-triton

I am working on a custom LPR pipeline. First, I followed the instructions you propose to deploy YOLOv7 with DeepStream, and everything works correctly: I get the RTSP video output with detections and tracking IDs in my DeepStream app using the Python bindings.

(screenshot of the RTSP output with detections)

Now I am trying to add an SGIE. I have a custom YOLOv2 Darknet model that detects the vehicles' license plates, and as a first test I am trying to use it as a PGIE (replacing the corresponding config of my YOLOv7). I am basing it on your "config_infer_primary_yoloV2.txt": I have set the paths to my .cfg and .weights files, changed the batch size, and updated the path to the compiled "nvdsinfer_custom_impl_Yolo" library.

When launching my app.py I get the following error:

NvMMLiteOpen : Block : BlockType = 4
===== NVMEDIA: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
CUDA failure: too many resources requested for launch in file yoloPlugins.cpp at line 255
Aborted (core dumped)
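For context, this message is the CUDA runtime error cudaErrorLaunchOutOfResources: the kernel launch checked at yoloPlugins.cpp line 255 asks for more per-block resources (threads, registers per thread, or shared memory) than the GPU's SM can supply. Below is a minimal sketch, using a hypothetical placeholder kernel (probeKernel) rather than the plugin's real one, of how to inspect the limits involved:

```cuda
// Minimal sketch: query the device limits and a kernel's footprint to see
// why a launch can fail with cudaErrorLaunchOutOfResources.
// "probeKernel" is a hypothetical stand-in for the plugin's real kernel.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void probeKernel(float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (out && i < n) out[i] = static_cast<float>(i);
}

int main() {
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);

    cudaFuncAttributes attr{};
    cudaFuncGetAttributes(&attr, probeKernel);

    // A launch fails when threadsPerBlock * regsPerThread exceeds
    // prop.regsPerBlock (or another per-block limit is exceeded).
    printf("max threads/block: %d, regs/block: %d, kernel regs/thread: %d\n",
           prop.maxThreadsPerBlock, prop.regsPerBlock, attr.numRegs);

    // Standard pattern for catching a launch-configuration error right away:
    probeKernel<<<1, prop.maxThreadsPerBlock>>>(nullptr, 0);
    printf("launch status: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}
```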

Additionally, I have tried to get my PGIE working with the original YOLOv2, and I get the same error.

This is my configuration file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=/lpr/models/yolov2/1/yolov2.cfg
model-file=/lpr/models/yolov2/1/yolov2.weights
model-engine-file=/lpr/model_b2_gpu0_fp16.engine
int8-calib-file=calib.table
labelfile-path=/lpr/data/coco_classes.txt
batch-size=2
network-mode=2
num-detected-classes=80
interval=0
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=0
force-implicit-batch-dim=1
workspace-size=1000
parse-bbox-func-name=NvDsInferParseYolo
parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=/lpr/DeepStream-Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300

I would really appreciate it if you could help me with this problem, since I need to get this YOLOv2 model working.

Greetings, and I'll stay tuned.

marcoslucianops commented 11 months ago

Hi, I will have a Jetson Nano to test on 18/08. Is it possible for you to wait?

sperezs95 commented 11 months ago

Hi, I will have a Jetson Nano to test on 18/08. Is it possible for you to wait?

@marcoslucianops Yes, I can wait; any help is welcome. Do you have a Jetson Nano or a Jetson Orin Nano?

sperezs95 commented 11 months ago

UPDATE: YOLOv2 works with the Jetson Xavier NX Developer Kit - JetPack 4.6 [L4T 32.6.1], inside the Docker container nvcr.io/nvidia/deepstream-l4t:6.0-triton.

marcoslucianops commented 11 months ago

The problem is the number of threads per block used by the CUDA kernel on the Jetson Nano. I have the old Jetson Nano; I need to check the correct number for it and how to change it based on the board.
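For readers hitting the same error, here is a minimal sketch of the general idea (not the repository's actual fix; the kernel and function names are hypothetical): derive the block size from the device at runtime instead of hard-coding it, so boards like the Nano stay within their per-SM register and thread limits.

```cuda
#include <algorithm>
#include <cuda_runtime.h>

// Hypothetical stand-in for the plugin's decode kernel.
__global__ void decodeKernel(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

void launchDecode(const float *in, float *out, int n, cudaStream_t stream) {
    // Ask the runtime for the largest block size that still fits this
    // kernel's register/shared-memory footprint on the current GPU.
    int minGridSize = 0, blockSize = 0;
    cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, decodeKernel);

    // Optionally clamp further for very small boards.
    blockSize = std::min(blockSize, 256);

    int gridSize = (n + blockSize - 1) / blockSize;
    decodeKernel<<<gridSize, blockSize, 0, stream>>>(in, out, n);
}
```

The same effect can often be achieved more simply by lowering a hard-coded thread count in the launch (for example from 1024 to 256), at some cost in occupancy on larger GPUs.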