marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

Can't get bboxes on the output video stream and can't convert custom model to .wts format. #57

Closed · dhirajpatnaik16297 closed this issue 3 years ago

dhirajpatnaik16297 commented 3 years ago

I trained two custom YOLOv5s and YOLOv5m models for number-plate detection, then used the ONNX exports of both to check the detections in DeepStream 5.1. I followed the steps provided but the detections are wrong: it always detects the top-left corner as a number plate. A custom bbox parser function for this would be really helpful. (screenshot of the wrong detection attached)

I have also tried the conversion to .wts, but the custom model does not convert, whereas the pretrained yolov5s.pt converts easily. Please let me know where I am going wrong. Thanks in advance.

marcoslucianops commented 3 years ago

Hi,

You need to edit the yololayer.h and yolov5.cpp files in tensorrtx/yolov5 according to your custom model before generating the engine. Then copy the edited yololayer.h from tensorrtx/yolov5 over nvdsinfer_custom_impl_Yolo/yololayer.h before compiling.
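
For a single-class plate model, the edits are usually just the detection constants, roughly like this (a sketch only; the constant names follow the tensorrtx yolov5 code of that era, and your input size may differ):

    // tensorrtx/yolov5 yololayer.h (excerpt) -- adjust to the custom model
    namespace Yolo
    {
        static constexpr int CLASS_NUM = 1;   // was 80 (COCO); one class: number plate
        static constexpr int INPUT_H = 640;   // must match the training input size
        static constexpr int INPUT_W = 640;
    }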

dhirajpatnaik16297 commented 3 years ago

Hi, thanks for replying. I need to do this for the ONNX -> engine path, right? Could you also guide me through the particular changes I need to make? And please let me know what I should do to convert the custom model to .wts. I am very new to this, so your help is appreciated. Thanks.

marcoslucianops commented 3 years ago

See: https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/YOLOv5-5.0.md

dhirajpatnaik16297 commented 3 years ago

Yes, I am following it. There are two separate paths I am trying:

  1. Conversion of ONNX -> engine (for this I posted the image).
  2. Conversion of .pt -> .wts -> engine (I have tried this but could not convert .pt -> .wts for the custom model).

For (2) I will follow what you said. Could you please suggest some ways to address the issue in (1)?

marcoslucianops commented 3 years ago

Send me your output log from the .wts conversion.

dhirajpatnaik16297 commented 3 years ago

Hi, I am not getting any log during the conversion.

marcoslucianops commented 3 years ago

Sorry, I don't know about this error you are getting.

dhirajpatnaik16297 commented 3 years ago

Ok, let me explain. I followed the steps provided: I copied gen_wts.py to my folder and used it on my one-class custom model best.pt to convert it to best.wts, but best.wts is never created even though the program runs without any errors. In contrast, when I convert the pretrained yolov5s.pt with the same program, yolov5s.wts is created. That is the issue. Once I get the .wts file I will make the changes to generate the TensorRT engine file. So how can I get the .wts file for a custom model?

marcoslucianops commented 3 years ago

Send your model file to my e-mail (available in my GitHub profile) and I will test it.

dhirajpatnaik16297 commented 3 years ago

You can access it from here. I have sent a mail too. https://drive.google.com/file/d/11-SZBr4rgXV3oZsB3f3FEu3Q8Xm4h0cM/view?usp=sharing

marcoslucianops commented 3 years ago

I sent the converted wts file to your email, please check.

dhirajpatnaik16297 commented 3 years ago

Thanks a lot, I will check and let you know. Please also send the code so I can use it for other models, or let me know what changes I should make.

marcoslucianops commented 3 years ago

Rename best.pt to the base model name (yolov5s.pt, for example) and run the command again; the gen_wts.py of that era reads a fixed input filename, so a checkpoint with any other name is never converted.

dhirajpatnaik16297 commented 3 years ago

Hi Marcos, apologies for getting back so late. I was able to run it after renaming, thanks a lot. Now I am trying to add an OCR model (also a YOLOv5 model) on top of the detector model's output. I followed the same conversion process successfully, then added the secondary GIE, but I am not able to run it. Could you please help out? I am also trying to deploy the OCR model on a Triton server on a g4 instance; if you could throw some light on that, that would be great. Thanks.

marcoslucianops commented 3 years ago

See https://github.com/marcoslucianops/DeepStream-Yolo/issues/60#issuecomment-849568674

dhirajpatnaik16297 commented 3 years ago

Hi Marcos, I have tried what you mentioned. I am getting the following error:

    deepstream-app -c deepstream_app_config.txt

    Using winsys: x11
    0:00:04.539955749 12050 0x1efdad90 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() [UID = 2]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/sgie1/yolov5s.engine
    INFO: [Implicit Engine Info]: layers num: 2
    0   INPUT  kFLOAT data  3x640x640
    1   OUTPUT kFLOAT prob  6001x1x1

    0:00:04.540151848 12050 0x1efdad90 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::generateBackendContext() [UID = 2]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/sgie1/yolov5s.engine
    0:00:04.735425625 12050 0x1efdad90 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 2]: Load new model:/opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/./sgie1/config_infer_secondary1.txt sucessfully
    0:00:05.565223558 12050 0x1efdad90 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/pgie/yolov5s.engine
    INFO: [Implicit Engine Info]: layers num: 2
    0   INPUT  kFLOAT data  3x640x640
    1   OUTPUT kFLOAT prob  6001x1x1

    0:00:05.565355958 12050 0x1efdad90 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/pgie/yolov5s.engine
    0:00:05.574651522 12050 0x1efdad90 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.1/sources/yolov5/./pgie/config_infer_primary.txt sucessfully

    Runtime commands:
        h: Print this help
        q: Quit

        p: Pause
        r: Resume

    NOTE: To expand a source in the 2D tiled display and view object details, left-click on the source. To go back to the tiled display, right-click anywhere on the window.

    **PERF:  FPS 0 (Avg)
    **PERF:  0.00 (0.00)
    ** INFO: : Pipeline ready

    Opening in BLOCKING MODE
    Opening in BLOCKING MODE
    NvMMLiteOpen : Block : BlockType = 261
    NVMEDIA: Reading vendor.tegra.display-size : status: 6
    NvMMLiteBlockCreate : Block : BlockType = 261
    ** INFO: : Pipeline running

    **PERF:  13.09 (11.73)
    /dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform.cpp:3494: => VIC Configuration failed image scale factor exceeds 16, use GPU for Transformation
    0:00:06.785237835 12050 0x1eade400 WARN nvinfer gstnvinfer.cpp:1277:convert_batch_and_push_to_input_thread: error: NvBufSurfTransform failed with error -2 while converting buffer
    ERROR from secondary_gie_0: NvBufSurfTransform failed with error -2 while converting buffer
    Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1277): convert_batch_and_push_to_input_thread (): /GstPipeline:pipeline/GstBin:secondary_gie_bin/GstNvInfer:secondary_gie_0
    Quitting
    0:00:06.872989534 12050 0x1eade4f0 WARN nvinfer gstnvinfer.cpp:1984:gst_nvinfer_output_loop: error: Internal data stream error.
    0:00:06.873062557 12050 0x1eade4f0 WARN nvinfer gstnvinfer.cpp:1984:gst_nvinfer_output_loop: error: streaming stopped, reason error (-5)
    ERROR from primary_gie: Internal data stream error.
    Debug info: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(1984): gst_nvinfer_output_loop (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie: streaming stopped, reason error (-5)
    ^C
    ERROR: <_intr_handler:140>: User Interrupted..

Kindly let me know what is wrong. Thanks.

marcoslucianops commented 3 years ago

Add these lines to the [property] section of config_infer_secondary1.txt:

    input-object-min-width=40
    input-object-min-height=40

The "image scale factor exceeds 16" error above means the VIC scaler cannot upscale a cropped object by more than 16x, so objects smaller than 40x40 px cannot be resized to the 640x640 SGIE input (640 / 16 = 40); filtering them out avoids the NvBufSurfTransform failure.

dhirajpatnaik16297 commented 3 years ago

Added these and got it running, but I am not able to see the output of the second model (OCR) in the output video; it is only detecting the number plate. I have tried reducing and enlarging the text size but could not find anything. Also, if I add batch-size to the individual config_infer files I get an error, so I removed it. Please let me know where else I am going wrong. Thanks.

marcoslucianops commented 3 years ago

Try lower values for the parameters I sent.

dhirajpatnaik16297 commented 3 years ago

No, still no luck with either lowering or increasing the values. Lowering the values makes the program get stuck after a few seconds.

Please let me know where the problem lies. Thanks

marcoslucianops commented 3 years ago

Make the change below in the TensorRTX yololayer.cu file before conversion (engine generation), for your OCR model only.

Change line 164 to:

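    // report a distinct version so this model's YoloLayer doesn't collide with the detector's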
    const char* YoloLayerPlugin::getPluginVersion() const
    {
        return "2";
    }

And line 279 to:

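    // keep in sync with YoloLayerPlugin::getPluginVersion() above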
    const char* YoloPluginCreator::getPluginVersion() const
    {
        return "2";
    }

dhirajpatnaik16297 commented 3 years ago

I changed these in the sgie1 folder for DeepStream; it did not help. Then I made the changes and rebuilt tensorrtx. Now I get this when I run ./yolov5 -s yolov5s.wts yolov5s.engine s:

    Loading weights: yolov5s.wts
    [06/18/2021-23:15:48] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin YoloLayer_TRT version 1
    Segmentation fault (core dumped)

Should it return the number of classes the model has?

marcoslucianops commented 3 years ago

Did you recompile the tensorrtx/yolov5?

dhirajpatnaik16297 commented 3 years ago

Yes, I did. I cloned a fresh tensorrtx repo, went to yolov5, made a build directory, then ran "cmake .." and "make" after the changes to yololayer.h and yololayer.cu. One doubt: in the changes you mentioned, what is the "2" in return "2"? Is it the number of classes?

marcoslucianops commented 3 years ago

It's the plugin version (ID). If you have two plugins (two models) with the same version, it breaks in DeepStream (both models end up using the same compiled lib). I will check on it.

dhirajpatnaik16297 commented 3 years ago

Ok sure. Please let me know. Thanks

marcoslucianops commented 3 years ago

Did you change the two lines (164 and 279)?

marcoslucianops commented 3 years ago

I found the problem, please change tensorrtx/yolov5/common.hpp line 272 to:

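    // request the YOLO layer plugin by name and by the new version string "2"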
    auto creator = getPluginRegistry()->getPluginCreator("YoloLayer_TRT", "2");

Recompile and try again.

dhirajpatnaik16297 commented 3 years ago

> Did you change the two lines (164 and 279)?

Yes, I did, both in the tensorrtx and in the DeepStream files.

> I found the problem, please change tensorrtx/yolov5/common.hpp line 272 to:
>
>     auto creator = getPluginRegistry()->getPluginCreator("YoloLayer_TRT", "2");
>
> Recompile and try again.

Thanks. The engine file is now created, but still no output of the OCR model is shown in the video.

dhirajpatnaik16297 commented 3 years ago

Hi Marcos, I am still not getting the OCR output in the output video even after making the changes. I must have closed the issue by mistake.

marcoslucianops commented 3 years ago

The converted YOLOv5 model has lower accuracy than the PyTorch model; maybe that's the problem.

dhirajpatnaik16297 commented 3 years ago

These are small models. Would training a medium or large model help?

marcoslucianops commented 3 years ago

You need to train and test.

dhirajpatnaik16297 commented 3 years ago

Ok, I will try for sure. Is there any other way, like .pt -> .onnx -> .engine? The reason being that I have tested the small and medium YOLOv5 ONNX models and deployed them on a Triton server in AWS. They work great, but I could not get them working on the Nano.

marcoslucianops commented 3 years ago

If you convert to ONNX, it probably won't run with the files from this repo; the model output will change.

dhirajpatnaik16297 commented 3 years ago

Yes, that is correct, but could you please help out with that, or point me to some material from which I can take it forward?

marcoslucianops commented 3 years ago

You need to look at the NVIDIA forums for details on the ONNX model's output format.
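
For anyone landing here: below is a minimal sketch of a custom bbox parser for a standard YOLOv5 ONNX export, assuming a single flattened output of shape [numBoxes, 5 + numClasses] laid out as (cx, cy, w, h, objectness, class scores). The function name and the 0.25 objectness cutoff are illustrative, not from this repo:

    #include <algorithm>
    #include <vector>
    #include "nvdsinfer_custom_impl.h"

    extern "C" bool NvDsInferParseCustomYoloV5Onnx(
        std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
        NvDsInferNetworkInfo const& networkInfo,
        NvDsInferParseDetectionParams const& detectionParams,
        std::vector<NvDsInferObjectDetectionInfo>& objectList)
    {
        if (outputLayersInfo.empty())
            return false;

        const NvDsInferLayerInfo& layer = outputLayersInfo[0];
        const float* data = static_cast<const float*>(layer.buffer);
        const int numClasses = detectionParams.numClassesConfigured;
        const int rowSize = 5 + numClasses;        // cx, cy, w, h, obj, class scores...
        const int numBoxes = layer.inferDims.d[0]; // assumed first output dimension

        for (int i = 0; i < numBoxes; ++i) {
            const float* row = data + i * rowSize;
            const float objConf = row[4];
            if (objConf < 0.25f)                   // illustrative objectness cutoff
                continue;

            // pick the best class by objectness * class score
            int bestClass = 0;
            float bestScore = 0.0f;
            for (int c = 0; c < numClasses; ++c) {
                const float score = objConf * row[5 + c];
                if (score > bestScore) { bestScore = score; bestClass = c; }
            }
            if (bestScore < detectionParams.perClassPreclusterThreshold[bestClass])
                continue;

            // convert center-size boxes to top-left origin, clamped to the network input
            NvDsInferObjectDetectionInfo obj;
            obj.classId = bestClass;
            obj.detectionConfidence = bestScore;
            obj.left = std::max(row[0] - row[2] / 2.0f, 0.0f);
            obj.top = std::max(row[1] - row[3] / 2.0f, 0.0f);
            obj.width = std::min(row[2], (float)networkInfo.width - obj.left);
            obj.height = std::min(row[3], (float)networkInfo.height - obj.top);
            objectList.push_back(obj);
        }
        return true;
    }

    // compile-time check that the prototype matches what gst-nvinfer expects
    CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomYoloV5Onnx);

You would build this into a shared lib and point parse-bbox-func-name and custom-lib-path at it in the nvinfer config; the boxes here are in network-input coordinates, which nvinfer rescales back to the frame.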