open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

parseBoundingBox issue with onnx converted model in deepstream pipeline #726

Closed ExcaliburKG closed 1 year ago

ExcaliburKG commented 2 years ago

Hi, I'm trying to use mmdeploy to convert a model to TRT engine and use it in a nvidia deepstream pipeline.

Main config (ATSS detector): https://github.com/open-mmlab/mmdetection/blob/master/configs/atss/atss_r101_fpn_1x_coco.py
Weights: https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.pth

Next, I run the mmdeploy script:

python ./mmdeploy/tools/deploy.py \
    ./mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    atss_r101_fpn_1x_coco.py \
    atss_r101_fpn_1x_20200825-dfcadd6f.pth \
    640640.jpg \
    --work-dir outdir \
    --device cuda:0 \
    --log-level INFO \
    --show \
    --dump-info

The conversion completes without critical issues and I get the engine and ONNX files, but when I run the DeepStream pipeline I get the following error:

Running...
0:00:02.932355289   357 0x56408eaa0460 ERROR                nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:59> [UID = 1]: Could not find output coverage layer for parsing objects
0:00:02.932389425   357 0x56408eaa0460 ERROR                nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:735> [UID = 1]: Failed to parse bboxes
Segmentation fault (core dumped)

Please clarify whether one of the following files should contain the parseBoundingBox function the model needs to work properly:

  1. onnx file
  2. engine file
  3. custom plugin libmmdeploy_tensorrt_ops.so (bundled with mmdeploy)

My expectation is that after the conversion I should not need to write any additional code for the ONNX model to work in a DeepStream pipeline. My pgie config is:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
onnx-file=end2end.onnx
model-engine-file=end2end.engine
#force-implicit-batch-dim=1
custom-lib-path=/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
batch-size=1
network-mode=1
num-detected-classes=80
interval=0
gie-unique-id=1
#scaling-filter=0
#scaling-compute-hw=0
cluster-mode=4

[class-attrs-all]
pre-cluster-threshold=0.2
topk=20
nms-iou-threshold=0.5
lvhan028 commented 2 years ago

I quickly browsed https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_using_custom_model.html. I think you probably need to implement Custom Output Parsing; libmmdeploy_tensorrt_ops.so is actually the IPlugin implementation.

Since I am not familiar with NVIDIA DeepStream, please allow me some time to investigate it. I'll get back to you as soon as I figure it out.
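
For context, DeepStream wires a custom bounding-box parser in through the pgie config rather than through the ONNX or engine file. Below is a minimal sketch of the relevant keys; the NvDsInferParseMMDeploy symbol is a hypothetical name and the output names assume mmdeploy's usual detection export, not something confirmed in this thread:

# Output tensors handed to the parser; mmdeploy detection
# configs typically set output_names=['dets', 'labels'].
output-blob-names=dets;labels
# Exported symbol of the custom parser (hypothetical name)
parse-bbox-func-name=NvDsInferParseMMDeploy
# Library containing both the TensorRT plugins and the parser
custom-lib-path=/mmdeploy/lib/libmmdeploy_tensorrt_ops.so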

ziggy84 commented 2 years ago

Hi, I've had a similar issue using a YOLOX head in DeepStream. The solution is to add a simple parser to libmmdeploy_tensorrt_ops.so, similar to the mmdet parser. It needs to live in the same .so file, since DeepStream only seems to be able to load one custom library/plugin. As the parser is only needed in DeepStream and not in other TensorRT applications, would it make sense to create a libmmdeploy_deepstream_ops.so that includes the TensorRT ops as well as the additional parsers?
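
For anyone who wants to attempt this before an official parser lands, here is a minimal sketch of such a DeepStream custom bounding-box parser. It assumes mmdeploy's detection models export a dets tensor ([num_dets, 5]: x1, y1, x2, y2, score) and a labels tensor; the NvDsInferParseMMDeploy symbol and the layout details are assumptions, not ziggy84's actual code:

// deepstream_mmdeploy_parser.cpp -- a minimal sketch, not a tested implementation.
#include <cstring>
#include <vector>
#include "nvdsinfer_custom_impl.h"

// Assumed mmdeploy detection outputs:
//   "dets":   [num_dets, 5] floats -> x1, y1, x2, y2, score
//   "labels": [num_dets]    ints   -> class ids
extern "C" bool NvDsInferParseMMDeploy(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    (void)networkInfo;  // box coordinates are already in input-image space here

    // Look up the two output layers by name.
    const NvDsInferLayerInfo *dets = nullptr, *labels = nullptr;
    for (const auto &layer : outputLayersInfo) {
        if (!strcmp(layer.layerName, "dets"))   dets = &layer;
        if (!strcmp(layer.layerName, "labels")) labels = &layer;
    }
    if (!dets || !labels)
        return false;  // output names differ from this sketch's assumption

    const unsigned int numDets = dets->inferDims.d[0];
    const float *det = static_cast<const float *>(dets->buffer);
    // Assumes int32 labels; mmdeploy may export int64 depending on the backend.
    const int *label = static_cast<const int *>(labels->buffer);

    for (unsigned int i = 0; i < numDets; ++i) {
        const float score = det[i * 5 + 4];
        // Skip padded / low-confidence rows (single global threshold for brevity;
        // a real parser could honor per-class thresholds).
        if (score < detectionParams.perClassPreclusterThreshold[0])
            continue;
        NvDsInferObjectDetectionInfo obj{};
        obj.classId = static_cast<unsigned int>(label[i]);
        obj.detectionConfidence = score;
        obj.left   = det[i * 5 + 0];
        obj.top    = det[i * 5 + 1];
        obj.width  = det[i * 5 + 2] - det[i * 5 + 0];
        obj.height = det[i * 5 + 3] - det[i * 5 + 1];
        objectList.push_back(obj);
    }
    return true;
}
// Static check that the signature matches DeepStream's expected prototype.
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseMMDeploy);

Compiled into (or alongside) libmmdeploy_tensorrt_ops.so and registered via parse-bbox-func-name, as sketched earlier in the thread, this should let fillDetectionOutput parse the boxes instead of looking for a coverage layer.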

ziggy84 commented 2 years ago

@lvhan028 Is this something you can help with? I'd like to add my parser to MMDeploy.

lvhan028 commented 2 years ago

Hi @ziggy84, mmdeploy focuses on deploying PyTorch models to various devices, so we would like to leave mmdeploy integration to community repos. If you open-source your own repo that integrates mmdeploy with DeepStream or Triton, it will be our honor to list it among the awesome works based on mmdeploy.

github-actions[bot] commented 1 year ago

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

KleinYuan commented 1 year ago

@lvhan028 Could you share your parser? And yes, it would be nice if you could add yours to mmdeploy; a PR would be very helpful. I am struggling to do the same for Faster R-CNN.

ajlorenzo1315 commented 4 months ago

@ziggy84 Could you share your parser?