autowarefoundation / autoware.universe

https://autowarefoundation.github.io/autoware.universe/
Apache License 2.0

Implement RTMDet to Perception Pipeline #7235

Open StepTurtle opened 3 weeks ago

StepTurtle commented 3 weeks ago

Checklist

Description

We plan to add the RTMDet model alongside the existing YOLOX model in Autoware Universe. YOLOX performs well on bounding box detection, but its instance segmentation layer is weak. We aim to improve instance segmentation results by adding the RTMDet model.


Purpose

Our goal is to enhance the LiDAR-image fusion pipeline by adding the RTMDet model to Autoware for image segmentation.

Possible approaches

We can convert the pre-trained PyTorch models to ONNX and TensorRT formats, then create a ROS 2 package in the Autoware Universe perception stack that runs the TensorRT models.

Definition of done

StepTurtle commented 3 weeks ago

Here are the results of the pre-trained models shared in this link from mmdetection.

Results:

| Model | Score Threshold | NMS Threshold | Detection Time Per Image | Video Link |
| --- | --- | --- | --- | --- |
| RTMDet-Ins-s | 0.3 | 0.3 | ~20 ms | Video Link |
| RTMDet-Ins-x | 0.3 | 0.3 | ~33 ms | Video Link |

For now, I have tested the pre-trained models shared by mmdetection using the mmdetection tools. They also provide a couple of tools to convert .pth models to .onnx and .engine models. I checked the results of both converted models and they look the same, so I think we can say their tools convert the models correctly.

Right now I am working out how to use the TensorRT models in C++ with the TensorRT libraries.

StepTurtle commented 3 weeks ago

I deployed the TensorRT engine in Python and got consistent results.

| Model | Score Threshold | NMS Threshold | Detection Time Per Image | Video Link |
| --- | --- | --- | --- | --- |
| RTMDet-Ins-s (TensorRT) | 0.3 | NO NMS | ~20 ms | Video Link |
| RTMDet-Ins-x (TensorRT) | 0.3 | NO NMS | ~36 ms | Video Link |

[!WARNING] In some parts of the video, you may see incorrect class names. For example, you might see both truck and car class names assigned to a vehicle. This is because I didn't run NMS when I deployed it in Python. I plan to fix this when I deploy it in C++.
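
For reference, the duplicate labels described in the warning (the same vehicle reported as both truck and car) are what a class-agnostic NMS step is meant to remove. The following is only a minimal sketch in plain Python, not the code used in this issue. The function names and the `(x1, y1, x2, y2)` box layout are assumptions, and the 0.3 thresholds are simply the values from the tables in this thread.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def class_agnostic_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thr: float = 0.3,
    iou_thr: float = 0.3,
) -> List[int]:
    """Return indices of kept detections, highest score first.

    Class labels are ignored on purpose: two overlapping boxes with
    different labels (e.g. car and truck) still suppress each other.
    """
    # Drop low-score detections, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Illustrative data: one vehicle detected twice with different labels,
# plus one low-score detection elsewhere in the image.
boxes = [(100, 100, 200, 200), (102, 98, 201, 203), (400, 400, 450, 450)]
scores = [0.9, 0.6, 0.2]
labels = ["car", "truck", "person"]
kept = class_agnostic_nms(boxes, scores)
print([labels[i] for i in kept])
```

With this made-up input only the higher-scoring "car" box survives: the "truck" box overlaps it heavily and the third detection is below the score threshold. A per-class NMS would keep both the car and the truck box, which is why the class-agnostic variant is the one sketched here.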


So my next step is to do the same in C++.

Also, the detection times look a little higher. I do not understand the reason yet, but I am working on it.
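
When comparing per-image detection times like the ones above, it can help to measure them the same way in every setup. The thread does not say how the times were measured, so the following is only a sketch of one common approach: discard a few warm-up runs, then report statistics over many timed runs. The `benchmark` helper and its parameters are hypothetical, and `infer` stands for whatever callable runs one inference.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(
    infer: Callable[[], object], warmup: int = 10, runs: int = 50
) -> Dict[str, float]:
    """Time `infer` in milliseconds, excluding the warm-up calls.

    Assumption: `infer` blocks until its results are ready. For GPU
    inference that means it must synchronize before returning,
    otherwise only the launch overhead is measured.
    """
    # Warm-up: the first calls often include lazy initialization
    # and memory allocation, so they are not timed.
    for _ in range(warmup):
        infer()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


# Stand-in workload so the sketch runs without a GPU or a model.
stats = benchmark(lambda: sum(range(10_000)), warmup=3, runs=20)
print(stats)
```

Whether pre-processing (resize, normalization) and post-processing (mask decoding, NMS) are inside the timed call changes the numbers considerably, so it is worth stating which of them a reported time includes.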

StepTurtle commented 2 weeks ago

I am sharing the results from TensorRT deployment to C++, but currently, there are some issues with the results.

I do not get exactly the same results as with the Python deployment. When I check the bounding boxes and scores, everything appears to be the same as in Python. However, the labels are very different. I am currently trying to resolve this issue.

| Model | Score Threshold | NMS Threshold | Detection Time Per Image | Video Link |
| --- | --- | --- | --- | --- |
| RTMDet-Ins-s (TensorRT) | 0.3 | NO NMS | ~8 ms | Video Link |
| RTMDet-Ins-x (TensorRT) | 0.3 | NO NMS | ~22 ms | Video Link |
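
To narrow down a mismatch like the one described above (boxes and scores agree, labels do not), one option is to dump the detections of both deployments for the same image and compare them field by field. The following is a hedged sketch of such a check, not the scripts used in this issue. The `Detection` type, the tolerances, and the assumption that both outputs list detections in the same order are all illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)
    score: float
    label: int


def diff_detections(
    ref: Sequence[Detection],
    other: Sequence[Detection],
    box_tol: float = 1.0,
    score_tol: float = 0.01,
) -> Dict[str, List[int]]:
    """Return, per field, the indices where the two outputs disagree.

    Assumes both sequences hold the same detections in the same order,
    e.g. the raw outputs of the same engine for the same input image.
    """
    if len(ref) != len(other):
        raise ValueError(f"detection count differs: {len(ref)} vs {len(other)}")

    report: Dict[str, List[int]] = {"box": [], "score": [], "label": []}
    for i, (r, o) in enumerate(zip(ref, other)):
        if any(abs(a - b) > box_tol for a, b in zip(r.box, o.box)):
            report["box"].append(i)
        if abs(r.score - o.score) > score_tol:
            report["score"].append(i)
        if r.label != o.label:
            report["label"].append(i)
    return report


# Made-up example: identical boxes and scores, one differing label.
python_out = [
    Detection((10, 10, 50, 50), 0.91, 2),
    Detection((60, 20, 120, 90), 0.75, 7),
]
cpp_out = [
    Detection((10, 10, 50, 50), 0.91, 2),
    Detection((60, 20, 120, 90), 0.75, 0),
]
print(diff_detections(python_out, cpp_out))
```

A report that lists indices only under `label` points at how the label output is read or interpreted rather than at the inference itself, which matches the symptom described in this comment.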

You can find the scripts I used for deployment at these links:

StepTurtle commented 2 weeks ago

Fatih suggested the following:

StepTurtle commented 3 days ago

I just started to deploy RTMDet to ROS 2. Here is the repository: https://github.com/leo-drive/tensorrt_rtmdet/

| Model | Score Threshold | NMS Threshold | Video Link |
| --- | --- | --- | --- |
| RTMDet-Ins-s | 0.3 | NO NMS | Video Link |
| RTMDet-Ins-x | 0.3 | NO NMS | Video Link |

I plan to complete the port to ROS 2 by performing the following steps: