Implement RTMDet to Perception Pipeline

StepTurtle commented 3 weeks ago

Checklist

[X] I've read the contribution guidelines.
[X] I've searched other issues and no duplicate issues were found.
[X] I've agreed with the maintainers that I can plan this task.

Description

We plan to add the RTMDet model alongside the existing YOLOX model in Autoware Universe. While YOLOX performs well on the bounding box task, its instance segmentation layer is weak. We aim to improve instance segmentation results by adding the RTMDet model.


Purpose

Our goal is to enhance the lidar image fusion pipeline by adding the RTMDet model to Autoware for image segmentation.

Possible approaches

We can convert the pre-trained PyTorch models to ONNX and TensorRT formats, then create a ROS 2 package in Autoware Universe perception that runs the TensorRT models.

Definition of done

[x] Use pre-trained PyTorch models for detection
[x] Convert pre-trained PyTorch models to ONNX and TensorRT format
[x] Deploy TensorRT model with Python
[x] Deploy TensorRT model with C++
[ ] Create a ROS 2 package in Autoware Universe that uses ONNX models
[ ] Compare RTMDet results with YOLOX segmentation results
[ ] Decide how we can use the RTMDet segmentation results in the image lidar fusion pipeline

StepTurtle commented 3 weeks ago

Here are the results of the pre-trained models shared by mmdetection at this link.

Results:

I used the PyTorch models (.pth extension) to get these results.

Model	Score Threshold	NMS Threshold	Detection Time Per Image	Video Link
RTMDet-Ins-s	0.3	0.3	~20 ms	Video Link
RTMDet-Ins-x	0.3	0.3	~33 ms	Video Link

So far, I have tested the pre-trained models shared by mmdetection using the mmdetection tools. They also provide tools to convert .pth models to .onnx and .engine models. I checked the results of both converted models and they look the same, so I think we can say their tools convert the models correctly.

Right now I am working out how to run TensorRT models in C++ with the TensorRT libraries.

StepTurtle commented 3 weeks ago

I deployed the TensorRT engine in Python and got consistent results.

Model	Score Threshold	NMS Threshold	Detection Time Per Image	Video Link
RTMDet-Ins-s (TensorRT)	0.3	NO NMS	~20 ms	Video Link
RTMDet-Ins-x (TensorRT)	0.3	NO NMS	~36 ms	Video Link

[!WARNING] In some parts of the video, you may see incorrect class names. For example, you might see both truck and car class names assigned to a vehicle. This is because I didn't run NMS when I deployed it in Python. I plan to fix this when I deploy it in C++.
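
The duplicate truck/car labels described above are the kind of overlap that class-agnostic NMS is meant to suppress. Below is a minimal sketch in plain Python, assuming boxes in (x1, y1, x2, y2) format and reusing the 0.3 thresholds from the PyTorch table. The function names are illustrative and are not taken from mmdetection or from the deployment scripts.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0.0 else 0.0


def class_agnostic_nms(boxes: Sequence[Box], scores: Sequence[float],
                       score_thr: float = 0.3, iou_thr: float = 0.3) -> List[int]:
    """Return indices of kept detections, highest score first.

    Labels are ignored on purpose, so a 'truck' box that overlaps a
    higher-scoring 'car' box is dropped.
    """
    # Drop low-score detections, then visit the rest in descending score order.
    order = sorted((i for i, s in enumerate(scores) if s >= score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


# Two near-identical boxes on one vehicle plus one separate box.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.9, 0.6, 0.8]
print(class_agnostic_nms(boxes, scores))  # [0, 2]
```

This loop is quadratic in the number of detections, which is acceptable for a sanity check but would likely need a batched or GPU implementation in the C++ node.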

So, my next plan is to do the same in C++.

Also, the detection times look slightly higher. I do not understand the reason yet, but I am working on it.
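
When chasing a timing difference like this, it can help to measure every deployment with the same harness. The sketch below is one possible way to do that, not the method used for the tables in this thread. It assumes `infer` is a hypothetical callable that blocks until the result is ready (for GPU inference that usually means synchronizing the stream before returning), and it runs a few warm-up iterations before timing.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def time_per_image_ms(infer: Callable[[object], object],
                      images: Sequence[object],
                      warmup: int = 10) -> Dict[str, float]:
    """Time a blocking inference callable, returning per-image stats in ms."""
    # Warm-up runs are not timed: first calls often include lazy
    # initialization and memory allocation.
    for img in images[:warmup]:
        infer(img)

    samples = []
    for img in images:
        start = time.perf_counter()
        infer(img)
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Stand-in workload; replace the lambda with the real inference call.
stats = time_per_image_ms(lambda img: sum(range(1000)), list(range(50)))
print(sorted(stats))
```

Reporting the median next to the mean makes it easier to see whether a few slow frames are inflating the average.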

StepTurtle commented 2 weeks ago

I am sharing the results from TensorRT deployment to C++, but currently, there are some issues with the results.

I cannot see exactly the same results as with Python deployment. When I check the bounding box and score results, everything appears to be the same as in Python. However, when I check the labels, the results are very different. I am currently trying to resolve this issue.
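
One way to narrow down a label mismatch like this is to dump the detections of both deployments for the same frame and diff them. The sketch below assumes each detection is a (box, score, label id) tuple and that both deployments emit detections in the same order, which seems plausible given that boxes and scores already agree. The names and the tuple layout are illustrative only.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# (box as x1, y1, x2, y2; score; label id) is an assumed layout
Detection = Tuple[Sequence[float], float, int]


def diff_labels(py_dets: Sequence[Detection],
                cpp_dets: Sequence[Detection],
                tol: float = 1e-2) -> Tuple[List[int], Counter]:
    """Find detections whose box and score agree but whose label differs.

    Returns the mismatching indices and a count of
    (python_label, cpp_label) pairs.
    """
    mismatches: List[int] = []
    confusion: Counter = Counter()
    for idx, ((p_box, p_score, p_label), (c_box, c_score, c_label)) in enumerate(
            zip(py_dets, cpp_dets)):
        same_box = all(abs(p - c) <= tol for p, c in zip(p_box, c_box))
        same_score = abs(p_score - c_score) <= tol
        if same_box and same_score and p_label != c_label:
            mismatches.append(idx)
            confusion[(p_label, c_label)] += 1
    return mismatches, confusion


py_out = [((0, 0, 10, 10), 0.90, 2), ((5, 5, 20, 20), 0.80, 7)]
cpp_out = [((0, 0, 10, 10), 0.90, 2), ((5, 5, 20, 20), 0.80, 0)]
print(diff_labels(py_out, cpp_out))
```

If the pair counts show a regular pattern, such as a constant offset or many unexpected zeros, that tends to point at how the label buffer is read (index or data type) rather than at the model itself.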

Model	Score Threshold	NMS Threshold	Detection Time Per Image	Video Link
RTMDet-Ins-s (TensorRT)	0.3	NO NMS	~8 ms	Video Link
RTMDet-Ins-x (TensorRT)	0.3	NO NMS	~22 ms	Video Link

You can find the scripts I used for deployment at these links:

StepTurtle commented 2 weeks ago

Fatih suggested the following:

Check the YOLOX detection time and compare it with RTMDet.
Test with datasets and compare scores, bounding boxes, and so on with YOLOX.

StepTurtle commented 3 days ago

I just started to deploy RTMDet to ROS 2. Here is the repository: https://github.com/leo-drive/tensorrt_rtmdet/

Right now I have results similar to my previous work.
Unlike my previous work, it converts the ONNX model to a TensorRT model on first run, so you do not need to provide a TensorRT model to this package; the ONNX model is enough.
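
The build-on-first-run behaviour can be sketched as a simple cache check. This is only an illustration of the idea in Python, not the code in tensorrt_rtmdet: `build_engine` is a hypothetical callback standing in for the real TensorRT build step, and the engine file naming is an assumption.

```python
from pathlib import Path
from typing import Callable


def engine_path_for(onnx_path: Path, precision: str) -> Path:
    """Hypothetical naming scheme: model.onnx + fp16 -> model.fp16.engine."""
    return onnx_path.with_suffix(f".{precision}.engine")


def load_or_build_engine(onnx_path: Path, precision: str,
                         build_engine: Callable[[Path, str], bytes]) -> bytes:
    """Return serialized engine bytes, building them only when needed."""
    engine_path = engine_path_for(onnx_path, precision)

    # Reuse the cached engine unless the ONNX file is newer than it.
    if (engine_path.exists()
            and engine_path.stat().st_mtime >= onnx_path.stat().st_mtime):
        return engine_path.read_bytes()

    # First run (or stale cache): build once and store next to the ONNX file.
    data = build_engine(onnx_path, precision)
    engine_path.write_bytes(data)
    return data
```

Keeping the precision in the file name means that switching between fp16 and fp32 does not silently reuse an engine built for the other setting.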

Model	Score Threshold	NMS Threshold	Video Link
RTMDet-Ins-s	0.3	NO NMS	Video Link
RTMDet-Ins-x	0.3	NO NMS	Video Link

I plan to complete the port to ROS 2 with the following steps:

RTMDet uses a custom TensorRT plugin, which is currently loaded with dlopen(); find a better way.
Pre-processing currently runs on the CPU (with OpenCV); move it to CUDA.
Post-processing also runs on the CPU; I will try to move it to CUDA.
There is no NMS right now; add it.
There are many hard-coded parts; remove them.
There are three precision options for converting the ONNX model to a TensorRT model (fp16, fp32, and int8); int8 is not working.
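
Whether pre-processing runs on the CPU or in CUDA, the resize geometry is the same, so it can be checked separately from the kernel code. The sketch below assumes a keep-aspect-ratio resize to a 640x640 network input with padding on the right and bottom. Both the input size and the padding side are assumptions and should be checked against the model config. The helper names are hypothetical.

```python
from typing import Sequence, Tuple


def letterbox_params(src_w: int, src_h: int,
                     dst_w: int = 640, dst_h: int = 640
                     ) -> Tuple[float, int, int, int, int]:
    """Return (scale, resized_w, resized_h, pad_right, pad_bottom)."""
    # A single scale factor preserves the aspect ratio.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, round(src_w * scale))
    new_h = min(dst_h, round(src_h * scale))
    return scale, new_w, new_h, dst_w - new_w, dst_h - new_h


def box_to_source(box: Sequence[float], scale: float,
                  src_w: int, src_h: int) -> Tuple[float, float, float, float]:
    """Map a box from network-input pixels back to source-image pixels.

    With right/bottom padding there is no offset to subtract, only the scale.
    """
    x1, y1, x2, y2 = (v / scale for v in box)

    def clamp(v: float, hi: int) -> float:
        return max(0.0, min(float(hi), v))

    return clamp(x1, src_w), clamp(y1, src_h), clamp(x2, src_w), clamp(y2, src_h)


scale, new_w, new_h, pad_r, pad_b = letterbox_params(1920, 1080)
print(new_w, new_h, pad_r, pad_b)  # 640 360 0 280
```

Having these numbers from a reference implementation gives something concrete to compare a CUDA pre-processing kernel against.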
