cyn-liu opened 6 months ago
Great, maybe you can make a TODO task list first and see what others can take part in.
We referred to this project and successfully ran it on our own machine, using an RTX 3080 GPU and TensorRT FP16 inference with the BEVDet-R50-4DLongterm-Depth model. The mAP and inference speed of the BEVDet-R50-4DLongterm-Depth TensorRT version are reported at that project's link. The following are the running results on our machine:
https://github.com/autowarefoundation/autoware/assets/104069308/af71df5b-7776-425e-8720-0d7244847a54
The following is the inference speed on our machine:
https://github.com/autowarefoundation/autoware/assets/104069308/ec14066c-86a3-4a08-accd-b9690cf2d692
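For reference, building an FP16 TensorRT engine from an ONNX model follows the pattern below. This is a minimal sketch using the TensorRT 8.x Python API; the file names are placeholders, and the linked project may additionally rely on custom plugins for BEVDet-specific ops:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch network definition, as required for ONNX parsing.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# "bevdet.onnx" is a placeholder for the exported model file.
with open('bevdet.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 kernels

# Serialize the engine so the node can deserialize it at startup.
engine = builder.build_serialized_network(network, config)
with open('bevdet_fp16.engine', 'wb') as f:
    f.write(engine)
```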
Next, we will port the ROS 1 node from this project to ROS 2. Then we will test with TIER IV's dataset, and we hope that this dataset can be provided in ROS 2 bag format.
Our plan to integrate the BEVDet ROS 2 node into Autoware:

- define a `bevdet_node` in the Autoware perception module
- organize the 3D box results into the `autoware_perception_msgs::msg::DetectedObjects` type
- input the output of `bevdet_node` into the `object_merger` node and fuse it with the detection results of other models (a node skeleton is sketched after this list)

Environment: CUDA 11.3.1, cudnn-linux-x86_64-8.8.1.3_cuda11, TensorRT-8.5.1.7.Linux.x86_64-gnu
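As a rough illustration of the second and third steps, a minimal `rclpy` skeleton that packs decoded 3D boxes into `autoware_perception_msgs::msg::DetectedObjects` might look like the sketch below. The topic name and the box tuple layout are assumptions for illustration, and the real node would wrap the C++/TensorRT inference:

```python
import math
import rclpy
from rclpy.node import Node
from autoware_perception_msgs.msg import (
    DetectedObject, DetectedObjects, ObjectClassification, Shape,
)

class BEVDetNode(Node):
    def __init__(self):
        super().__init__('bevdet_node')
        # Output topic consumed by object_merger (topic name is an assumption).
        self.pub = self.create_publisher(DetectedObjects, '~/output/objects', 10)

    def publish_boxes(self, header, boxes):
        """boxes: iterable of (x, y, z, l, w, h, yaw, score, label) in the ego
        frame -- an assumed layout for BEVDet's decoded 3D boxes."""
        msg = DetectedObjects()
        msg.header = header
        for x, y, z, l, w, h, yaw, score, label in boxes:
            obj = DetectedObject()
            obj.existence_probability = float(score)
            obj.classification.append(
                ObjectClassification(label=int(label), probability=float(score)))
            pose = obj.kinematics.pose_with_covariance.pose
            pose.position.x, pose.position.y, pose.position.z = (
                float(x), float(y), float(z))
            pose.orientation.z = math.sin(yaw / 2.0)  # yaw-only quaternion
            pose.orientation.w = math.cos(yaw / 2.0)
            obj.shape.type = Shape.BOUNDING_BOX
            obj.shape.dimensions.x = float(l)  # length
            obj.shape.dimensions.y = float(w)  # width
            obj.shape.dimensions.z = float(h)  # height
            msg.objects.append(obj)
        self.pub.publish(msg)
```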
Maybe try with AWSIM data
list the cuda env here
Using the BEVDet model to run inference on the TIER IV dataset, we found that the model generalizes poorly to this dataset.
Looks like the original pre-trained model's (based on the nuScenes dataset) generalization on the TIER IV dataset is not as good as we expected. The obstacles' directions are almost right, but their depth is noticeably off.
We plan to close this task once we have the node tested, and create a new task, "retrain the model", to see whether a retrained model's performance on the TIER IV dataset improves.
Our plan to integrate the BEVDet ROS 2 node into Autoware:

- define a `bevdet_node` in the Autoware perception module
- organize the 3D box results into the `autoware_perception_msgs::msg::DetectedObjects` type
- input the output of `bevdet_node` into the `object_merger` node and fuse it with the detection results of other models
Considering that running the multi-camera BEV 3D detection algorithm and the LiDAR-based 3D detection algorithm simultaneously is too heavy a load, we have decided not to merge the results of BEVDet with the LiDAR output. Instead, we will create a new `perception_mode` parameter; when `perception_mode = camera`, `bevdet_node` is launched (see the launch sketch below).
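A minimal sketch of that launch logic in a ROS 2 Python launch file; the package, executable, and argument names here are assumptions:

```python
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument
from launch.conditions import IfCondition
from launch.substitutions import LaunchConfiguration, PythonExpression
from launch_ros.actions import Node

def generate_launch_description():
    perception_mode = LaunchConfiguration('perception_mode')
    return LaunchDescription([
        DeclareLaunchArgument('perception_mode', default_value='lidar'),
        # Start bevdet_node only when perception_mode == camera.
        Node(
            package='tensorrt_bevdet',  # package name is an assumption
            executable='bevdet_node',
            condition=IfCondition(
                PythonExpression(["'", perception_mode, "' == 'camera'"])),
        ),
    ])
```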
@xmfcx The PR related to this issue has been successfully tested in the newer Autoware Docker image. The environment information of this image:
CUDA==12.3
libnvinfer==8.6.1.6
Note: Outside the Docker container, I had to upgrade my NVIDIA GPU driver to a version whose maximum supported CUDA version is >= 12.3.
Description
BEVDet is a BEV perception algorithm based on surround-view cameras. It unifies multi-view images into the bird's-eye-view (BEV) perspective for the 3D object detection task. It is different from the current 3D perception features of Autoware. BEVDet code repository
Purpose
Integrate BEVDet into Autoware for 3D object detection based on multi-view images. This task is related to the Sensing & Perception tasks.
Possible approaches
BEVDet is a 3D object detection model trained on the nuScenes dataset using 6 surround-view camera images. The 6 cameras form a 360-degree field of view with overlapping fields of view. When mapping from 2D to 3D, some parameters are required, including the camera intrinsic parameters and the extrinsic parameters between each camera and the ego vehicle. Integrating BEVDet into Autoware therefore involves the placement and calibration of the 6 cameras, and converting the BEVDet model into ONNX format for deployment in Autoware. The role of these calibration parameters is sketched below.
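To illustrate what the intrinsics and extrinsics are used for, the sketch below projects an ego-frame 3D point into one camera image using an intrinsic matrix K and a camera-to-ego extrinsic. All numbers are illustrative placeholders, not real calibration values:

```python
import numpy as np

# Intrinsics: focal lengths (fx, fy) and principal point (cx, cy).
K = np.array([[1266.4,    0.0, 816.3],
              [   0.0, 1266.4, 491.5],
              [   0.0,    0.0,   1.0]])

# Extrinsics: rotation from ego axes (x forward) to camera optical axes
# (z forward), plus the camera's mounting position in the ego frame.
R_cam_ego = np.array([[0.0, -1.0,  0.0],
                      [0.0,  0.0, -1.0],
                      [1.0,  0.0,  0.0]])
t_ego_cam = np.array([1.5, 0.0, 1.6])  # camera 1.5 m ahead, 1.6 m up (placeholder)

p_ego = np.array([10.0, 1.0, 0.5])     # a point 10 m ahead of the ego vehicle
p_cam = R_cam_ego @ (p_ego - t_ego_cam)  # ego frame -> camera frame

uvw = K @ p_cam
u, v = uvw[:2] / uvw[2]                # perspective division -> pixel coords
print(f"pixel: ({u:.1f}, {v:.1f})")
```

BEVDet's view transformer uses these same quantities in the inverse direction, lifting image features into the shared BEV frame, which is why accurate per-camera calibration is a prerequisite for the integration.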
Definition of done