pangsu0613 / CLOCs

CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
MIT License

Fusion of multiple 3D detection candidates. #22

Closed vignesh628 closed 3 years ago

vignesh628 commented 3 years ago

Hello @pangsu0613, can we extend this approach to the fusion of multiple 3D object detection candidates (3D and 3D) instead of (2D and 3D)? If yes, what are the necessary changes to be made in the current implementation?

A few more queries:

  1. CLOCs works on the assumption that the bounding box (output detection) from LiDAR is more accurate than the camera's, since we update the final output with LiDAR bounding boxes. But there are many instances where LiDAR can fail, for example under high reflectance. How do we overcome this case?

  2. Are there any other methods you have come across where bounding box refinement also happens during the fusion?

pangsu0613 commented 3 years ago

Hi @vignesh628, I think CLOCs can be extended to 3D-and-3D fusion; you need to change the 2D IoU used in the original CLOCs into a 3D IoU or BEV IoU.
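For a rough illustration of that change, here is a minimal BEV IoU for axis-aligned ground-plane boxes. This is a sketch, not CLOCs code: it ignores the heading angle, which a full implementation would handle via rotated polygon intersection (e.g. the rotated-IoU utilities in SECOND).

```python
def bev_iou(box_a, box_b):
    """Axis-aligned BEV IoU between two boxes.

    Each box is (cx, cz, w, l): center coordinates and width/length
    in the ground plane. Heading angle is ignored in this sketch; a
    full implementation would intersect rotated polygons instead.
    """
    # Convert center/size to min/max corners along each axis
    ax1, ax2 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    az1, az2 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx1, bx2 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    bz1, bz2 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    # Overlap length along each ground-plane axis (0 if disjoint)
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iz = max(0.0, min(az2, bz2) - max(az1, bz1))
    inter = ix * iz
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0
```

In the CLOCs fusion tensor, this value would simply replace the 2D image-plane IoU when pairing two sets of 3D candidates.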

1. CLOCs works on the assumption that the bounding box (output detection) from LiDAR is more accurate than the camera's, since we update the final output with LiDAR bounding boxes. But there are many instances where LiDAR can fail, for example under high reflectance. How do we overcome this case?

Answer: For situations where the LiDAR fails, CLOCs cannot fix that. But based on our experience, for detecting vehicles on the road, high reflectance rarely leads to LiDAR failure. May I know what kind of LiDAR you are using?

2. Are there any other methods you have come across where bounding box refinement also happens during the fusion?

Answer: You could have a look at the KITTI 3D/BEV detection leaderboard; there are some fusion-based methods there.

vignesh628 commented 3 years ago

@pangsu0613 Thanks for the information.

  1. I don't have complete information on the type of LiDAR we are going to use. I will follow up with you accordingly.

As we know, bounding box refinement is not activated in CLOCs, since you mentioned this functionality was increasing the error. Link: https://github.com/pangsu0613/CLOCs/issues/21#issuecomment-802169828

  2. Is this functionality (bounding box refinement) already implemented, and can it be activated in the provided code? If yes, which part of the code needs to be activated? If not, what changes are necessary to implement it?

Thanks once again.

pangsu0613 commented 3 years ago

@vignesh628, we don't have a specific bbox regression function for CLOCs. But since CLOCs is based on SECOND, for the fusion of SECOND and Cascade R-CNN we have tested incorporating SECOND's bbox regression module. For bbox regression, there is no single function that does the job: the raw output (for bbox regression) from the SECOND network is `preds_dict["box_preds"]` (you can find this in voxelnet.py). These raw outputs are the offsets between predicted bboxes and anchor boxes, so you need to build a loss function over these variables and train on it.
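To make the last point concrete, below is a hedged sketch of how a SmoothL1 regression loss over SECOND-style anchor offsets might be assembled. The function names and the plain-Python list handling are illustrative only (the real SECOND code operates on batched PyTorch tensors); `box_preds` corresponds to the raw `preds_dict["box_preds"]` offsets mentioned above, and `box_targets` are assumed to be ground-truth boxes already encoded against the same anchors.

```python
def smooth_l1(pred, target, beta=1.0):
    """Per-element SmoothL1 (Huber) loss, commonly used for box regression."""
    diff = abs(pred - target)
    return 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta

def box_regression_loss(box_preds, box_targets):
    """Mean SmoothL1 over the 7 encoded offsets (x, y, z, w, l, h, yaw).

    box_preds:   list of per-anchor raw network offsets
                 (illustrative stand-in for preds_dict["box_preds"])
    box_targets: list of ground-truth offsets encoded against the
                 same anchors (assumed precomputed elsewhere)
    """
    total = 0.0
    for pred, target in zip(box_preds, box_targets):
        total += sum(smooth_l1(p, t) for p, t in zip(pred, target))
    # Normalize by the number of positive anchors (avoid divide-by-zero)
    return total / max(len(box_preds), 1)
```

In practice this term is summed with the classification loss and only evaluated on positive (matched) anchors, as in the SECOND training loop.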

vignesh628 commented 3 years ago

Thank you for the clarification @pangsu0613. Actually, we can already do 2D object detection in autonomous driving, right? Are there any major reasons why autonomous driving favors 3D object detection over 2D object detection?


pangsu0613 commented 3 years ago

@vignesh628, my understanding is that 2D object detection by itself is not enough, especially when you consider path planning and control in your self-driving platform. 2D detection may be OK for many driving scenarios, but it will fail in many others. In most scenarios, the autonomous driving system needs accurate 3D or BEV information about all objects in the environment to make path planning and control decisions. That's why all autonomous driving companies (except for Tesla, whose system is more like an ADAS system, not a fully self-driving system) are using 3D LiDAR and other sensors to do 3D object detection. Computers are not as intelligent as humans: we can estimate the 3D velocities and distances of other objects well from our own "cameras" (eyes), but currently computers cannot.