dvlab-research / FocalsConv

Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)
https://arxiv.org/abs/2204.12463
Apache License 2.0

Occlusion between different classes in multi modal GT-Sampling. #11

Closed rkotimi closed 2 years ago

rkotimi commented 2 years ago

Hello @yukang2017 ! I notice that your FocalsConv has been supported by OpenPCDet. Congratulations!

After reading your implementation of multi-modal GT-Sampling, I think you only handle occlusion between objects belonging to the same class on the image, which may cause severe occlusion between different classes. Is this a potential bug? Could it lead to suboptimal performance?
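To make the concern concrete, here is a minimal sketch of class-agnostic pasting. It is not the FocalsConv or OpenPCDet code; the function name `paste_far_to_near` and the dict keys `box`, `depth`, `crop`, and `label` are hypothetical and chosen only for illustration. The idea is to sort all sampled objects by depth, regardless of class, and paste the farthest first so that nearer objects cover farther ones.

```python
import numpy as np


def paste_far_to_near(image, samples):
    """Paste sampled box crops onto a copy of `image`, farthest first.

    `samples` is a list of dicts with hypothetical keys:
      "box"   -> (x1, y1, x2, y2) integer pixel coordinates
      "depth" -> distance of the object from the camera
      "crop"  -> array of shape (y2 - y1, x2 - x1, 3)
      "label" -> class name (deliberately ignored when ordering)
    """
    out = image.copy()
    # Sort across ALL classes so that a near car can cover a far pedestrian.
    for s in sorted(samples, key=lambda s: s["depth"], reverse=True):
        x1, y1, x2, y2 = s["box"]
        out[y1:y2, x1:x2] = s["crop"]
    return out


# Two overlapping samples from different classes (toy values).
canvas = np.zeros((8, 8, 3), dtype=np.uint8)
far_ped = {"box": (2, 2, 6, 6), "depth": 30.0, "label": "pedestrian",
           "crop": np.full((4, 4, 3), 100, dtype=np.uint8)}
near_car = {"box": (4, 4, 8, 8), "depth": 10.0, "label": "car",
            "crop": np.full((4, 4, 3), 200, dtype=np.uint8)}
result = paste_far_to_near(canvas, [near_car, far_ped])
```

In the overlapping region the nearer car ends up on top, whatever order the samples were passed in. Ordering per class only would leave the cross-class overlap dependent on the order in which classes are processed.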

yukang2017 commented 2 years ago

Hi! Multi-modal GT-sampling does indeed cause occlusions in images (see the image below). It also leads to suboptimal performance on some categories on nuScenes. You can refer to Table S - 16 in the paper.

I currently have no idea how to solve this problem elegantly. We might rely on pre-trained instance segmentation models to copy-paste masks instead of boxes to relieve the occlusion, but this inevitably complicates the pipeline.
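A minimal sketch of the mask-based idea, assuming a boolean instance mask is already available for each sampled crop (in practice it would come from a segmentation model, which is not shown here). The function name `paste_with_mask` and its arguments are hypothetical, not part of this repository.

```python
import numpy as np


def paste_with_mask(image, crop, mask, top_left):
    """Copy only the masked pixels of `crop` into a copy of `image`.

    image:    (H, W, 3) background image
    crop:     (h, w, 3) pixels of the sampled object's box
    mask:     (h, w) boolean array, True where the object is visible
    top_left: (x, y) pixel position of the crop in the image
    """
    x, y = top_left
    h, w = mask.shape
    out = image.copy()
    region = out[y:y + h, x:x + w]   # view into `out`
    region[mask] = crop[mask]        # background pixels of the box stay untouched
    return out


# Toy example: a 4x4 crop whose object only occupies the central 2x2 pixels.
background = np.full((8, 8, 3), 50, dtype=np.uint8)
crop = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
pasted = paste_with_mask(background, crop, mask, top_left=(2, 2))
```

Compared with pasting the whole box, only the object's pixels overwrite the scene, so less of the existing image content is hidden. The cost is the extra segmentation step needed to obtain the masks.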

image