open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0

Support BEVDet series #1524

Open HuangJunJie2017 opened 2 years ago

HuangJunJie2017 commented 2 years ago

Describe the feature: Support the BEVDet series in mmdetection3d.

Motivation: BEVDet is the most elegant and powerful paradigm for multi-camera 3D object detection.

Related resources: https://github.com/HuangJunJie2017/BEVDet

ZCMax commented 2 years ago

We'll consider supporting BEVDet in the future. Multi-camera 3D object detection is also a task we focus on.

HuangJunJie2017 commented 2 years ago

Hi, I am undertaking this feature. As far as I know, rescaling/cropping/rotating of multi-view images has not been implemented in any mmlab repo. Can mmdet3d support this before I reproduce BEVDet in this repo? Or should I add these features along with BEVDet by constructing some new classes named ResizeMultiView/RandomCropMultiView/RandomRotateMultiView?

ZCMax commented 2 years ago

@Tai-Wang Please have a look at this and give some suggestions.

Tai-Wang commented 2 years ago

Hi @HuangJunJie2017, glad to hear that you are developing this feature. Actually, we have also implemented data augmentations for multi-view images, but the implementation is simple and somewhat inelegant. Do you have any ideas for re-using 2D pipelines generally in multi-view cases? For example, a pipeline that can wrap any 2D pipeline and extend it to multi-view images?

HuangJunJie2017 commented 2 years ago

Hi, @Tai-Wang I recommend using a wrapper like this:

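```python
# Import paths assume mmcv 1.x / mmdet 2.x; adjust for your versions.
from mmcv.utils import build_from_cfg
from mmdet.datasets.builder import PIPELINES


@PIPELINES.register_module()
class MultiViewWrapper:
    """Apply a single-image transform to every view in input_dict['img']."""

    def __init__(self, transform):
        self.t = build_from_cfg(transform, PIPELINES)

    def __call__(self, input_dict):
        for img_id in range(len(input_dict['img'])):
            process_dict = dict(img=input_dict['img'][img_id])
            process_dict = self.t(process_dict)
            # Write back by index (the integer img_id, not the string 'img_id').
            input_dict['img'][img_id] = process_dict['img']
        return input_dict
```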
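The wrapper idea above can be tried without installing mmcv or mmdet. The following is only a minimal sketch under stated assumptions: the registry lookup is replaced by passing a callable directly, images are plain nested lists instead of arrays, and `HorizontalFlip` is a hypothetical stand-in for any single-image transform that reads and writes `results['img']`.

```python
class HorizontalFlip:
    """Hypothetical single-image transform: reverses every row of the image."""

    def __call__(self, results):
        results['img'] = [row[::-1] for row in results['img']]
        return results


class MultiViewWrapper:
    """Apply one single-image transform to each view in input_dict['img']."""

    def __init__(self, transform):
        # Assumption: a ready-made callable is passed in, not a config dict.
        self.transform = transform

    def __call__(self, input_dict):
        for img_id, img in enumerate(input_dict['img']):
            # Each view is packed into its own dict so the 2D transform
            # sees the same interface it would see for a single image.
            out = self.transform(dict(img=img))
            input_dict['img'][img_id] = out['img']
        return input_dict


# Two 1x3 "images" standing in for two camera views.
views = {'img': [[[1, 2, 3]], [[4, 5, 6]]]}
result = MultiViewWrapper(HorizontalFlip())(views)
```

Note that a transform with random parameters is called once per view here, so each view may draw different parameters. Whether that is desirable for a given augmentation is a design choice this sketch leaves open.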

Tai-Wang commented 2 years ago

Yes, I also think this would be a better implementation. Because we are doing some large refactoring, I recommend you first implement this feature following your idea and open a PR for discussion. Our support for multi-view-related features will be released afterward and may help refactor your PR at the same time.

Could you please provide an email address registered with Slack? We can discuss more details there.

HuangJunJie2017 commented 2 years ago

junjie.huang@ieee.org is ok

HuangJunJie2017 commented 2 years ago

By the way, random rotation is missing in mmdetection.... 'Rotate' only supports rotating images on a fixed scale~

Tai-Wang commented 2 years ago

> By the way, random rotation is missing in mmdetection.... 'Rotate' only supports rotating images on a fixed scale~

I have sent the invitation. We can consider contributing the necessary pipelines to mmdet, or adding them experimentally to mmdet3d first.
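As an illustration of the random rotation discussed in this thread, here is a rough sketch of how a random angle could be sampled and turned into a 2x3 affine matrix about the image center. It is not the mmdet or BEVDet implementation. The function names and the `angle_range` parameter are hypothetical, and only the Python standard library is used.

```python
import math
import random


def rotation_about_center(angle_deg, center):
    """Return a 2x3 affine matrix rotating points by angle_deg about center."""
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # p' = R (p - c) + c, written as a linear part plus a translation.
    return [
        [cos_t, -sin_t, cx - cos_t * cx + sin_t * cy],
        [sin_t, cos_t, cy - sin_t * cx - cos_t * cy],
    ]


def sample_rotation(angle_range, center, rng=random):
    """Draw an angle uniformly from angle_range and build its matrix."""
    angle = rng.uniform(*angle_range)
    return angle, rotation_about_center(angle, center)


def apply_affine(matrix, point):
    """Map a single (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Example: one random rotation in [-5.4, 5.4] degrees (an arbitrary range).
angle, matrix = sample_rotation((-5.4, 5.4), center=(352.0, 128.0))
```

A matrix of this shape could then be handed to an affine warp such as OpenCV's `cv2.warpAffine` after conversion to a float array. A multi-view 3D pipeline would presumably also need to keep the matrix so that image-to-3D projection stays consistent, which is an assumption here rather than something stated in the thread.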