Xianpeng919 / MonoCon

Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection (AAAI'22)

Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection

By Xianpeng Liu, Nan Xue and Tianfu Wu

Introduction

This repository includes the official implementation of the paper 'Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection' and an unofficial implementation of the excellent work MonoDLE. Motivated by the Cramér–Wold theorem in measure theory, MonoCon is a simple yet effective formulation for monocular 3D object detection that uses no extra information (lidar, depth, CAD models, image sequences, etc.). It learns monocular contexts as auxiliary tasks during training to help monocular 3D object detection. The key idea is that the annotated 3D bounding boxes of objects in an image provide a rich set of well-posed projected 2D supervision signals at training time, such as the projected corner keypoints and their offset vectors with respect to the center of the 2D bounding box, and these signals should be exploited as auxiliary tasks. MonoCon outperforms prior art in both accuracy and speed.
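To make the key idea concrete, here is a minimal NumPy sketch of how projected corner keypoints and their offsets from the 2D box center can be derived from one annotated 3D box. It is an illustration only, not the code used in this repo. It assumes the KITTI label convention (dimensions as height, width, length; location at the bottom center of the box in camera coordinates; rotation around the camera y-axis) and a 3x4 camera projection matrix. The function names `box3d_corners`, `project_to_image` and `keypoint_targets` are hypothetical.

```python
import numpy as np


def box3d_corners(dims, loc, ry):
    """Return the 8 corners, shape (3, 8), of a 3D box in camera coordinates.

    Assumed KITTI convention: dims = (h, w, l), loc = bottom-center (x, y, z)
    with the y-axis pointing down, ry = rotation around the y-axis.
    """
    h, w, l = dims
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([0, 0, 0, 0, -h, -h, -h, -h], dtype=float)
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(ry), np.sin(ry)
    rot = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return rot @ np.vstack([x, y, z]) + np.asarray(loc, dtype=float).reshape(3, 1)


def project_to_image(points_3d, P):
    """Project (3, N) camera-frame points with a 3x4 projection matrix P."""
    homo = np.vstack([points_3d, np.ones((1, points_3d.shape[1]))])
    uvw = P @ homo
    return (uvw[:2] / uvw[2]).T  # (N, 2) pixel coordinates


def keypoint_targets(dims, loc, ry, P, bbox2d):
    """Projected corner keypoints and their offsets from the 2D box center.

    bbox2d = (x1, y1, x2, y2) is the annotated 2D bounding box.
    """
    keypoints = project_to_image(box3d_corners(dims, loc, ry), P)
    x1, y1, x2, y2 = bbox2d
    center = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
    return keypoints, keypoints - center
```

Both outputs come directly from the existing 3D annotation and the camera calibration, which is why these signals are available in training without any extra data source.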


Usage

Installation

This repo is tested on python=3.6, cuda=10.1, pytorch=1.5.1, mmcv-full=1.3.1, mmdetection=2.11.0, mmsegmentation=0.13.0 and mmdetection3D=0.14.0.

Note: mmdetection and mmdetection3D have made major compatibility-breaking changes in their latest versions, which are not compatible with this repo. Make sure you install the versions listed above. We will update our code to make it compatible with the latest versions in the future; please stay tuned.

Follow the instructions below to install:

conda create -n monocon python=3.6
conda activate monocon
git clone https://github.com/Xianpeng919/MonoCon
cd MonoCon
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.1 -c pytorch
pip install mmcv-full==1.3.1 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.5.0/index.html
cd ./mmdetection-2.11.0
pip install -r requirements/build.txt
pip install -v -e .
cd ..
pip install mmsegmentation==0.13.0
cd ./mmdetection3d-0.14.0
pip install -v -e .
cd ..
pip install timm
pip uninstall pycocotools
pip uninstall mmpycocotools
pip install mmpycocotools
cp -rT mmdetection-2.11.0-extra-monocon mmdetection-2.11.0/
cp -rT monocon mmdetection3d-0.14.0/
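
Because the repo depends on exact versions, it can help to confirm what is actually installed once the steps above finish. The following is a minimal sketch, not part of this repo. The helper name `find_version_mismatches` is hypothetical, and it assumes each package exposes a `__version__` attribute under its import name (`mmcv` for mmcv-full, `mmdet` for mmdetection, `mmseg` for mmsegmentation, `mmdet3d` for mmdetection3D).

```python
import importlib

# Versions listed in this README, keyed by import name.
EXPECTED_VERSIONS = {
    "torch": "1.5.1",
    "mmcv": "1.3.1",
    "mmdet": "2.11.0",
    "mmseg": "0.13.0",
    "mmdet3d": "0.14.0",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS,
                            importer=importlib.import_module):
    """Return {module: (expected, found)} for every module that does not match.

    found is None when the module cannot be imported or has no __version__.
    Local build tags such as "+cu101" are ignored in the comparison.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            found = getattr(importer(name), "__version__", None)
        except Exception:
            # Importing may fail for reasons other than a missing package,
            # for example a dependency check that runs at import time.
            found = None
        if found is None or str(found).split("+")[0] != want:
            mismatches[name] = (want, found)
    return mismatches
```

Calling `find_version_mismatches()` inside the `monocon` environment returns an empty dict when every package matches; any entry in the result names a package to reinstall.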

Data Preparation

Download the KITTI dataset and organize the data following the official instructions in mmdetection3D. Then generate the data by running:

cd ./mmdetection3d-0.14.0
python create_data_tools_monocon/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
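
Before running the command above, a quick check that the dataset folders are in place can save a failed run. This is a minimal sketch under an assumption: the folder list is the image, calibration and label part of the KITTI layout described in the mmdetection3D instructions, and it may not be everything `create_data.py` needs. The helper name `missing_kitti_dirs` is hypothetical.

```python
from pathlib import Path

# Assumed layout, based on the mmdetection3D KITTI instructions; adjust the
# list if your copy of the instructions names different folders.
EXPECTED_KITTI_DIRS = (
    "ImageSets",
    "training/calib",
    "training/image_2",
    "training/label_2",
    "testing/calib",
    "testing/image_2",
)


def missing_kitti_dirs(root, expected=EXPECTED_KITTI_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_kitti_dirs("./data/kitti")` run from `./mmdetection3d-0.14.0` lists any of the assumed folders that are absent; an empty list means all of them were found.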

Training

# dir: ./mmdetection3d-0.14.0
CUDA_VISIBLE_DEVICES=0 python ./tools/train.py configs/monocon/monocon_dla34_200e_kitti.py

Inference and Evaluation

# dir: ./mmdetection3d-0.14.0
CUDA_VISIBLE_DEVICES=0 python ./tools/test.py configs/monocon/monocon_dla34_inference_200e_kitti.py ./work_dirs/ur_ckpt_location --eval bbox

We provide pre-trained checkpoints. The model, config and pretrained weights for MonoDLE* will be released soon. See the table below for performance. (Inference speed is tested on an Nvidia 2080Ti.)

[Update 06/23]: We have added more checkpoints and training logs of MonoCon to support reproducible research.

| Method | AP40@Easy | AP40@Mod. | AP40@Hard | FPS/log | Link |
| --- | --- | --- | --- | --- | --- |
| MonoCon (paper) | 26.33 | 19.03 | 16.00 | 40 | Model |
| MonoCon (reproduced #1) | 25.09 | 18.91 | 15.99 | Log1 | Model |
| MonoCon (reproduced #2) | 25.99 | 18.98 | 16.13 | Log2 | Model |
| MonoCon (reproduced #3) | 25.86 | 18.78 | 16.00 | Log3 | Model |
| MonoCon (reproduced #4) | 25.21 | 18.74 | 15.87 | Log4 | Model |
| MonoCon (reproduced #5) | 25.92 | 19.08 | 16.03 | Log5 | Model |
| Average | 25.74 | 18.92 | 16.00 | - | - |

License

This project is released under the Apache 2.0 license.

Citation

Please consider citing our paper in your publications if it helps your research.

@InProceedings{liu2022monocon,
    title={Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection},
    author={Xianpeng Liu and Nan Xue and Tianfu Wu},
    booktitle = {36th AAAI Conference on Artificial Intelligence (AAAI)},
    month = {February},
    year = {2022}
}

@inproceedings{li2020attentive,
  title={Attentive normalization},
  author={Li, Xilai and Sun, Wei and Wu, Tianfu},
  booktitle={European Conference on Computer Vision},
  pages={70--87},
  year={2020},
  organization={Springer}
}

Acknowledgement

This repo benefits from the awesome work of mmdetection, mmdetection3D, MonoDLE, MonoFlex and RTM3D. Please also consider citing them.

Related Links

[Update 06/23]: Here are some reproduced versions of MonoCon: monocon-pytorch, MonoDetector.

Contact

If you have any questions about this project, please feel free to contact xliu59@ncsu.edu.