happinesslz / EPNet

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection(ECCV 2020)
MIT License

EPNet

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection (ECCV 2020). The paper is now available (EPNet), and the code is based on PointRCNN.

Highlights

  1. No extra image annotations are required, e.g. 2D bounding boxes or semantic labels.
  2. A more accurate multi-scale point-wise fusion of image and point cloud features.
  3. The proposed CE loss greatly improves 3D detection performance.
  4. No GT augmentation (GT AUG) is used.

Contributions

This is a PyTorch implementation of EPNet on the KITTI dataset, mainly implemented by Liu Zhe and Huang Tengteng. Some parts also benefit from Chen Xiwu.

Abstract

In this paper, we aim at addressing two critical issues in the 3D detection task, including the exploitation of multiple sensors (namely LiDAR point cloud and camera image), as well as the inconsistency between the localization and classification confidence. To this end, we propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner without any image annotations. Besides, a consistency enforcing loss is employed to explicitly encourage the consistency of both the localization and classification confidence. We design an end-to-end learnable framework named EPNet to integrate these two components. Extensive experiments on the KITTI and SUN-RGBD datasets demonstrate the superiority of EPNet over the state-of-the-art methods.
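To make the consistency enforcing idea concrete, here is a minimal sketch. It assumes the loss has the form -log(c * IoU), where c is the predicted classification confidence of a box and IoU is its overlap with the matched ground-truth box, averaged over boxes. The function name, the averaging, and the clamping constant `eps` are illustrative assumptions and are not taken from this repository.

```python
import math

def consistency_enforcing_loss(confidences, ious, eps=1e-6):
    """Mean of -log(c * IoU) over boxes (assumed form of the CE loss)."""
    if len(confidences) != len(ious):
        raise ValueError("confidences and ious must have the same length")
    if not confidences:
        return 0.0
    total = 0.0
    for c, iou in zip(confidences, ious):
        # Clamp the product away from zero so the logarithm stays finite.
        total += -math.log(max(c * iou, eps))
    return total / len(confidences)

# A confident box with good localization is penalized less than an
# equally confident box with poor localization.
consistent = consistency_enforcing_loss([0.9], [0.9])
inconsistent = consistency_enforcing_loss([0.9], [0.3])
print(consistent < inconsistent)
```

Under this form the loss is small only when confidence and IoU are both high, which is what pushes the two quantities to agree.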

image

Network

The architecture of our two-stream RPN is shown below. image

The architecture of our LI-Fusion module in the two-stream RPN is shown below. image
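The point-wise fusion idea can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the repository's module: it assumes each LiDAR point has already been projected to continuous pixel coordinates (u, v), that the image feature is read by bilinear interpolation, and that a scalar sigmoid gate weights the image feature before it is concatenated with the point feature. `bilinear_sample`, `fuse_point`, and `gate_weights` are hypothetical names.

```python
import math

def bilinear_sample(feature_map, u, v):
    """Sample an H x W x C nested-list feature map at continuous pixel (u, v)."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp to the valid range so border points still get a feature.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    channels = len(feature_map[0][0])
    return [
        (1 - du) * (1 - dv) * feature_map[v0][u0][c]
        + du * (1 - dv) * feature_map[v0][u1][c]
        + (1 - du) * dv * feature_map[v1][u0][c]
        + du * dv * feature_map[v1][u1][c]
        for c in range(channels)
    ]

def fuse_point(point_feat, image_feat, gate_weights, gate_bias=0.0):
    """Gate the image feature with a scalar attention weight, then concatenate."""
    joint = point_feat + image_feat  # list concatenation
    score = sum(w * x for w, x in zip(gate_weights, joint)) + gate_bias
    gate = 1.0 / (1.0 + math.exp(-score))
    return point_feat + [gate * x for x in image_feat]

# A 2 x 2 image feature map with one channel, and one point with two features.
image_features = [[[0.0], [1.0]], [[2.0], [3.0]]]
sampled = bilinear_sample(image_features, 0.5, 0.5)
fused = fuse_point([0.2, 0.4], sampled, gate_weights=[0.0, 0.0, 0.0])
print(sampled, fused)
```

In a real network the gate would be learned and the operation applied to whole tensors at several scales. The scalar gate here only shows how a point can down-weight an unreliable image feature.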

Install (same as PointRCNN)

The Environment:

a. Clone the EPNet repository.

git clone https://github.com/happinesslz/EPNet.git

b. Install the required Python libraries, such as easydict, tqdm, and tensorboardX.

c. Build and install the pointnet2_lib, iou3d, roipool3d libraries by executing the following command:

sh build_and_install.sh

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

EPNet
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├──training
│   │   │      ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │   ├──testing
│   │   │      ├──calib & velodyne & image_2
├── lib
├── pointnet2_lib
├── tools
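Before training, it can help to confirm that the layout above is in place. The following is a small sketch using only the standard library. `missing_kitti_dirs` is a hypothetical helper, not part of this repository, and it checks directories only, not their contents.

```python
from pathlib import Path

# Sub-directories listed in the layout above; "planes" is optional and not checked.
EXPECTED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}

def missing_kitti_dirs(repo_root):
    """Return the expected KITTI directories that are absent under repo_root."""
    kitti = Path(repo_root) / "data" / "KITTI"
    required = [kitti / "ImageSets"]
    for split, names in EXPECTED.items():
        required += [kitti / "object" / split / name for name in names]
    return [str(p) for p in required if not p.is_dir()]

if __name__ == "__main__":
    # Run from the EPNet repository root.
    for path in missing_kitti_dirs("."):
        print("missing:", path)
```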

Trained model

Results for the Car class (AP with 40 recall positions):

LI Fusion   CE loss   Easy    Moderate   Hard    mAP     Models
No          No        88.76   78.03      76.20   80.99   Google, Baidu (a43t)
Yes         No        89.93   80.77      77.25   82.65   Google, Baidu (dbxy)
No          Yes       92.12   81.48      79.34   84.31   Google, Baidu (hrkv)
Yes         Yes       92.17   82.68      80.10   84.99   Google, Baidu (nasm)

Besides, adding an IoU branch to EPNet (the last row of the above table) brings a minor improvement and makes the results more stable. The result is 92.50 (Easy), 82.45 (Moderate), 80.29 (Hard), 85.08 (mAP), and the model checkpoint can be obtained from Google, Baidu (8sir).
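The mAP values appear to be the mean of the Easy, Moderate and Hard results. This is an inference from the reported numbers, not something stated here. The short check below recomputes that mean for each configuration; the row labels are shorthand introduced for this example. Two rows differ from the reported value by about 0.01, which is probably due to rounding of the per-level results.

```python
# (Easy, Moderate, Hard, reported mAP) copied from the results above.
# Row labels are shorthand for this example only.
rows = {
    "baseline": (88.76, 78.03, 76.20, 80.99),
    "LI Fusion": (89.93, 80.77, 77.25, 82.65),
    "CE loss": (92.12, 81.48, 79.34, 84.31),
    "LI Fusion + CE loss": (92.17, 82.68, 80.10, 84.99),
    "LI Fusion + CE loss + IoU branch": (92.50, 82.45, 80.29, 85.08),
}

def mean_ap(easy, moderate, hard):
    """Unweighted mean over the three KITTI difficulty levels."""
    return (easy + moderate + hard) / 3.0

for name, (easy, moderate, hard, reported) in rows.items():
    computed = mean_ap(easy, moderate, hard)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```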

To evaluate these models, download them, unzip them, and place them in "./log/Car/models":

cd ./tools
mkdir -p log/Car/models
bash run_eval_model.sh

Implementation

Training

Run EPNet on a single GPU:

CUDA_VISIBLE_DEVICES=0 python train_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --batch_size 2 --train_mode rcnn_online --epochs 50 --ckpt_save_interval 1 --output_dir ./log/Car/full_epnet_without_iou_branch/   --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0

Run EPNet on two GPUs:

CUDA_VISIBLE_DEVICES=0,1 python train_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --batch_size 6 --train_mode rcnn_online --epochs 50 --mgpus --ckpt_save_interval 1 --output_dir ./log/Car/full_epnet_without_iou_branch/   --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0

Testing

CUDA_VISIBLE_DEVICES=2 python eval_rcnn.py --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml --eval_mode rcnn_online  --eval_all  --output_dir ./log/Car/full_epnet_without_iou_branch/eval_results/  --ckpt_dir ./log/Car/full_epnet_without_iou_branch/ckpt --set  LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2  RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2  USE_IOU_BRANCH False
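The commands above pass configuration overrides as KEY VALUE pairs after --set, with dots separating nested sections. The sketch below shows one way such a list could be applied to a nested dictionary. It is not the repository's parser: `apply_overrides` is a hypothetical helper, and the default values in `cfg` are placeholders.

```python
import ast

def apply_overrides(config, pairs):
    """Apply a flat [key, value, key, value, ...] list to a nested dict."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for dotted_key, raw in zip(pairs[0::2], pairs[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        try:
            value = ast.literal_eval(raw)  # "True" -> True, "0.2" -> 0.2
        except (ValueError, SyntaxError):
            value = raw  # leave non-literal strings untouched
        node[leaf] = value
    return config

# Default values here are placeholders, not the repository's defaults.
cfg = {
    "LI_FUSION": {"ENABLED": False, "ADD_Image_Attention": False},
    "RCNN": {"POOL_EXTRA_WIDTH": 1.0, "SCORE_THRESH": 0.3},
    "USE_IOU_BRANCH": True,
}
overrides = "LI_FUSION.ENABLED True RCNN.SCORE_THRESH 0.2 USE_IOU_BRANCH False"
apply_overrides(cfg, overrides.split())
print(cfg)
```

Rejecting unknown keys is a deliberate choice in this sketch: a misspelled override fails loudly instead of silently leaving the default in place.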

Acknowledgement

The code is based on PointRCNN.

Citation

If you find this work useful in your research, please consider citing:

@inproceedings{Huang2020EPNetEP,
  title={EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection},
  author={Tengteng Huang and Zhe Liu and Xiwu Chen and Xiang Bai},
  booktitle ={ECCV},
  month = {July},
  year={2020}
}
@InProceedings{Shi_2019_CVPR,
    author = {Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
    title = {PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}