emecercelik / ssl-3d-detection

3D Object Detection with a Self-supervised Lidar Scene Flow Backbone

This repository contains the implementation for 3D Object Detection with a Self-supervised Lidar Scene Flow Backbone.

Figure: Self-supervised scene flow pre-training pipeline for the downstream 3D object detection task.

Introduction

In this work, we propose a self-supervised training strategy for learning a general point cloud backbone model for downstream 3D vision tasks. 3D scene flow can be estimated with self-supervised learning using cycle consistency, which removes the need for labelled data. Moreover, perceiving objects in traffic scenarios relies heavily on making sense of sparse data in its spatio-temporal context. Our main contribution leverages learned flow and motion representations and combines a self-supervised backbone with a 3D detection head, focusing mainly on the relation between the scene flow and detection tasks. In this way, self-supervised scene flow training builds point motion features in the backbone, which help the 3D detection head distinguish objects by their different motion patterns.
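As a rough illustration of the cycle-consistency idea mentioned above, the following NumPy sketch computes two self-supervised losses for a pair of consecutive point clouds. It is not the code from this repository: the function names, the `flow_fn(source, target)` signature, and the toy `mean_shift_flow` estimator are all hypothetical stand-ins for a learned scene flow network, and the loss definitions are simplified.

```python
import numpy as np


def nearest_neighbor_loss(warped, target):
    """Mean distance from each warped point to its closest target point."""
    # Pairwise distances, shape (N, M); fine for small toy clouds only.
    dists = np.linalg.norm(warped[:, None, :] - target[None, :, :], axis=-1)
    return dists.min(axis=1).mean()


def cycle_consistency_loss(points_t, points_t1, flow_fn):
    """Warp points forward, then backward, and measure the round-trip error.

    flow_fn(source, target) is a hypothetical callable returning one 3D flow
    vector per source point, standing in for a scene flow network.
    """
    forward_flow = flow_fn(points_t, points_t1)
    warped = points_t + forward_flow           # estimate of the cloud at t+1
    backward_flow = flow_fn(warped, points_t)
    cycled = warped + backward_flow            # should land back on points_t
    return np.linalg.norm(cycled - points_t, axis=-1).mean()


def mean_shift_flow(source, target):
    """Toy flow estimator (assumption): one global translation for all points."""
    shift = target.mean(axis=0) - source.mean(axis=0)
    return np.broadcast_to(shift, source.shape).copy()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud_t = rng.normal(size=(64, 3))
    cloud_t1 = cloud_t + np.array([1.0, 0.0, 0.0])  # rigid shift along x

    warped = cloud_t + mean_shift_flow(cloud_t, cloud_t1)
    print("cycle loss:", cycle_consistency_loss(cloud_t, cloud_t1, mean_shift_flow))
    print("nearest-neighbor loss:", nearest_neighbor_loss(warped, cloud_t1))
```

Note that an all-zero flow also yields a zero cycle loss, which is why a cycle term is typically paired with a term such as the nearest-neighbor loss that ties the warped points to the next frame. Neither loss uses any flow or box labels.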

We evaluate our method on Point-GNN, PointPillars, CenterPoint, and SSN 3D detectors.

For our self-supervised scene flow implementation on Point-GNN, please refer to the pointgnn folder. For the other 3D detectors, please see the mmdetection3d folder for details.

Citation

@article{erccelik20223d,
  title={3D Object Detection with a Self-supervised Lidar Scene Flow Backbone},
  author={Er{\c{c}}elik, Eme{\c{c}} and Yurtsever, Ekim and Liu, Mingyu and Yang, Zhijie and Zhang, Hanzhen and Top{\c{c}}am, P{\i}nar and Listl, Maximilian and {\c{C}}ayl{\i}, Y{\i}lmaz Kaan and Knoll, Alois},
  journal={arXiv preprint arXiv:2205.00705},
  year={2022}
}

Dataset

Dataset     Point-GNN   PointPillars   CenterPoint   SSN
KITTI
nuScenes

Acknowledgement

This repository builds on Point-GNN, FlowNet3D, Just Go with the Flow: Self-supervised Scene Flow Estimation, and mmdetection3D.