RookieXwc/GPICTURE

Mutual Information-Driven Self-supervised Point Cloud Pre-training

Weichen Xu, Tianhao Fu, Jian Cao, Xinyu Zhao, Xinxin Xu, Xixin Cao, Xing Zhang
Peking University, Beijing 100871, China

:books: Outline

Location of Key Codes
Main Results
Getting Started

:sparkles: Location of Key Codes

In pcdet/models/dense_heads/pretrain_head_3D_seal.py, we provide the implementation of the High-level Voxel Feature Generation Module, which covers data extraction, voxelization, and the computation of the target and loss.

In pcdet/models/backbones_3d/I2Mask.py and pcdet/models/backbones_3d/dsvt_backbone_mae.py, we provide the implementation of Inter-class and Intra-class Discrimination-guided Masking (I$^2$Mask).

In pcdet/utils/cka_alhpa.py and pcdet/models/dense_heads/pretrain_head_3D_seal.py (L172), we provide the implementation of CKA-guided Hierarchical Reconstruction.

In pcdet/models/dense_heads/pretrain_head_3D_seal.py (L167, def differential_gated_progressive_learning), we provide the implementation of Differential-gated Progressive Learning.

All configuration files used in the paper and appendix are provided in tools/cfgs/gpicture_models/.
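
For readers unfamiliar with the similarity measure named above, here is a minimal sketch of the standard linear CKA score between two feature matrices, assuming "CKA" stands for centered kernel alignment. It is not taken from pcdet/utils/cka_alhpa.py, which may use a different variant. The function names are illustrative, and plain Python lists are used so the snippet runs without extra packages.

```python
import math


def center_columns(m):
    """Subtract the per-column mean from an n x d matrix given as a list of rows."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_frobenius_sq(a, b):
    """Return ||A^T B||_F^2 for two matrices with the same number of rows."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between x (n x d1) and y (n x d2); rows are matched samples.

    Assumes neither input is constant across samples, otherwise the
    denominator is zero.
    """
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(yc, xc)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return numerator / denominator
```

The score is 1 when the two feature sets are identical up to an isotropic scale and 0 when their centered columns are mutually orthogonal, which is what makes it usable for comparing representations from different layers.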


:car: Main Results

Pre-training

Waymo

| Model | Pre-train Fraction | Pre-train model | Log |
|---|---|---|---|
| GPICTURE (DSVT) | 20% | ckpt | Log |
| GPICTURE (DSVT) | 100% | ckpt | Log |

nuScenes

| Model | Pre-train model | Log |
|---|---|---|
| GPICTURE (DSVT) | ckpt | Log |

Fine-tuning

3D Object Detection (on Waymo validation)

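| Model | Pre-train Fraction | L2 mAP/mAPH | Vehicle L2 AP/APH | Pedestrian L2 AP/APH | Cyclist L2 AP/APH | ckpt | Log |
|---|---|---|---|---|---|---|---|
| DSVT (GPICTURE) | 20% | 73.84/71.75 | 71.55/71.22 | 75.99/70.61 | 73.98/73.42 | ckpt | Log |
| DSVT (GPICTURE) | 100% | 75.55/73.13 | 73.38/72.87 | 77.52/72.01 | 75.75/74.51 | ckpt | Log |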

3D Object Detection (on nuScenes validation)

| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | ckpt | Log |
|---|---|---|---|---|---|---|---|---|---|
| DSVT (GPICTURE) | 68.6 | 73.0 | 25.5 | 23.8 | 25.8 | 20.7 | 17.4 | ckpt | Log |

3D Semantic Segmentation (on nuScenes validation)

| Model | mIoU | Bicycle | Bus | Car | Motorcycle | Pedestrian | Trailer | Truck | ckpt | Log |
|---|---|---|---|---|---|---|---|---|---|---|
| Cylinder3D-SST (GPICTURE) | 79.7 | 43.6 | 94.8 | 96.5 | 81.0 | 84.4 | 65.8 | 87.7 | ckpt | Log |

Occupancy Prediction (on nuScenes OpenOccupancy validation)

| Model | mIoU | Bicycle | Bus | Car | Motorcycle | Pedestrian | Trailer | Truck | ckpt | Log |
|---|---|---|---|---|---|---|---|---|---|---|
| DSVT (GPICTURE) | 18.8 | 8.4 | 16.2 | 21.1 | 7.9 | 12.8 | 15.9 | 16.3 | ckpt | Log |


🏃‍♂️ Getting Started

⬇️1. Download Weights of MinkUNet (Res16UNet34C) Pre-trained by Seal

[youquanl/Segment-Any-Point-Cloud: [NeurIPS'23 Spotlight] Segment Any Point Cloud Sequences by Distilling Vision Foundation Models](https://github.com/youquanl/Segment-Any-Point-Cloud)

After downloading, place the weights file in the project path.


⚒️2. Prepare Dataset

Waymo:

1. Download the Waymo dataset from the official Waymo website. Make sure to download version 1.2.0 of the Perception Dataset.

2. Prepare the directory as follows:

GPICTURE
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
├── pcdet
├── tools

3. Prepare the environment and install waymo-open-dataset:

pip install waymo-open-dataset-tf-2-5-0

4. Generate the complete dataset. This requires approximately 1 TB of disk space and 100 GB of RAM.

# only for single-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos \
    --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml

# for single-frame or multi-frame setting
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset_multiframe.yaml
# Ignore 'CUDA_ERROR_NO_DEVICE' error as this process does not require GPU.
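
Because step 4 is long-running and disk-hungry, it can be worth confirming the layout from step 2 and the available disk space before launching it. The following is a minimal sketch using only the Python standard library. The helper names are illustrative and not part of the repository, and the script assumes it is run from the GPICTURE project directory.

```python
import shutil
from pathlib import Path


def missing_paths(root, required):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


def count_tfrecords(raw_dir):
    """Count segment-*.tfrecord files directly inside `raw_dir`."""
    return len(list(Path(raw_dir).glob("segment-*.tfrecord")))


def free_space_tb(path):
    """Free disk space at `path` in terabytes (10**12 bytes)."""
    return shutil.disk_usage(path).free / 1e12


if __name__ == "__main__":
    root = Path("data/waymo")  # layout from step 2, relative to the project directory
    missing = missing_paths(root, ["ImageSets", "raw_data"])
    if missing:
        print("missing under data/waymo:", ", ".join(missing))
    else:
        print("tfrecord files:", count_tfrecords(root / "raw_data"))
        print("free space (TB): %.2f" % free_space_tb(root))
```

The same `missing_paths` helper can be pointed at the nuScenes layouts below by passing a different root and list of required entries.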

nuScenes:

1. Prepare the trainval dataset from nuScenes and prepare the directory as follows:

GPICTURE
├── data
│   ├── nuscenes
│   │   │── v1.0-trainval
│   │   │   │── samples
│   │   │   │── sweeps
│   │   │   │── maps
│   │   │   │── v1.0-trainval  
├── pcdet
├── tools

2. Prepare the environment and install nuscenes-devkit:

pip install nuscenes-devkit==1.0.5

3. Generate the complete dataset.

# for lidar-only setting
python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml --version v1.0-trainval

nuScenes Lidarseg:

1. Download the annotation files from nuScenes and prepare the directory as follows:

GPICTURE
├── data
│   ├── nuscenes
│   │   │── v1.0-trainval
│   │   │   │── samples
│   │   │   │── sweeps
│   │   │   │── maps
│   │   │   │── v1.0-trainval  
│   │   │   │   │── lidarseg.json
│   │   │   │   │── category.json
│   │   │── lidarseg
│   │   │   │── v1.0-trainval  
├── pcdet
├── tools

nuScenes OpenOccupancy:

1. Download the annotation files from OpenOccupancy and prepare the directory as follows:

GPICTURE
├── data
│   ├── nuscenes
│   │   │── v1.0-trainval
│   │   │   │── samples
│   │   │   │── sweeps
│   │   │   │── maps
│   │   │   │── v1.0-trainval  
│   │   │   │   │── lidarseg.json
│   │   │   │   │── category.json
│   │── nuScenes-Occupancy
├── pcdet
├── tools

2. Prepare the environment:

conda install -c omgarcia gcc-6 # gcc-6.2
pip install mmcv-full==1.4.0
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1

# Install mmdet3d from source code.
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v0.17.1 # Other versions may not be compatible.
python setup.py install

# Install occupancy pooling.
git clone https://github.com/JeffWang987/OpenOccupancy.git
cd OpenOccupancy
export PYTHONPATH="."
python setup.py develop


⚒️3. Prepare the Environment

1. Create the environment and install PyTorch

conda create --name gpicture python=3.8
conda activate gpicture
# install pytorch
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.1 -c pytorch -c conda-forge
# Verify that PyTorch is installed (run the lines below in a Python interpreter)
import torch    # If normal, remain silent
print(torch.cuda.is_available())    # If normal, prints "True"
a = torch.Tensor([1.])    # If normal, remain silent
a.cuda()    # If normal, returns "tensor([1.], device='cuda:0')"
from torch.backends import cudnn    # If normal, remain silent
cudnn.is_acceptable(a.cuda())    # If normal, returns "True"

2. Install OpenPCDet

# install spconv
pip install spconv-cu111
# install requirements
pip install -r requirements.txt
# setup
python setup.py develop

3. Install other packages

# install other packages
pip install torch_scatter
pip install nuscenes-devkit==1.0.5
pip install open3d

# install the Python package for evaluating the Waymo dataset
pip install waymo-open-dataset-tf-2-5-0==1.4.1

# pay attention to specific package versions.
pip install pandas==1.4.3
pip install matplotlib==3.6.2
pip install scikit-image==0.19.3
pip install async-lru==1.0.3

# install CUDA extensions
cd common_ops
pip install .

4. Install MinkowskiEngine

# install MinkowskiEngine
pip install ninja
conda install openblas-devel -c anaconda
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas
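
Since several of the packages above are pinned to specific versions, a quick check after installation can catch a pin that was silently overridden by a later install. The sketch below compares the installed versions against the pins copied from the commands above, using `importlib.metadata` from the standard library (available in Python 3.8). The helper names are illustrative, and PyTorch is left out because its reported version string can carry a build suffix.

```python
from importlib import metadata

# Version pins copied from the pip commands in this section.
PINNED = {
    "nuscenes-devkit": "1.0.5",
    "waymo-open-dataset-tf-2-5-0": "1.4.1",
    "pandas": "1.4.3",
    "matplotlib": "3.6.2",
    "scikit-image": "0.19.3",
    "async-lru": "1.0.3",
}


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map package name -> (expected, found) for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running it inside the activated gpicture environment prints one line per mismatching package and nothing when every pin is satisfied.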


⚒️4. Prepare the Seal Feature for the Entire Dataset Offline

1. Prepare the coords and feats inputs.

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_ssl_seal_generate_input.yaml
# or
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_ssl_seal_generate_input.yaml

2. Use the MinkUNet (Res16UNet34C) pre-trained by Seal to generate the Seal features.

cd tools/
python prepare_seal_output.py


:rocket:5. Run the Code

The configuration files used in the paper and appendix are provided in tools/cfgs/gpicture_models/.

Pre-training

Waymo

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_ssl_seal_decoder_mask.yaml

nuScenes

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_ssl_seal_decoder_mask.yaml

Fine-tuning

3D Object Detection on Waymo:

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_waymo_detection.yaml --pretrained_model /path/of/pretrain/model.pth

3D Object Detection on nuScenes:

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_detection.yaml --pretrained_model /path/of/pretrain/model.pth

3D Semantic Segmentation:

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_segmentation.yaml --pretrained_model /path/of/pretrain/model.pth

Occupancy Prediction:

cd tools/
python train.py --cfg_file cfgs/gpicture_models/gpicture_nuscenes_occupancy.yaml --pretrained_model /path/of/pretrain/model.pth
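
The four fine-tuning commands above differ only in the config file, so they can be generated from one small table when several tasks are launched from the same checkpoint. The following is a minimal sketch of such a wrapper. The task labels and function names are hypothetical conveniences, and only the `--cfg_file` and `--pretrained_model` flags shown above are used.

```python
import subprocess

CFG_DIR = "cfgs/gpicture_models"

# Fine-tuning configs named in this README, keyed by a short task label
# (the labels are illustrative and not used anywhere in the repository).
FINETUNE_CFGS = {
    "waymo_det": "gpicture_waymo_detection.yaml",
    "nuscenes_det": "gpicture_nuscenes_detection.yaml",
    "nuscenes_seg": "gpicture_nuscenes_segmentation.yaml",
    "nuscenes_occ": "gpicture_nuscenes_occupancy.yaml",
}


def build_finetune_command(task, pretrained_model):
    """Return the train.py argument list for one fine-tuning task."""
    cfg = FINETUNE_CFGS[task]
    return [
        "python", "train.py",
        "--cfg_file", f"{CFG_DIR}/{cfg}",
        "--pretrained_model", pretrained_model,
    ]


def run_finetune(task, pretrained_model, tools_dir="tools"):
    """Run one fine-tuning job from inside tools/, raising if it fails."""
    subprocess.run(build_finetune_command(task, pretrained_model),
                   cwd=tools_dir, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; call run_finetune to execute.
    for task in FINETUNE_CFGS:
        print(" ".join(build_finetune_command(task, "/path/of/pretrain/model.pth")))
```

Passing `cwd=tools_dir` mirrors the `cd tools/` step, so the relative config paths resolve the same way as in the manual commands.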
