
Apache License 2.0

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

Paper | Project Page

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction, CVPR 2024

Yuanhui Huang, Wenzhao Zheng$\dagger$, Borui Zhang, Jie Zhou, Jiwen Lu$\ddagger$

* Equal contribution $\dagger$ Project leader $\ddagger$ Corresponding author

SelfOcc empowers 3D autonomous driving world models (e.g., OccWorld) with scalable 3D representations, paving the way for interpretable end-to-end large driving models.

News

Demo

Trained using only video sequences and poses:

demo

Trained using an additional off-the-shelf 2D segmentor (OpenSeeD):

demo

legend

More demo videos can be downloaded here.

Overview

overview

Results

Getting Started

Installation

Follow detailed instructions in Installation.

Preparing Dataset

Follow detailed instructions in Prepare Dataset.

We also provide our code for synchronizing sweep data according to keyframe samples.
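The idea behind this synchronization can be pictured as nearest-timestamp matching. The sketch below is a hypothetical illustration, not the repo's actual implementation: it assumes each keyframe sample should be paired with the sweep whose timestamp is closest.

```python
import bisect

def sync_sweeps(sample_timestamps, sweep_timestamps):
    """For each keyframe sample, pick the sweep with the nearest timestamp."""
    sweep_timestamps = sorted(sweep_timestamps)
    matched = []
    for t in sample_timestamps:
        # index of the first sweep at or after the sample time
        i = bisect.bisect_left(sweep_timestamps, t)
        # compare the neighbors on either side and keep the closer one
        candidates = sweep_timestamps[max(i - 1, 0):i + 1]
        matched.append(min(candidates, key=lambda s: abs(s - t)))
    return matched

# toy timestamps: two keyframes, four sweeps
print(sync_sweeps([10, 25], [0, 9, 18, 27]))  # [9, 27]
```

The provided script additionally handles per-sensor timestamps and file paths, but the core pairing follows this nearest-neighbor rule.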

Run

[23/12/16 Update] Please update the timm package to 0.9.2 to run the training script.

3D Occupancy Prediction

Download the model weights HERE and put them under out/nuscenes/occ/

# train
python train.py --py-config config/nuscenes/nuscenes_occ.py --work-dir out/nuscenes/occ_train --depth-metric
# eval
python eval_iou.py --py-config config/nuscenes/nuscenes_occ.py --work-dir out/nuscenes/occ --resume-from out/nuscenes/occ/model_state_dict.pth --occ3d --resolution 0.4 --sem --use-mask --scene-size 4

Novel Depth Synthesis

Download the model weights HERE and put them under out/nuscenes/novel_depth/

# train
python train.py --py-config config/nuscenes/nuscenes_novel_depth.py --work-dir out/nuscenes/novel_depth_train --depth-metric
# eval
python eval_novel_depth.py --py-config config/nuscenes/nuscenes_novel_depth.py --work-dir out/nuscenes/novel_depth --resume-from out/nuscenes/novel_depth/model_state_dict.pth

Depth Estimation

Download the model weights HERE and put them under out/nuscenes/depth/

# train
python train.py --py-config config/nuscenes/nuscenes_depth.py --work-dir out/nuscenes/depth_train --depth-metric
# eval
python eval_depth.py --py-config config/nuscenes/nuscenes_depth.py --work-dir out/nuscenes/depth --resume-from out/nuscenes/depth/model_state_dict.pth --depth-metric --batch 90000

Note that evaluating at half resolution (450*800, i.e., 1:2 relative to the raw 900*1600 images) takes about 90 minutes, because we batchify rays for rendering due to the GPU memory limit. You can change the rendering resolution via the variable NUM_RAYS in utils/config_tools.py
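The batchified rendering described above can be sketched as follows. This is a minimal illustration with assumed names (`render_chunk`, `render_image` are stand-ins, not the repo's functions): rendering all H*W rays of a view at once would exhaust GPU memory, so rays are split into chunks and the per-chunk outputs are concatenated.

```python
NUM_RAYS = 90000  # chunk size; the --batch 90000 flag above plays this role

def render_chunk(rays):
    # stand-in for the real volume renderer: one depth value per ray,
    # here just the Euclidean norm of a dummy ray direction
    return [sum(c * c for c in ray) ** 0.5 for ray in rays]

def render_image(rays, num_rays=NUM_RAYS):
    """Render all rays of one view in memory-friendly chunks."""
    depth = []
    for i in range(0, len(rays), num_rays):
        depth.extend(render_chunk(rays[i:i + num_rays]))
    return depth

# one 450x800 view, one (dummy) ray per pixel
rays = [(0.0, 0.0, 1.0)] * (450 * 800)
depth = render_image(rays, num_rays=4096)
assert len(depth) == 450 * 800
```

A larger chunk size renders faster but needs more GPU memory, which is why lowering NUM_RAYS (or the evaluation resolution) trades speed for memory.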

Instructions for running on more datasets are detailed in Run and Eval.

Visualization

Follow detailed instructions in Visualization.

Related Projects

Our code is based on TPVFormer and PointOcc.

Also thanks to these excellent open-source repos: SurroundOcc, OccFormer, and BEVFormer.

Citation

If you find this project helpful, please consider citing the following paper:

@article{huang2023self,
    title={SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction},
    author={Huang, Yuanhui and Zheng, Wenzhao and Zhang, Borui and Zhou, Jie and Lu, Jiwen},
    journal={arXiv preprint arXiv:2311.12754},
    year={2023}
}