Jiawei-Yao0812 / NDCScene

Official implementation for the ICCV 2023 paper "NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space"

# NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space

Official PyTorch implementation for the ICCV 2023 paper.

Jiawei Yao, [Chuming Li](https://scholar.google.com.sg/citations?user=ZfB7vEcAAAAJ&hl=en), Keqiang Sun*, Yingjie Cai, Hao Li, Wanli Ouyang, Hongsheng Li

* equal contribution






## Installation

1. Create a conda environment:

       $ conda create -y -n ndcscene python=3.7
       $ conda activate ndcscene

2. This code is compatible with Python 3.7, PyTorch 1.7.1, and CUDA 10.2. Install PyTorch:

       $ conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch

3. Install the additional dependencies:

       $ cd NDCScene/
       $ pip install -r requirements.txt

4. Install tbb:

       $ conda install -c bioconda tbb=2020.2

5. Downgrade torchmetrics to 0.6.0:

       $ pip install torchmetrics==0.6.0

6. Finally, install NDCScene:

       $ pip install -e ./
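
After installation, it can help to confirm that the pinned versions are the ones actually importable in the `ndcscene` environment. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the expected versions are copied from the steps above.

```python
import importlib
import sys

# Versions named in the installation steps above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2", "torchmetrics": "0.6.0"}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs from the expected one to what was found."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Some builds carry a local suffix such as "1.7.1+cu102",
        # so strip anything after "+" before comparing.
        if found is None or str(found).split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, found in find_mismatches(EXPECTED).items():
        print("{}: expected {}, found {}".format(name, EXPECTED[name], found))
```

If the script prints nothing after the Python version line, the three packages match the versions listed above. It does not check the CUDA toolkit version.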



## Datasets

### SemanticKITTI

1. Download:

   - The Semantic Scene Completion dataset v1.1 (SemanticKITTI voxel data (700 MB)) from the SemanticKITTI website.
   - The KITTI Odometry Benchmark calibration data (Download odometry data set (calibration files, 1 MB)) and the RGB images (Download odometry data set (color, 65 GB)) from the KITTI Odometry website.

2. Create a folder to store the preprocessed SemanticKITTI data at /path/to/kitti/preprocess/folder.

3. Store the paths in environment variables for faster access (note: the folder 'dataset' is inside /path/to/semantic_kitti):

       $ export KITTI_PREPROCESS=/path/to/kitti/preprocess/folder
       $ export KITTI_ROOT=/path/to/semantic_kitti

4. Preprocess the data to generate labels at a lower scale, which are used to compute the ground-truth relation matrices:

       $ cd NDCScene/
       $ python ndcscene/data/semantic_kitti/preprocess.py kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS


### NYUv2

1. Download the NYUv2 dataset.

2. Create a folder to store the preprocessed NYUv2 data at /path/to/NYU/preprocess/folder.

3. Store the paths in environment variables for faster access:

       $ export NYU_PREPROCESS=/path/to/NYU/preprocess/folder
       $ export NYU_ROOT=/path/to/NYU/depthbin

4. Preprocess the data to generate labels at a lower scale, which are used to compute the ground-truth relation matrices:

       $ cd NDCScene/
       $ python ndcscene/data/NYU/preprocess.py NYU_root=$NYU_ROOT NYU_preprocess_root=$NYU_PREPROCESS
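
Most later commands depend on the dataset environment variables being set and pointing at real directories, so a quick check before preprocessing or training can save a failed run. The sketch below is an illustration only: `check_dataset_paths` is a hypothetical helper, not something shipped with the repository, and the only layout rule it checks is the note above that the `dataset` folder sits inside the SemanticKITTI root.

```python
import os

# Environment variables exported in the dataset steps above.
DATASET_VARS = ["KITTI_ROOT", "KITTI_PREPROCESS", "NYU_ROOT", "NYU_PREPROCESS"]


def check_dataset_paths(environ, names=DATASET_VARS):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name in names:
        path = environ.get(name)
        if not path:
            problems.append("{} is not set".format(name))
        elif not os.path.isdir(path):
            problems.append("{} points to a missing directory: {}".format(name, path))
    # The SemanticKITTI step notes that the 'dataset' folder is inside KITTI_ROOT.
    kitti_root = environ.get("KITTI_ROOT")
    if "KITTI_ROOT" in names and kitti_root and os.path.isdir(kitti_root):
        if not os.path.isdir(os.path.join(kitti_root, "dataset")):
            problems.append("KITTI_ROOT has no 'dataset' subfolder")
    return problems


if __name__ == "__main__":
    for problem in check_dataset_paths(os.environ):
        print(problem)
```

If only one dataset is used, pass a shorter `names` list, for example `check_dataset_paths(os.environ, names=["NYU_ROOT", "NYU_PREPROCESS"])`.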

## Pretrained models

Download NDCScene pretrained models on NYUv2 & SemanticKITTI.


## Training

To train NDCScene, follow the steps for the relevant dataset.


### SemanticKITTI

1. Create a folder to store the training logs at /path/to/kitti/logdir.

2. Store its path in an environment variable:

       $ export KITTI_LOG=/path/to/kitti/logdir

3. Train NDCScene on SemanticKITTI using 4 GPUs with a batch_size of 4 (1 item per GPU):

       $ cd NDCScene/
       $ python ndcscene/scripts/train_ndcscene.py \
           dataset=kitti \
           enable_log=true \
           kitti_root=$KITTI_ROOT \
           kitti_logdir=$KITTI_LOG \
           n_gpus=4 batch_size=4


### NYUv2

1. Create a folder to store the training logs at /path/to/NYU/logdir.

2. Store its path in an environment variable:

       $ export NYU_LOG=/path/to/NYU/logdir

3. Train NDCScene on NYUv2 using 2 GPUs with a batch_size of 4 (2 items per GPU):

       $ cd NDCScene/
       $ python ndcscene/scripts/train_ndcscene.py \
           dataset=NYU \
           NYU_root=$NYU_ROOT \
           NYU_preprocess_root=$NYU_PREPROCESS \
           logdir=$NYU_LOG \
           n_gpus=2 batch_size=4
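
The training steps describe `batch_size` as the total across all GPUs (4 items over 4 GPUs gives 1 per GPU, 4 items over 2 GPUs gives 2 per GPU). The arithmetic can be sketched as follows when adapting the commands to a different number of GPUs. The function `items_per_gpu` is hypothetical, and rejecting uneven splits is an assumption of this sketch rather than documented behaviour of the training script.

```python
def items_per_gpu(batch_size, n_gpus):
    """Split a total batch evenly across GPUs, as described in the training steps."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    if batch_size % n_gpus != 0:
        # Assumption of this sketch: only even splits are considered valid.
        raise ValueError("batch_size must be divisible by n_gpus for an even split")
    return batch_size // n_gpus


print(items_per_gpu(4, 4))  # SemanticKITTI command: 1 item per GPU
print(items_per_gpu(4, 2))  # NYUv2 command: 2 items per GPU
```

Under the same reading, keeping `batch_size=4` with `n_gpus=1` would place all 4 items on a single GPU, which needs correspondingly more memory.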

## Evaluating 

### SemanticKITTI

To evaluate NDCScene on SemanticKITTI validation set, type:

    $ cd NDCScene/
    $ python ndcscene/scripts/eval_ndcscene.py \
        dataset=kitti \
        kitti_root=$KITTI_ROOT \
        kitti_preprocess_root=$KITTI_PREPROCESS \
        n_gpus=1 batch_size=1

### NYUv2

To evaluate NDCScene on NYUv2 test set, type:

    $ cd NDCScene/
    $ python ndcscene/scripts/eval_ndcscene.py \
        dataset=NYU \
        NYU_root=$NYU_ROOT \
        NYU_preprocess_root=$NYU_PREPROCESS \
        n_gpus=1 batch_size=1
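
The two evaluation commands differ only in their `key=value` arguments, so they are easy to assemble programmatically, for example when evaluating from a wrapper script. The sketch below only builds and prints the command line. `build_command` is a hypothetical helper, and the fallback paths are the placeholders used earlier in this README, not real locations.

```python
import os
import shlex


def build_command(script, overrides):
    """Return an argv list of the form: python <script> key=value ..."""
    argv = ["python", script]
    for key, value in overrides.items():
        argv.append("{}={}".format(key, value))
    return argv


cmd = build_command(
    "ndcscene/scripts/eval_ndcscene.py",
    {
        "dataset": "NYU",
        # Fall back to the README placeholders when the variables are unset.
        "NYU_root": os.environ.get("NYU_ROOT", "/path/to/NYU/depthbin"),
        "NYU_preprocess_root": os.environ.get("NYU_PREPROCESS", "/path/to/NYU/preprocess/folder"),
        "n_gpus": 1,
        "batch_size": 1,
    },
)
# Quote each part so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, cwd="NDCScene/")` to launch the evaluation from the repository root.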

## Inference

Create a folder at **/path/to/ndcscene/output** to store the NDCScene outputs, and store its path in an environment variable:

    $ export MONOSCENE_OUTPUT=/path/to/ndcscene/output

### NYUv2

To generate the predictions on the NYUv2 test set, type:

    $ cd NDCScene/
    $ python ndcscene/scripts/generate_output.py \
        +output_path=$MONOSCENE_OUTPUT \
        dataset=NYU \
        NYU_root=$NYU_ROOT \
        NYU_preprocess_root=$NYU_PREPROCESS \
        n_gpus=1 batch_size=1

### Semantic KITTI

To generate the predictions on the Semantic KITTI validation set, type:

    $ cd NDCScene/
    $ python ndcscene/scripts/generate_output.py \
        +output_path=$MONOSCENE_OUTPUT \
        dataset=kitti \
        kitti_root=$KITTI_ROOT \
        kitti_preprocess_root=$KITTI_PREPROCESS \
        n_gpus=1 batch_size=1
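
Before visualizing, it can be useful to look inside one generated file. The sketch below assumes the files written under `$MONOSCENE_OUTPUT` are standard pickle files, which the `.pkl` extension in the visualization commands suggests but this README does not state. It makes no assumption about which keys they contain. `describe_output` is a hypothetical helper. If the file stores NumPy arrays, NumPy must be installed for loading to succeed.

```python
import pickle


def describe_output(path):
    """Load one prediction file and summarise its top-level structure."""
    # Only unpickle files you generated yourself: pickle can execute code on load.
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        # Record each key with its array shape when available, else its type name.
        summary["entries"] = {
            key: getattr(value, "shape", type(value).__name__)
            for key, value in data.items()
        }
    return summary
```

For example, `print(describe_output("/path/to/output/file.pkl"))` prints the top-level type and, for a dictionary, the shape or type of each entry.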

## Visualization

We use mayavi to visualize the predictions. Please install mayavi following the [official installation instructions](https://docs.enthought.com/mayavi/mayavi/installation.html). Then use the commands below to visualize the outputs on the respective datasets.

The visualization scripts also need a few extra packages, which can be installed with:

    $ pip install tqdm
    $ pip install omegaconf
    $ pip install hydra-core

### NYUv2 

    $ cd NDCScene/
    $ python ndcscene/scripts/visualization/NYU_vis_pred.py +file=/path/to/output/file.pkl

### Semantic KITTI 

    $ cd NDCScene/
    $ python ndcscene/scripts/visualization/kitti_vis_pred.py +file=/path/to/output/file.pkl +dataset=kitti

## Citation

    @InProceedings{Yao_2023_ICCV,
        author    = {Yao, Jiawei and Li, Chuming and Sun, Keqiang and Cai, Yingjie and Li, Hao and Ouyang, Wanli and Li, Hongsheng},
        title     = {NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space},
        booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
        month     = {October},
        year      = {2023},
        pages     = {9455-9465}
    }


## Acknowledgement
This project is built on MonoScene. We thank the contributors of that project for building such an excellent codebase and repository. Please refer to the MonoScene repo (https://github.com/astra-vision/MonoScene) for more documentation and details.