PyTorch implementation of our ICCV2021 paper:
StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation
Boying Li, Yuan Huang, Zeyu Liu, Danping Zou, Wenxian Yu
(* Equal Contribution)
Please consider citing our paper in your publications if the project helps your research.
@inproceedings{structdepth,
title={StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation},
author={Li, Boying and Huang, Yuan and Liu, Zeyu and Zou, Danping and Yu, Wenxian},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
year={2021}
}
The Python and PyTorch versions we use:
python=3.6
pytorch=1.7.1=py3.6_cuda10.1.243_cudnn7.6.3_0
Step 1: Create a virtual environment
conda create -n struct_depth python=3.6
conda activate struct_depth
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
Step 2: Download the modified scikit-image package, in which the input parameters of the Felzenszwalb algorithm have been changed to accommodate our method, then build and install it.
unzip scikit-image-0.17.2.zip
cd scikit-image-0.17.2
python setup.py build_ext -i
pip install -e .
Step 3: Install the other packages
pip install -r requirements.txt
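After installation, it may be worth confirming that the environment matches the versions listed above. The following is a minimal sketch and not part of the repository: the helper name `version_matches` is hypothetical, and it assumes that each installed package exposes a `__version__` string, possibly with a local build suffix such as `+cu101`.

```python
import importlib
import sys

# Expected versions, taken from the setup instructions above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def version_matches(installed, expected):
    """Return True if `installed` equals `expected`, ignoring a local suffix such as '+cu101'."""
    return installed.split("+", 1)[0] == expected


def report():
    """Print the Python version and compare each installed package against EXPECTED."""
    print("python", "%d.%d" % sys.version_info[:2], "(expected 3.6)")
    for name, expected in EXPECTED.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(name, "is not installed (expected %s)" % expected)
            continue
        installed = getattr(module, "__version__", "unknown")
        status = "ok" if version_matches(installed, expected) else "mismatch"
        print(name, installed, "(expected %s)" % expected, status)


if __name__ == "__main__":
    report()
```

Running the script inside the `struct_depth` environment prints one line per package, so a mismatch is visible before training or evaluation starts.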
Please download the pretrained models and unzip them to MODEL_PATH.
python inference_single_image.py --image_path=/path/to/image --load_weights_folder=MODEL_PATH
Please download the test dataset.
It is recommended to unpack all test data and training data into the same data path and then modify the DATA_PATH when running a training or evaluation script.
Modify the evaluation script eval.sh to evaluate depth and surface normals (norm) separately on NYUv2, InteriorNet, and ScanNet. For example:
python evaluation/nyuv2_eval_norm.py \
--data_path DATA_PATH \
--load_weights_folder MODEL_PATH
The raw NYU dataset is about 400 GB and contains 590 videos. You can download the raw data from the dataset's official page.
python extract_vps_nyu.py --data_path DATA_PATH --output_dir VPS_PATH --failed_list TMP_LIST --thresh 60
If you need to train with random flipping, run the main direction extraction script in advance on the images both before and after the flip (add --flip), and record the failed examples. These can be skipped by referring to the code in datasets/nyu_datases.py.
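The skipping step can be sketched as follows. This is an illustration only, not the code in the repository: it assumes that the failed list is a text file with one sample per line, written in the same format as the lines of the training split, which may not match what `extract_vps_nyu.py` actually writes. The function names `filter_split` and `filter_split_file` are hypothetical.

```python
def filter_split(split_lines, failed_lines):
    """Return the split entries that do not appear in the failed list, keeping their order."""
    failed = {line.strip() for line in failed_lines if line.strip()}
    return [line.strip() for line in split_lines
            if line.strip() and line.strip() not in failed]


def filter_split_file(split_path, failed_path, output_path):
    """Read a split file and a failed list, write the filtered split, and return its length."""
    with open(split_path) as f:
        split_lines = f.readlines()
    with open(failed_path) as f:
        failed_lines = f.readlines()
    kept = filter_split(split_lines, failed_lines)
    with open(output_path, "w") as f:
        f.write("\n".join(kept) + "\n")
    return len(kept)
```

When training with random flipping, one option is to concatenate the failed lists from the original and the flipped runs before filtering, so that a sample is dropped if extraction failed in either case.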
Modify the training script train.sh to set the paths or to use different training settings.
python train.py \
--data_path DATA_PATH \
--val_path DATA_PATH \
--train_split ./splits/nyu_train_0_10_20_30_40_21483-exceptfailed-21465.txt \
--vps_path VPS_PATH \
--log_dir LOG_PATH \
--model_name 1 \
--batch_size 32 \
--num_epochs 50 \
--start_epoch 0 \
--using_disp2seg \
--using_normloss \
--load_weights_folder PRETRAIN_MODEL_PATH \
--lambda_planar_reg 0.1 \
--lambda_norm_reg 0.05 \
--planar_thresh 200
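When several settings are varied, it can help to generate the command instead of editing the line continuations by hand. Below is a minimal sketch under stated assumptions: it only uses the flags shown in the command above, the helper `build_train_command` is hypothetical and not part of the repository, and the upper-case placeholders such as DATA_PATH must be replaced with real paths.

```python
import shlex


def build_train_command(options, switches):
    """Build an argument list for train.py from valued options and on/off switches."""
    args = ["python", "train.py"]
    for name, value in options.items():
        args += ["--" + name, str(value)]
    for name in switches:
        args.append("--" + name)
    return args


# Values copied from the command above; the upper-case names are placeholders.
options = {
    "data_path": "DATA_PATH",
    "val_path": "DATA_PATH",
    "train_split": "./splits/nyu_train_0_10_20_30_40_21483-exceptfailed-21465.txt",
    "vps_path": "VPS_PATH",
    "log_dir": "LOG_PATH",
    "model_name": 1,
    "batch_size": 32,
    "num_epochs": 50,
    "start_epoch": 0,
    "load_weights_folder": "PRETRAIN_MODEL_PATH",
    "lambda_planar_reg": 0.1,
    "lambda_norm_reg": 0.05,
    "planar_thresh": 200,
}
switches = ["using_disp2seg", "using_normloss"]

command = build_train_command(options, switches)
# Print a shell-safe, single-line version of the command.
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can also be passed to `subprocess.run(command, check=True)`, which sidesteps shell line-continuation mistakes altogether.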
We borrowed a lot of code from scikit-image, monodepth2, P2Net, and LEGO. Thanks for their excellent work!