IEBins: Iterative Elastic Bins for Monocular Depth Estimation

Shuwei Shao¹ Zhongcai Pei¹ Xingming Wu¹ Zhong Liu¹ Weihai Chen² Zhengguo Li³

¹Beihang University, ²Anhui University, ³A*STAR

• NeurIPS 2023 •

[![KITTI Benchmark](https://img.shields.io/badge/KITTI%20Benchmark-2nd%20among%20all%20at%20submission%20time-blue)](https://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction) [![Hugging Space Badge](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue.svg)](https://huggingface.co/spaces/umuthopeyildirim/IEBins-Depth-Estimation)

## Abstract

We propose iterative elastic bins (IEBins), a novel concept for classification-regression-based monocular depth estimation (MDE). IEBins searches for high-quality depth by progressively narrowing the search range over multiple stages: each stage performs a finer-grained depth search within the target bin selected by the previous stage. To alleviate possible error accumulation during the iterative process, we replace the original target bin with a novel elastic target bin whose width is adjusted according to the depth uncertainty.
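
The staged search described above can be illustrated with a toy, single-pixel sketch in plain Python. This is not the authors' implementation: the function names, the `score_fn` stand-in for the network's per-bin scores, and the rule that the next search range spans one standard deviation on either side of the current estimate (with a floor of half a bin width) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def iterative_elastic_search(score_fn, d_min, d_max, num_bins=16, num_stages=3):
    """Toy single-pixel depth search (hypothetical helper, not from the repo).

    score_fn(center) is a stand-in for the network: it returns a score for a
    candidate depth, and higher means more likely.
    """
    lo, hi = d_min, d_max
    history = []
    depth = None
    for _ in range(num_stages):
        # Split the current search range into uniform bins.
        width = (hi - lo) / num_bins
        centers = [lo + (i + 0.5) * width for i in range(num_bins)]
        probs = softmax([score_fn(c) for c in centers])

        # Regression step: depth is the probability-weighted bin center.
        depth = sum(p * c for p, c in zip(probs, centers))

        # Depth uncertainty: standard deviation of the bin distribution.
        var = sum(p * (c - depth) ** 2 for p, c in zip(probs, centers))
        sigma = math.sqrt(var)
        history.append((depth, lo, hi))

        # Elastic target bin (assumed rule): centered on the estimate, wider
        # when the uncertainty is larger, never narrower than one bin.
        half = max(sigma, width / 2)
        lo = max(d_min, depth - half)
        hi = min(d_max, depth + half)
    return depth, history


def toy_score(candidate, true_depth=3.7):
    """Made-up score that peaks at an assumed true depth of 3.7."""
    return -2.0 * (candidate - true_depth) ** 2


if __name__ == "__main__":
    estimate, stages = iterative_elastic_search(toy_score, 0.1, 10.0)
    for stage, (d, lo, hi) in enumerate(stages, start=1):
        print(f"stage {stage}: depth={d:.3f}, range=[{lo:.3f}, {hi:.3f}]")
```

Each stage reuses the same number of bins over a smaller range, so the bin width shrinks while a confident estimate narrows the range more aggressively than an uncertain one.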

## Installation

## Datasets

Prepare the KITTI and NYUv2 datasets following the instructions linked here, and download the SUN RGB-D dataset from here. Then set the data paths in the config files to your dataset locations.

## Training

First download the pretrained encoder backbone from here, then set the pretrain path in the config files to its location. To train the KITTI_Official model, download the pretrained encoder backbone from here, which is provided by MIM.

## Evaluation

## Qualitative Depth and Point Cloud Results

You can download the qualitative depth results of IEBins, NDDepth, NeWCRFs, PixelFormer, AdaBins and BTS on the test sets of NYUv2 and KITTI_Eigen from here. The qualitative point cloud results of the same methods on the NYUv2 test set are available from here.

## Gradio Demo

## Models

## Citation

## Contact

## Acknowledgement

| Model | Abs Rel | Sq Rel | RMSE | a1 | a2 | a3 | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NYUv2 (Swin-L) | 0.087 | 0.040 | 0.314 | 0.936 | 0.992 | 0.998 | [Google] [Baidu] |
| NYUv2 (Swin-T) | 0.108 | 0.061 | 0.375 | 0.893 | 0.984 | 0.996 | [Google] [Baidu] |
| KITTI_Eigen (Swin-L) | 0.050 | 0.142 | 2.011 | 0.978 | 0.998 | 0.999 | [Google] [Baidu] |
| KITTI_Eigen (Swin-T) | 0.056 | 0.169 | 2.205 | 0.970 | 0.996 | 0.999 | [Google] [Baidu] |

| Model | SILog | Abs Rel | Sq Rel | RMSE | a1 | a2 | a3 | Link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KITTI_Official (Swinv2-L) | 7.48 | 5.20 | 0.79 | 2.34 | 0.974 | 0.996 | 0.999 | [Google] |
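
The column names in the results tables refer to standard monocular depth metrics. As a rough guide to how they are usually computed, here is a minimal plain-Python sketch. The function name `depth_metrics` is hypothetical, the inputs are assumed to contain only valid pixels with positive depth, and the factor of 100 on SILog follows common public evaluation code. The masking, cropping, and depth caps used by this repository's evaluation scripts are not shown here and may differ.

```python
import math


def depth_metrics(pred, gt):
    """Standard depth metrics over paired lists of positive depths.

    Hypothetical helper for illustration; assumes invalid pixels were
    already filtered out of both sequences.
    """
    n = len(gt)
    pairs = list(zip(pred, gt))

    # Relative and squared relative errors, normalized by ground truth.
    abs_rel = sum(abs(g - p) / g for p, g in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for p, g in pairs) / n)

    # Threshold accuracies: share of pixels with max(p/g, g/p) < 1.25^k.
    ratios = [max(p / g, g / p) for p, g in pairs]
    a1 = sum(r < 1.25 for r in ratios) / n
    a2 = sum(r < 1.25 ** 2 for r in ratios) / n
    a3 = sum(r < 1.25 ** 3 for r in ratios) / n

    # Scale-invariant log error, scaled by 100 (assumed convention).
    log_diff = [math.log(p) - math.log(g) for p, g in pairs]
    mean_sq = sum(d * d for d in log_diff) / n
    mean = sum(log_diff) / n
    silog = math.sqrt(max(0.0, mean_sq - mean ** 2)) * 100

    return {
        "silog": silog, "abs_rel": abs_rel, "sq_rel": sq_rel,
        "rmse": rmse, "a1": a1, "a2": a2, "a3": a3,
    }


if __name__ == "__main__":
    # Made-up depths in meters, only to show the call.
    print(depth_metrics(pred=[2.0, 4.0], gt=[1.0, 4.0]))
```

Lower is better for SILog, Abs Rel, Sq Rel and RMSE, while a1, a2 and a3 are accuracies where higher is better.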

Our code is based on the implementations of NeWCRFs and BTS. We thank the authors for their excellent work.
