ldkong1205 / RoboDepth

[NeurIPS 2023] RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions
https://ldkong.com/RoboDepth


RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions

Lingdong Kong1,2   Shaoyuan Xie3   Hanjiang Hu4   Lai Xing Ng2,5   Benoit R. Cottereau2,6   Wei Tsang Ooi1,2
1National University of Singapore    2CNRS@CREATE    3University of California, Irvine    4Carnegie Mellon University    5Institute for Infocomm Research, A*STAR    6CNRS

About

RoboDepth is a comprehensive evaluation benchmark designed for probing the robustness of monocular depth estimation algorithms. It includes 18 common corruption types, spanning adverse weather and lighting conditions, sensor failure and movement, and noise introduced during data processing.
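Corruptions of this kind are typically modeled, following the ImageNet-C convention, as a function of an input image and a severity level from 1 to 5. The sketch below is illustrative only (the `gaussian_noise` helper and its per-severity noise scales are assumptions, not the benchmark's exact implementation):

```python
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int = 1) -> np.ndarray:
    """Apply ImageNet-C-style Gaussian noise at one of five severity levels.

    `image` is an H x W x 3 uint8 RGB array; a higher severity selects a
    larger noise scale. The scale values here are illustrative placeholders.
    """
    scale = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]
    x = image.astype(np.float64) / 255.0          # work in [0, 1]
    x = np.clip(x + np.random.normal(size=x.shape, scale=scale), 0.0, 1.0)
    return (x * 255.0).astype(np.uint8)           # back to uint8 RGB
```

The same `(image, severity) -> image` signature extends naturally to the other corruption families (blur, weather, digital artifacts).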

Updates

Outline

Installation

Kindly refer to INSTALL.md for the installation details.

Data Preparation

Our datasets are hosted by OpenDataLab.


OpenDataLab is a pioneering open data platform for the era of large AI models, making datasets accessible. Through OpenDataLab, researchers can obtain free, formatted datasets across a wide range of fields.

The RoboDepth Benchmark

Kindly refer to DATA_PREPARE.md for the details on preparing the KITTI, [KITTI-C](), NYUDepth2, [NYUDepth2-C](), Cityscapes, Foggy-Cityscapes, nuScenes, and [nuScenes-C]() datasets.

Competition @ ICRA 2023

Kindly refer to this page for the details on preparing the training and evaluation data associated with the 1st RoboDepth Competition at the 40th IEEE International Conference on Robotics and Automation (ICRA 2023).

Getting Started

Kindly refer to GET_STARTED.md to learn more about the usage of this codebase.

Model Zoo

:oncoming_automobile: - Outdoor Depth Estimation

**Self-Supervised Depth Estimation**

- [x] **[MonoDepth2](https://arxiv.org/abs/1806.01260), ICCV 2019.** [**`[Code]`**](https://github.com/nianticlabs/monodepth2)
- [x] **[DepthHints](https://arxiv.org/abs/1909.09051), ICCV 2019.** [**`[Code]`**](https://github.com/nianticlabs/depth-hints)
- [x] **[MaskOcc](https://arxiv.org/abs/1908.11112), arXiv 2019.** [**`[Code]`**](https://github.com/schelv/monodepth2)
- [x] **[DNet](https://arxiv.org/abs/2004.05560), IROS 2020.** [**`[Code]`**](https://github.com/TJ-IPLab/DNet)
- [ ] **[SGDepth](https://arxiv.org/abs/2007.06936), ECCV 2020.** [**`[Code]`**](https://github.com/ifnspaml/SGDepth)
- [x] **[CADepth](https://arxiv.org/abs/2112.13047), 3DV 2021.** [**`[Code]`**](https://github.com/kamiLight/CADepth-master)
- [ ] **[TC-Depth](https://arxiv.org/abs/2110.08192), 3DV 2021.** [**`[Code]`**](https://github.com/DaoyiG/TC-Depth)
- [x] **[HR-Depth](https://arxiv.org/abs/2012.07356), AAAI 2021.** [**`[Code]`**](https://github.com/shawLyu/HR-Depth)
- [ ] **[Insta-DM](https://arxiv.org/abs/2102.02629), AAAI 2021.** [**`[Code]`**](https://github.com/SeokjuLee/Insta-DM)
- [x] **[DIFFNet](https://arxiv.org/abs/2110.09482), BMVC 2021.** [**`[Code]`**](https://github.com/brandleyzhou/DIFFNet)
- [x] **[ManyDepth](https://arxiv.org/abs/2104.14540), CVPR 2021.** [**`[Code]`**](https://github.com/nianticlabs/manydepth)
- [ ] **[EPCDepth](https://arxiv.org/abs/2109.12484), ICCV 2021.** [**`[Code]`**](https://github.com/prstrive/EPCDepth)
- [x] **[FSRE-Depth](http://arxiv.org/abs/2108.08829), ICCV 2021.** [**`[Code]`**](https://github.com/hyBlue/FSRE-Depth)
- [ ] **[R-MSFM](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_R-MSFM_Recurrent_Multi-Scale_Feature_Modulation_for_Monocular_Depth_Estimating_ICCV_2021_paper.pdf), ICCV 2021.** [**`[Code]`**](https://github.com/jsczzzk/R-MSFM)
- [x] **[MonoViT](https://arxiv.org/abs/2208.03543), 3DV 2022.** [**`[Code]`**](https://github.com/zxcqlf/MonoViT)
- [ ] **[DepthFormer](https://arxiv.org/abs/2204.07616), CVPR 2022.** [**`[Code]`**](https://github.com/TRI-ML/vidar)
- [x] **[DynaDepth](https://arxiv.org/abs/2207.04680), ECCV 2022.** [**`[Code]`**](https://github.com/SenZHANG-GitHub/ekf-imu-depth)
- [ ] **[DynamicDepth](https://arxiv.org/abs/2203.15174), ECCV 2022.** [**`[Code]`**](https://github.com/AutoAILab/DynamicDepth)
- [x] **[RA-Depth](https://arxiv.org/abs/2207.11984), ECCV 2022.** [**`[Code]`**](https://github.com/hmhemu/RA-Depth)
- [ ] **[Dyna-DM](https://arxiv.org/abs/2206.03799), arXiv 2022.** [**`[Code]`**](https://github.com/kieran514/dyna-dm)
- [x] **[TriDepth](https://arxiv.org/abs/2210.00411), WACV 2023.** [**`[Code]`**](https://github.com/xingyuuchen/tri-depth)
- [ ] **[FreqAwareDepth](https://arxiv.org/abs/2210.05479), WACV 2023.** [**`[Code]`**](https://github.com/xingyuuchen/freq-aware-depth)
- [x] **[Lite-Mono](https://arxiv.org/abs/2211.13202), CVPR 2023.** [**`[Code]`**](https://github.com/noahzn/Lite-Mono)
**Self-Supervised Multi-View Depth Estimation**

- [x] **[MonoDepth2](https://arxiv.org/abs/1806.01260), ICCV 2019.** [**`[Code]`**](https://github.com/nianticlabs/monodepth2)
- [x] **[SurroundDepth](https://arxiv.org/abs/2204.03636), CoRL 2022.** [**`[Code]`**](https://github.com/weiyithu/SurroundDepth)

**Fully-Supervised Depth Estimation**

- [ ] **[AdaBins](https://arxiv.org/abs/2011.14141), CVPR 2021.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/adabins)
- [ ] **[NeWCRFs](https://arxiv.org/abs/2203.01502), CVPR 2022.** [**`[Code]`**](https://github.com/aliyun/NeWCRFs)
- [ ] **[DepthFormer](https://arxiv.org/abs/2203.14211), arXiv 2022.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/depthformer)
- [ ] **[GLPDepth](https://arxiv.org/abs/2201.07436), arXiv 2022.** [**`[Code]`**](https://github.com/vinvino02/GLPDepth)

**Semi-Supervised Depth Estimation**

- [ ] **[MaskingDepth](https://arxiv.org/abs/2212.10806), arXiv 2022.** [**`[Code]`**](https://github.com/KU-CVLAB/MaskingDepth)

:house: - Indoor Depth Estimation

**Self-Supervised Depth Estimation**

- [ ] **[P2Net](https://arxiv.org/abs/2007.07696), ECCV 2020.** [**`[Code]`**](https://github.com/svip-lab/Indoor-SfMLearner)
- [ ] **[EPCDepth](https://arxiv.org/abs/2109.12484), ICCV 2021.** [**`[Code]`**](https://github.com/prstrive/EPCDepth)

**Fully-Supervised Depth Estimation**

- [x] **[BTS](https://arxiv.org/abs/1907.10326), arXiv 2019.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/bts)
- [x] **[AdaBins](https://arxiv.org/abs/2011.14141), CVPR 2021.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/adabins)
- [x] **[DPT](https://arxiv.org/abs/2103.13413), ICCV 2021.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/dpt)
- [x] **[SimIPU](https://arxiv.org/abs/2112.04680), AAAI 2022.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/simipu)
- [ ] **[NeWCRFs](https://arxiv.org/abs/2203.01502), CVPR 2022.** [**`[Code]`**](https://github.com/aliyun/NeWCRFs)
- [ ] **[P3Depth](https://arxiv.org/abs/2204.02091), CVPR 2022.** [**`[Code]`**](https://github.com/SysCV/P3Depth)
- [x] **[DepthFormer](https://arxiv.org/abs/2203.14211), arXiv 2022.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/depthformer)
- [ ] **[GLPDepth](https://arxiv.org/abs/2201.07436), arXiv 2022.** [**`[Code]`**](https://github.com/vinvino02/GLPDepth)
- [x] **[BinsFormer](https://arxiv.org/abs/2204.00987), arXiv 2022.** [**`[Code]`**](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/binsformer)

**Semi-Supervised Depth Estimation**

- [ ] **[MaskingDepth](https://arxiv.org/abs/2212.10806), arXiv 2022.** [**`[Code]`**](https://github.com/KU-CVLAB/MaskingDepth)

Benchmark

:bar_chart: Metrics: The following metrics are consistently used in our benchmark:
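For orientation, the error and accuracy measures standard in monocular depth evaluation (Abs Rel, Sq Rel, RMSE, and the δ < 1.25 accuracy) can be computed as in the sketch below; the `depth_metrics` helper is illustrative, not the benchmark's evaluation code:

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Standard monocular depth-estimation metrics over valid pixels.

    `pred` and `gt` are depth maps of the same shape; pixels with
    non-positive ground truth are excluded from evaluation.
    """
    mask = gt > 0
    pred, gt = pred[mask], gt[mask]
    thresh = np.maximum(gt / pred, pred / gt)   # per-pixel max ratio
    return {
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "sq_rel": float(np.mean((pred - gt) ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
        "a1": float(np.mean(thresh < 1.25)),    # delta < 1.25 accuracy
    }
```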

:gear: Notation: Symbol :star: denotes the baseline model adopted in mCE calculation.
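Assuming ImageNet-C-style conventions (a sketch, not the benchmark's exact formulas): mCE normalizes each corruption's error by the starred baseline's error on that corruption and averages over corruptions, while mRR measures how much of the clean-image performance a model retains under corruption. With per-corruption errors already averaged over severity levels:

```python
import numpy as np

def mce(model_err: np.ndarray, baseline_err: np.ndarray) -> float:
    """mean Corruption Error (%): per-corruption error ratio vs. the
    baseline model, averaged over all corruption types."""
    return float(np.mean(model_err / baseline_err) * 100.0)

def mrr(model_err: np.ndarray, model_clean: float) -> float:
    """mean Resilience Rate (%): accuracy retained under corruption,
    relative to the model's own clean-image accuracy (errors in [0, 1))."""
    return float(np.mean((1.0 - model_err) / (1.0 - model_clean)) * 100.0)
```

By construction, the baseline model scores an mCE of exactly 100%, and lower mCE / higher mRR indicate better robustness.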

KITTI-C

| Model | Modality | mCE (%) | mRR (%) | Clean | Bright | Dark | Fog | Frost | Snow | Contrast | Defocus | Glass | Motion | Zoom | Elastic | Quant | Gaussian | Impulse | Shot | ISO | Pixelate | JPEG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [MonoDepth2R18]():star: | Mono | 100.00 | 84.46 | 0.119 | 0.130 | 0.280 | 0.155 | 0.277 | 0.511 | 0.187 | 0.244 | 0.242 | 0.216 | 0.201 | 0.129 | 0.193 | 0.384 | 0.389 | 0.340 | 0.388 | 0.145 | 0.196 |
| [MonoDepth2R18+nopt]() | Mono | 119.75 | 82.50 | 0.144 | 0.183 | 0.343 | 0.311 | 0.312 | 0.399 | 0.416 | 0.254 | 0.232 | 0.199 | 0.207 | 0.148 | 0.212 | 0.441 | 0.452 | 0.402 | 0.453 | 0.153 | 0.171 |
| [MonoDepth2R18+HR]() | Mono | 106.06 | 82.44 | 0.114 | 0.129 | 0.376 | 0.155 | 0.271 | 0.582 | 0.214 | 0.393 | 0.257 | 0.230 | 0.232 | 0.123 | 0.215 | 0.326 | 0.352 | 0.317 | 0.344 | 0.138 | 0.198 |
| [MonoDepth2R50]() | Mono | 113.43 | 80.59 | 0.117 | 0.127 | 0.294 | 0.155 | 0.287 | 0.492 | 0.233 | 0.427 | 0.392 | 0.277 | 0.208 | 0.130 | 0.198 | 0.409 | 0.403 | 0.368 | 0.425 | 0.155 | 0.211 |
| [MaskOcc]() | Mono | 104.05 | 82.97 | 0.117 | 0.130 | 0.285 | 0.154 | 0.283 | 0.492 | 0.200 | 0.318 | 0.295 | 0.228 | 0.201 | 0.129 | 0.184 | 0.403 | 0.410 | 0.364 | 0.417 | 0.143 | 0.177 |
| [DNetR18]() | Mono | 104.71 | 83.34 | 0.118 | 0.128 | 0.264 | 0.156 | 0.317 | 0.504 | 0.209 | 0.348 | 0.320 | 0.242 | 0.215 | 0.131 | 0.189 | 0.362 | 0.366 | 0.326 | 0.357 | 0.145 | 0.190 |
| [CADepth]() | Mono | 110.11 | 80.07 | 0.108 | 0.121 | 0.300 | 0.142 | 0.324 | 0.529 | 0.193 | 0.356 | 0.347 | 0.285 | 0.208 | 0.121 | 0.192 | 0.423 | 0.433 | 0.383 | 0.448 | 0.144 | 0.195 |
| [HR-Depth]() | Mono | 103.73 | 82.93 | 0.112 | 0.121 | 0.289 | 0.151 | 0.279 | 0.481 | 0.213 | 0.356 | 0.300 | 0.263 | 0.224 | 0.124 | 0.187 | 0.363 | 0.373 | 0.336 | 0.374 | 0.135 | 0.176 |
| [DIFFNetHRNet]() | Mono | 94.96 | 85.41 | 0.102 | 0.111 | 0.222 | 0.131 | 0.199 | 0.352 | 0.161 | 0.513 | 0.330 | 0.280 | 0.197 | 0.114 | 0.165 | 0.292 | 0.266 | 0.255 | 0.270 | 0.135 | 0.202 |
| [ManyDepthsingle]() | Mono | 105.41 | 83.11 | 0.123 | 0.135 | 0.274 | 0.169 | 0.288 | 0.479 | 0.227 | 0.254 | 0.279 | 0.211 | 0.194 | 0.134 | 0.189 | 0.430 | 0.450 | 0.387 | 0.452 | 0.147 | 0.182 |
| [FSRE-Depth]() | Mono | 99.05 | 83.86 | 0.109 | 0.128 | 0.261 | 0.139 | 0.237 | 0.393 | 0.170 | 0.291 | 0.273 | 0.214 | 0.185 | 0.119 | 0.179 | 0.400 | 0.414 | 0.370 | 0.407 | 0.147 | 0.224 |
| [MonoViTMPViT]() | Mono | 79.33 | 89.15 | 0.099 | 0.106 | 0.243 | 0.116 | 0.213 | 0.275 | 0.119 | 0.180 | 0.204 | 0.163 | 0.179 | 0.118 | 0.146 | 0.310 | 0.293 | 0.271 | 0.290 | 0.162 | 0.154 |
| [MonoViTMPViT+HR]() | Mono | 74.95 | 89.72 | 0.094 | 0.102 | 0.238 | 0.114 | 0.225 | 0.269 | 0.117 | 0.145 | 0.171 | 0.145 | 0.184 | 0.108 | 0.145 | 0.302 | 0.277 | 0.259 | 0.285 | 0.135 | 0.148 |
| [DynaDepthR18]() | Mono | 110.38 | 81.50 | 0.117 | 0.128 | 0.289 | 0.156 | 0.289 | 0.509 | 0.208 | 0.501 | 0.347 | 0.305 | 0.207 | 0.127 | 0.186 | 0.379 | 0.379 | 0.336 | 0.379 | 0.141 | 0.180 |
| [DynaDepthR50]() | Mono | 119.99 | 77.98 | 0.113 | 0.128 | 0.298 | 0.152 | 0.324 | 0.549 | 0.201 | 0.532 | 0.454 | 0.318 | 0.218 | 0.125 | 0.197 | 0.418 | 0.437 | 0.382 | 0.448 | 0.153 | 0.216 |
| [RA-DepthHRNet]() | Mono | 112.73 | 78.79 | 0.096 | 0.113 | 0.314 | 0.127 | 0.239 | 0.413 | 0.165 | 0.499 | 0.368 | 0.378 | 0.214 | 0.122 | 0.178 | 0.423 | 0.403 | 0.402 | 0.455 | 0.175 | 0.192 |
| [TriDepthsingle]() | Mono | 109.26 | 81.56 | 0.117 | 0.131 | 0.300 | 0.188 | 0.338 | 0.498 | 0.265 | 0.268 | 0.301 | 0.212 | 0.190 | 0.126 | 0.199 | 0.418 | 0.438 | 0.380 | 0.438 | 0.142 | 0.205 |
| [Lite-MonoTiny]() | Mono | 92.92 | 86.69 | 0.115 | 0.127 | 0.257 | 0.157 | 0.225 | 0.354 | 0.191 | 0.257 | 0.248 | 0.198 | 0.186 | 0.127 | 0.159 | 0.358 | 0.342 | 0.336 | 0.360 | 0.147 | 0.161 |
| [Lite-MonoTiny+HR]() | Mono | 86.71 | 87.63 | 0.106 | 0.119 | 0.227 | 0.139 | 0.282 | 0.370 | 0.166 | 0.216 | 0.201 | 0.190 | 0.202 | 0.116 | 0.146 | 0.320 | 0.291 | 0.286 | 0.312 | 0.148 | 0.167 |
| [Lite-MonoSmall]() | Mono | 100.34 | 84.67 | 0.115 | 0.127 | 0.251 | 0.162 | 0.251 | 0.430 | 0.238 | 0.353 | 0.282 | 0.246 | 0.204 | 0.128 | 0.161 | 0.350 | 0.336 | 0.319 | 0.356 | 0.154 | 0.164 |
| [Lite-MonoSmall+HR]() | Mono | 89.90 | 86.05 | 0.105 | 0.119 | 0.263 | 0.139 | 0.263 | 0.436 | 0.167 | 0.188 | 0.181 | 0.193 | 0.214 | 0.117 | 0.147 | 0.366 | 0.354 | 0.327 | 0.355 | 0.152 | 0.157 |
| [Lite-MonoBase]() | Mono | 93.16 | 85.99 | 0.110 | 0.119 | 0.259 | 0.144 | 0.245 | 0.384 | 0.177 | 0.224 | 0.237 | 0.221 | 0.196 | 0.129 | 0.175 | 0.361 | 0.340 | 0.334 | 0.363 | 0.151 | 0.165 |
| [Lite-MonoBase+HR]() | Mono | 89.85 | 85.80 | 0.103 | 0.115 | 0.256 | 0.135 | 0.258 | 0.486 | 0.164 | 0.220 | 0.194 | 0.213 | 0.205 | 0.114 | 0.154 | 0.340 | 0.327 | 0.321 | 0.344 | 0.145 | 0.156 |
| [Lite-MonoLarge]() | Mono | 90.75 | 85.54 | 0.102 | 0.110 | 0.227 | 0.126 | 0.255 | 0.433 | 0.149 | 0.222 | 0.225 | 0.220 | 0.192 | 0.121 | 0.148 | 0.363 | 0.348 | 0.329 | 0.362 | 0.160 | 0.184 |
| [Lite-MonoLarge+HR]() | Mono | 92.01 | 83.90 | 0.096 | 0.112 | 0.241 | 0.122 | 0.280 | 0.482 | 0.141 | 0.193 | 0.194 | 0.213 | 0.222 | 0.108 | 0.140 | 0.403 | 0.404 | 0.365 | 0.407 | 0.139 | 0.182 |
| [MonoDepth2R18]() | Stereo | 117.69 | 79.05 | 0.123 | 0.133 | 0.348 | 0.161 | 0.305 | 0.515 | 0.234 | 0.390 | 0.332 | 0.264 | 0.209 | 0.135 | 0.200 | 0.492 | 0.509 | 0.463 | 0.493 | 0.144 | 0.194 |
| [MonoDepth2R18+nopt]() | Stereo | 128.98 | 79.20 | 0.150 | 0.181 | 0.422 | 0.292 | 0.352 | 0.435 | 0.342 | 0.266 | 0.232 | 0.217 | 0.229 | 0.156 | 0.236 | 0.539 | 0.564 | 0.521 | 0.556 | 0.164 | 0.178 |
| [MonoDepth2R18+HR]() | Stereo | 111.46 | 81.65 | 0.117 | 0.132 | 0.285 | 0.167 | 0.356 | 0.529 | 0.238 | 0.432 | 0.312 | 0.279 | 0.246 | 0.130 | 0.206 | 0.343 | 0.343 | 0.322 | 0.344 | 0.150 | 0.209 |
| [DepthHints]() | Stereo | 111.41 | 80.08 | 0.113 | 0.124 | 0.310 | 0.137 | 0.321 | 0.515 | 0.164 | 0.350 | 0.410 | 0.263 | 0.196 | 0.130 | 0.192 | 0.440 | 0.447 | 0.412 | 0.455 | 0.157 | 0.192 |
| [DepthHintsHR]() | Stereo | 112.02 | 79.53 | 0.104 | 0.122 | 0.282 | 0.141 | 0.317 | 0.480 | 0.180 | 0.459 | 0.363 | 0.320 | 0.262 | 0.118 | 0.183 | 0.397 | 0.421 | 0.380 | 0.424 | 0.141 | 0.183 |
| [DepthHintsHR+nopt]() | Stereo | 141.61 | 73.18 | 0.134 | 0.173 | 0.476 | 0.301 | 0.374 | 0.463 | 0.393 | 0.357 | 0.289 | 0.241 | 0.231 | 0.142 | 0.247 | 0.613 | 0.658 | 0.599 | 0.692 | 0.152 | 0.191 |
| [MonoDepth2R18]() | M+S | 124.31 | 75.36 | 0.116 | 0.127 | 0.404 | 0.150 | 0.295 | 0.536 | 0.199 | 0.447 | 0.346 | 0.283 | 0.204 | 0.128 | 0.203 | 0.577 | 0.605 | 0.561 | 0.629 | 0.136 | 0.179 |
| [MonoDepth2R18+nopt]() | M+S | 136.25 | 76.72 | 0.146 | 0.193 | 0.460 | 0.328 | 0.421 | 0.428 | 0.440 | 0.228 | 0.221 | 0.216 | 0.230 | 0.153 | 0.229 | 0.570 | 0.596 | 0.549 | 0.606 | 0.161 | 0.177 |
| [MonoDepth2R18+HR]() | M+S | 106.06 | 82.44 | 0.114 | 0.129 | 0.376 | 0.155 | 0.271 | 0.582 | 0.214 | 0.393 | 0.257 | 0.230 | 0.232 | 0.123 | 0.215 | 0.326 | 0.352 | 0.317 | 0.344 | 0.138 | 0.198 |
| [CADepth]() | M+S | 118.29 | 76.68 | 0.110 | 0.123 | 0.357 | 0.137 | 0.311 | 0.556 | 0.169 | 0.338 | 0.412 | 0.260 | 0.193 | 0.126 | 0.186 | 0.546 | 0.559 | 0.524 | 0.582 | 0.145 | 0.192 |
| [MonoViTMPViT]() | M+S | 75.39 | 90.39 | 0.098 | 0.104 | 0.245 | 0.122 | 0.213 | 0.215 | 0.131 | 0.179 | 0.184 | 0.161 | 0.168 | 0.112 | 0.147 | 0.277 | 0.257 | 0.242 | 0.260 | 0.147 | 0.144 |
| [MonoViTMPViT+HR]() | M+S | 70.79 | 90.67 | 0.090 | 0.097 | 0.221 | 0.113 | 0.217 | 0.253 | 0.113 | 0.146 | 0.159 | 0.144 | 0.175 | 0.098 | 0.138 | 0.267 | 0.246 | 0.236 | 0.246 | 0.135 | 0.145 |

NYUDepth2-C

| Model | mCE (%) | mRR (%) | Clean | Bright | Dark | Contrast | Defocus | Glass | Motion | Zoom | Elastic | Quant | Gaussian | Impulse | Shot | ISO | Pixelate | JPEG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [BTSR50]() | 122.78 | 80.63 | 0.122 | 0.149 | 0.269 | 0.265 | 0.337 | 0.262 | 0.231 | 0.372 | 0.182 | 0.180 | 0.442 | 0.512 | 0.392 | 0.474 | 0.139 | 0.175 |
| [AdaBinsR50]() | 134.69 | 81.62 | 0.158 | 0.179 | 0.293 | 0.289 | 0.339 | 0.280 | 0.245 | 0.390 | 0.204 | 0.216 | 0.458 | 0.519 | 0.401 | 0.481 | 0.186 | 0.211 |
| [AdaBinsEfficientB5]():star: | 100.00 | 85.83 | 0.112 | 0.132 | 0.194 | 0.212 | 0.235 | 0.206 | 0.184 | 0.384 | 0.153 | 0.151 | 0.390 | 0.374 | 0.294 | 0.380 | 0.124 | 0.154 |
| [DPTViT-B]() | 83.22 | 95.25 | 0.136 | 0.135 | 0.182 | 0.180 | 0.154 | 0.166 | 0.155 | 0.232 | 0.139 | 0.165 | 0.200 | 0.213 | 0.191 | 0.199 | 0.171 | 0.174 |
| [SimIPUR50+no_pt]() | 200.17 | 92.52 | 0.372 | 0.388 | 0.427 | 0.448 | 0.416 | 0.401 | 0.400 | 0.433 | 0.381 | 0.391 | 0.465 | 0.471 | 0.450 | 0.461 | 0.375 | 0.378 |
| [SimIPUR50+imagenet]() | 163.06 | 85.01 | 0.244 | 0.269 | 0.370 | 0.376 | 0.377 | 0.337 | 0.324 | 0.422 | 0.306 | 0.289 | 0.445 | 0.463 | 0.414 | 0.449 | 0.247 | 0.272 |
| [SimIPUR50+kitti]() | 173.78 | 91.64 | 0.312 | 0.326 | 0.373 | 0.406 | 0.360 | 0.333 | 0.335 | 0.386 | 0.316 | 0.333 | 0.432 | 0.442 | 0.422 | 0.443 | 0.314 | 0.322 |
| [SimIPUR50+waymo]() | 159.46 | 85.73 | 0.243 | 0.269 | 0.348 | 0.398 | 0.380 | 0.327 | 0.313 | 0.405 | 0.256 | 0.287 | 0.439 | 0.461 | 0.416 | 0.455 | 0.246 | 0.265 |
| [DepthFormerSwinT_w7_1k]() | 106.34 | 87.25 | 0.125 | 0.147 | 0.279 | 0.235 | 0.220 | 0.260 | 0.191 | 0.300 | 0.175 | 0.192 | 0.294 | 0.321 | 0.289 | 0.305 | 0.161 | 0.179 |
| [DepthFormerSwinT_w7_22k]() | 63.47 | 94.19 | 0.086 | 0.099 | 0.150 | 0.123 | 0.127 | 0.172 | 0.119 | 0.237 | 0.112 | 0.119 | 0.159 | 0.156 | 0.148 | 0.157 | 0.101 | 0.108 |

Idiosyncrasy Analysis

For more detailed benchmarking results and to access the pretrained weights used in robustness evaluation, kindly refer to RESULT.md.

Create Corruption Sets

You can create your own "RoboDepth" corruption sets! Follow the instructions in CREATE.md.
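A corruption set is conceptually a sweep over corruption types and severity levels applied to every frame of a clean dataset. The sketch below shows that sweep in miniature; the `build_corruption_set` helper and its in-memory layout are illustrative assumptions, not the toolkit's actual interface:

```python
def build_corruption_set(images: dict, corruptions: dict,
                         severities=(1, 2, 3, 4, 5)) -> dict:
    """Create a corrupted copy of each image for every (corruption, severity).

    `images` maps an image id to an array; `corruptions` maps a corruption
    name to a function `f(image, severity) -> image`. The result is keyed as
    result[corruption][severity][image_id], mirroring the usual
    <corruption>/<severity>/<frame> directory layout of a "-C" benchmark.
    """
    result = {}
    for name, corrupt in corruptions.items():
        result[name] = {}
        for sev in severities:
            result[name][sev] = {k: corrupt(img, sev) for k, img in images.items()}
    return result
```

In practice one would iterate over image files on disk and write each corrupted copy into the corresponding subdirectory instead of returning a dict.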

TODO List

Citation

If you find this work helpful, please consider citing our papers:

@inproceedings{kong2023robodepth,
  title = {RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions},
  author = {Kong, Lingdong and Xie, Shaoyuan and Hu, Hanjiang and Ng, Lai Xing and Cottereau, Benoit R. and Ooi, Wei Tsang},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2023},
}
@article{kong2023robodepth_challenge,
  title = {The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation},
  author = {Kong, Lingdong and Niu, Yaru and Xie, Shaoyuan and Hu, Hanjiang and Ng, Lai Xing and Cottereau, Benoit and Zhao, Ding and Zhang, Liangjun and Wang, Hesheng and Ooi, Wei Tsang and Zhu, Ruijie and Song, Ziyang and Liu, Li and Zhang, Tianzhu and Yu, Jun and Jing, Mohan and Li, Pengwei and Qi, Xiaohua and Jin, Cheng and Chen, Yingfeng and Hou, Jie and Zhang, Jie and Kan, Zhen and Lin, Qiang and Peng, Liang and Li, Minglei and Xu, Di and Yang, Changpeng and Yao, Yuanqi and Wu, Gang and Kuai, Jian and Liu, Xianming and Jiang, Junjun and Huang, Jiamian and Li, Baojun and Chen, Jiale and Zhang, Shuang and Ao, Sun and Li, Zhenyu and Chen, Runze and Luo, Haiyong and Zhao, Fang and Yu, Jingze},
  journal = {arXiv preprint arXiv:2307.15061}, 
  year = {2023},
}
@misc{kong2023robodepth_benchmark,
  title = {The RoboDepth Benchmark for Robust Out-of-Distribution Depth Estimation under Corruptions},
  author = {Kong, Lingdong and Xie, Shaoyuan and Hu, Hanjiang and Cottereau, Benoit and Ng, Lai Xing and Ooi, Wei Tsang},
  howpublished = {\url{https://github.com/ldkong1205/RoboDepth}}, 
  year = {2023},
}

License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Sponsor

We thank Baidu Research for their support of the RoboDepth Challenge.


Acknowledgements

This project is supported by DesCartes, a CNRS@CREATE program on Intelligent Modeling for Decision-Making in Critical Urban Systems.