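Lingdong Kong<sup>1,2</sup>, Shaoyuan Xie<sup>3</sup>, Hanjiang Hu<sup>4</sup>, Lai Xing Ng<sup>2,5</sup>, Benoit R. Cottereau<sup>2,6</sup>, Wei Tsang Ooi<sup>1,2</sup>

<sup>1</sup>National University of Singapore, <sup>2</sup>CNRS@CREATE, <sup>3</sup>University of California, Irvine, <sup>4</sup>Carnegie Mellon University, <sup>5</sup>Institute for Infocomm Research, A*STAR, <sup>6</sup>CNRS
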
RoboDepth is a comprehensive evaluation benchmark designed for probing the robustness of monocular depth estimation algorithms. It includes 18 common corruption types, covering weather and lighting conditions, sensor failure and movement, and noise introduced during data processing.
- nuScenes, nuScenes-Night, Cityscapes, and Foggy-Cityscapes. See here for more details.
- nuScenes-C benchmark for robust multi-view depth estimation. See here for more details.
- 226 teams registered at CodaLab, 66 of which made a total of 1137 valid submissions. More details are included in these slides. We thank our participants for their exceptional support! :heart:
- OpenSpaceAI, :2nd_place_medal: USTC-IAT-United, :3rd_place_medal: YYQ.
- USTCxNetEaseFuxi, :2nd_place_medal: OpenSpaceAI, :3rd_place_medal: GANCV.
- Scent-Depth, :medal_military: Ensemble, :medal_military: AIIA-RDepth.
- NYUDepth2-C dataset is ready to be downloaded! See here for more details.
- KITTI-C dataset is ready to be downloaded! See here for more details.

Kindly refer to INSTALL.md for the installation details.
Our datasets are hosted by OpenDataLab.
OpenDataLab is a pioneering open data platform for the large AI model era, making datasets accessible. By using OpenDataLab, researchers can obtain free, formatted datasets across various fields.
Kindly refer to DATA_PREPARE.md for details on preparing the KITTI, [KITTI-C](), NYUDepth2, [NYUDepth2-C](), Cityscapes, Foggy-Cityscapes, nuScenes, and [nuScenes-C]() datasets.
Kindly refer to this page for details on preparing the training and evaluation data associated with the 1st RoboDepth Competition at the 40th IEEE International Conference on Robotics and Automation (ICRA 2023).
Kindly refer to GET_STARTED.md to learn more about using this codebase.
:bar_chart: Metrics: The following metrics are consistently used in our benchmark:
Absolute Relative Difference (the lower the better): $\text{Abs Rel} = \frac{1}{|D|}\sum_{pred\in D}\frac{|gt - pred|}{gt}$ .
Accuracy (the higher the better): $\delta_t = \frac{1}{|D|}\left|\left\{pred\in D \mid \max\left(\frac{gt}{pred}, \frac{pred}{gt}\right) < 1.25^t\right\}\right| \times 100\%$ .
Depth Estimation Error (the lower the better):
The second Depth Estimation Error term ($\text{DEE}_2$) is adopted as the main indicator for evaluating model performance in our RoboDepth benchmark. Two further metrics, the mean corruption error (mCE) and the mean resilience rate (mRR), are used to compare robustness across models.
:gear: Notation: Symbol :star: denotes the baseline model adopted in mCE calculation.
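
As a rough illustration of the Abs Rel and $\delta_t$ definitions above, the following pure-Python sketch evaluates both on a handful of toy depth values. The function names `abs_rel` and `delta_accuracy` and the sample depths are hypothetical and are not taken from this codebase; a real evaluation would also mask out pixels without valid ground truth.

```python
def abs_rel(gt, pred):
    """Absolute relative difference, averaged over all depth values (lower is better)."""
    return sum(abs(g - p) / g for g, p in zip(gt, pred)) / len(gt)


def delta_accuracy(gt, pred, t=1):
    """Percentage of values with max(gt/pred, pred/gt) < 1.25**t (higher is better)."""
    threshold = 1.25 ** t
    hits = sum(1 for g, p in zip(gt, pred) if max(g / p, p / g) < threshold)
    return 100.0 * hits / len(gt)


# Toy ground-truth and predicted depths (illustrative values only).
gt = [10.0, 20.0, 40.0, 5.0]
pred = [11.0, 18.0, 60.0, 5.0]

print(round(abs_rel(gt, pred), 3))      # 0.175
print(delta_accuracy(gt, pred, t=1))    # 75.0
print(delta_accuracy(gt, pred, t=2))    # 100.0
```

Only the third prediction (60 against a ground truth of 40, a ratio of 1.5) fails the $1.25$ threshold, so $\delta_1$ is 75% while $\delta_2$ (threshold $1.5625$) is 100%.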
Model | Modality | mCE (%) | mRR (%) | Clean | Bright | Dark | Fog | Frost | Snow | Contrast | Defocus | Glass | Motion | Zoom | Elastic | Quant | Gaussian | Impulse | Shot | ISO | Pixelate | JPEG |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
[MonoDepth2R18]():star: | Mono | 100.00 | 84.46 | 0.119 | 0.130 | 0.280 | 0.155 | 0.277 | 0.511 | 0.187 | 0.244 | 0.242 | 0.216 | 0.201 | 0.129 | 0.193 | 0.384 | 0.389 | 0.340 | 0.388 | 0.145 | 0.196 |
[MonoDepth2R18+nopt]() | Mono | 119.75 | 82.50 | 0.144 | 0.183 | 0.343 | 0.311 | 0.312 | 0.399 | 0.416 | 0.254 | 0.232 | 0.199 | 0.207 | 0.148 | 0.212 | 0.441 | 0.452 | 0.402 | 0.453 | 0.153 | 0.171 |
[MonoDepth2R18+HR]() | Mono | 106.06 | 82.44 | 0.114 | 0.129 | 0.376 | 0.155 | 0.271 | 0.582 | 0.214 | 0.393 | 0.257 | 0.230 | 0.232 | 0.123 | 0.215 | 0.326 | 0.352 | 0.317 | 0.344 | 0.138 | 0.198 |
[MonoDepth2R50]() | Mono | 113.43 | 80.59 | 0.117 | 0.127 | 0.294 | 0.155 | 0.287 | 0.492 | 0.233 | 0.427 | 0.392 | 0.277 | 0.208 | 0.130 | 0.198 | 0.409 | 0.403 | 0.368 | 0.425 | 0.155 | 0.211 |
[MaskOcc]() | Mono | 104.05 | 82.97 | 0.117 | 0.130 | 0.285 | 0.154 | 0.283 | 0.492 | 0.200 | 0.318 | 0.295 | 0.228 | 0.201 | 0.129 | 0.184 | 0.403 | 0.410 | 0.364 | 0.417 | 0.143 | 0.177 |
[DNetR18]() | Mono | 104.71 | 83.34 | 0.118 | 0.128 | 0.264 | 0.156 | 0.317 | 0.504 | 0.209 | 0.348 | 0.320 | 0.242 | 0.215 | 0.131 | 0.189 | 0.362 | 0.366 | 0.326 | 0.357 | 0.145 | 0.190 |
[CADepth]() | Mono | 110.11 | 80.07 | 0.108 | 0.121 | 0.300 | 0.142 | 0.324 | 0.529 | 0.193 | 0.356 | 0.347 | 0.285 | 0.208 | 0.121 | 0.192 | 0.423 | 0.433 | 0.383 | 0.448 | 0.144 | 0.195 |
[HR-Depth]() | Mono | 103.73 | 82.93 | 0.112 | 0.121 | 0.289 | 0.151 | 0.279 | 0.481 | 0.213 | 0.356 | 0.300 | 0.263 | 0.224 | 0.124 | 0.187 | 0.363 | 0.373 | 0.336 | 0.374 | 0.135 | 0.176 |
[DIFFNetHRNet]() | Mono | 94.96 | 85.41 | 0.102 | 0.111 | 0.222 | 0.131 | 0.199 | 0.352 | 0.161 | 0.513 | 0.330 | 0.280 | 0.197 | 0.114 | 0.165 | 0.292 | 0.266 | 0.255 | 0.270 | 0.135 | 0.202 |
[ManyDepthsingle]() | Mono | 105.41 | 83.11 | 0.123 | 0.135 | 0.274 | 0.169 | 0.288 | 0.479 | 0.227 | 0.254 | 0.279 | 0.211 | 0.194 | 0.134 | 0.189 | 0.430 | 0.450 | 0.387 | 0.452 | 0.147 | 0.182 |
[FSRE-Depth]() | Mono | 99.05 | 83.86 | 0.109 | 0.128 | 0.261 | 0.139 | 0.237 | 0.393 | 0.170 | 0.291 | 0.273 | 0.214 | 0.185 | 0.119 | 0.179 | 0.400 | 0.414 | 0.370 | 0.407 | 0.147 | 0.224 |
[MonoViTMPViT]() | Mono | 79.33 | 89.15 | 0.099 | 0.106 | 0.243 | 0.116 | 0.213 | 0.275 | 0.119 | 0.180 | 0.204 | 0.163 | 0.179 | 0.118 | 0.146 | 0.310 | 0.293 | 0.271 | 0.290 | 0.162 | 0.154 |
[MonoViTMPViT+HR]() | Mono | 74.95 | 89.72 | 0.094 | 0.102 | 0.238 | 0.114 | 0.225 | 0.269 | 0.117 | 0.145 | 0.171 | 0.145 | 0.184 | 0.108 | 0.145 | 0.302 | 0.277 | 0.259 | 0.285 | 0.135 | 0.148 |
[DynaDepthR18]() | Mono | 110.38 | 81.50 | 0.117 | 0.128 | 0.289 | 0.156 | 0.289 | 0.509 | 0.208 | 0.501 | 0.347 | 0.305 | 0.207 | 0.127 | 0.186 | 0.379 | 0.379 | 0.336 | 0.379 | 0.141 | 0.180 |
[DynaDepthR50]() | Mono | 119.99 | 77.98 | 0.113 | 0.128 | 0.298 | 0.152 | 0.324 | 0.549 | 0.201 | 0.532 | 0.454 | 0.318 | 0.218 | 0.125 | 0.197 | 0.418 | 0.437 | 0.382 | 0.448 | 0.153 | 0.216 |
[RA-DepthHRNet]() | Mono | 112.73 | 78.79 | 0.096 | 0.113 | 0.314 | 0.127 | 0.239 | 0.413 | 0.165 | 0.499 | 0.368 | 0.378 | 0.214 | 0.122 | 0.178 | 0.423 | 0.403 | 0.402 | 0.455 | 0.175 | 0.192 |
[TriDepthsingle]() | Mono | 109.26 | 81.56 | 0.117 | 0.131 | 0.300 | 0.188 | 0.338 | 0.498 | 0.265 | 0.268 | 0.301 | 0.212 | 0.190 | 0.126 | 0.199 | 0.418 | 0.438 | 0.380 | 0.438 | 0.142 | 0.205 |
[Lite-MonoTiny]() | Mono | 92.92 | 86.69 | 0.115 | 0.127 | 0.257 | 0.157 | 0.225 | 0.354 | 0.191 | 0.257 | 0.248 | 0.198 | 0.186 | 0.127 | 0.159 | 0.358 | 0.342 | 0.336 | 0.360 | 0.147 | 0.161 |
[Lite-MonoTiny+HR]() | Mono | 86.71 | 87.63 | 0.106 | 0.119 | 0.227 | 0.139 | 0.282 | 0.370 | 0.166 | 0.216 | 0.201 | 0.190 | 0.202 | 0.116 | 0.146 | 0.320 | 0.291 | 0.286 | 0.312 | 0.148 | 0.167 |
[Lite-MonoSmall]() | Mono | 100.34 | 84.67 | 0.115 | 0.127 | 0.251 | 0.162 | 0.251 | 0.430 | 0.238 | 0.353 | 0.282 | 0.246 | 0.204 | 0.128 | 0.161 | 0.350 | 0.336 | 0.319 | 0.356 | 0.154 | 0.164 |
[Lite-MonoSmall+HR]() | Mono | 89.90 | 86.05 | 0.105 | 0.119 | 0.263 | 0.139 | 0.263 | 0.436 | 0.167 | 0.188 | 0.181 | 0.193 | 0.214 | 0.117 | 0.147 | 0.366 | 0.354 | 0.327 | 0.355 | 0.152 | 0.157 |
[Lite-MonoBase]() | Mono | 93.16 | 85.99 | 0.110 | 0.119 | 0.259 | 0.144 | 0.245 | 0.384 | 0.177 | 0.224 | 0.237 | 0.221 | 0.196 | 0.129 | 0.175 | 0.361 | 0.340 | 0.334 | 0.363 | 0.151 | 0.165 |
[Lite-MonoBase+HR]() | Mono | 89.85 | 85.80 | 0.103 | 0.115 | 0.256 | 0.135 | 0.258 | 0.486 | 0.164 | 0.220 | 0.194 | 0.213 | 0.205 | 0.114 | 0.154 | 0.340 | 0.327 | 0.321 | 0.344 | 0.145 | 0.156 |
[Lite-MonoLarge]() | Mono | 90.75 | 85.54 | 0.102 | 0.110 | 0.227 | 0.126 | 0.255 | 0.433 | 0.149 | 0.222 | 0.225 | 0.220 | 0.192 | 0.121 | 0.148 | 0.363 | 0.348 | 0.329 | 0.362 | 0.160 | 0.184 |
[Lite-MonoLarge+HR]() | Mono | 92.01 | 83.90 | 0.096 | 0.112 | 0.241 | 0.122 | 0.280 | 0.482 | 0.141 | 0.193 | 0.194 | 0.213 | 0.222 | 0.108 | 0.140 | 0.403 | 0.404 | 0.365 | 0.407 | 0.139 | 0.182 |
[MonoDepth2R18]() | Stereo | 117.69 | 79.05 | 0.123 | 0.133 | 0.348 | 0.161 | 0.305 | 0.515 | 0.234 | 0.390 | 0.332 | 0.264 | 0.209 | 0.135 | 0.200 | 0.492 | 0.509 | 0.463 | 0.493 | 0.144 | 0.194 |
[MonoDepth2R18+nopt]() | Stereo | 128.98 | 79.20 | 0.150 | 0.181 | 0.422 | 0.292 | 0.352 | 0.435 | 0.342 | 0.266 | 0.232 | 0.217 | 0.229 | 0.156 | 0.236 | 0.539 | 0.564 | 0.521 | 0.556 | 0.164 | 0.178 |
[MonoDepth2R18+HR]() | Stereo | 111.46 | 81.65 | 0.117 | 0.132 | 0.285 | 0.167 | 0.356 | 0.529 | 0.238 | 0.432 | 0.312 | 0.279 | 0.246 | 0.130 | 0.206 | 0.343 | 0.343 | 0.322 | 0.344 | 0.150 | 0.209 |
[DepthHints]() | Stereo | 111.41 | 80.08 | 0.113 | 0.124 | 0.310 | 0.137 | 0.321 | 0.515 | 0.164 | 0.350 | 0.410 | 0.263 | 0.196 | 0.130 | 0.192 | 0.440 | 0.447 | 0.412 | 0.455 | 0.157 | 0.192 |
[DepthHintsHR]() | Stereo | 112.02 | 79.53 | 0.104 | 0.122 | 0.282 | 0.141 | 0.317 | 0.480 | 0.180 | 0.459 | 0.363 | 0.320 | 0.262 | 0.118 | 0.183 | 0.397 | 0.421 | 0.380 | 0.424 | 0.141 | 0.183 |
[DepthHintsHR+nopt]() | Stereo | 141.61 | 73.18 | 0.134 | 0.173 | 0.476 | 0.301 | 0.374 | 0.463 | 0.393 | 0.357 | 0.289 | 0.241 | 0.231 | 0.142 | 0.247 | 0.613 | 0.658 | 0.599 | 0.692 | 0.152 | 0.191 |
[MonoDepth2R18]() | M+S | 124.31 | 75.36 | 0.116 | 0.127 | 0.404 | 0.150 | 0.295 | 0.536 | 0.199 | 0.447 | 0.346 | 0.283 | 0.204 | 0.128 | 0.203 | 0.577 | 0.605 | 0.561 | 0.629 | 0.136 | 0.179 |
[MonoDepth2R18+nopt]() | M+S | 136.25 | 76.72 | 0.146 | 0.193 | 0.460 | 0.328 | 0.421 | 0.428 | 0.440 | 0.228 | 0.221 | 0.216 | 0.230 | 0.153 | 0.229 | 0.570 | 0.596 | 0.549 | 0.606 | 0.161 | 0.177 |
[MonoDepth2R18+HR]() | M+S | 106.06 | 82.44 | 0.114 | 0.129 | 0.376 | 0.155 | 0.271 | 0.582 | 0.214 | 0.393 | 0.257 | 0.230 | 0.232 | 0.123 | 0.215 | 0.326 | 0.352 | 0.317 | 0.344 | 0.138 | 0.198 |
[CADepth]() | M+S | 118.29 | 76.68 | 0.110 | 0.123 | 0.357 | 0.137 | 0.311 | 0.556 | 0.169 | 0.338 | 0.412 | 0.260 | 0.193 | 0.126 | 0.186 | 0.546 | 0.559 | 0.524 | 0.582 | 0.145 | 0.192 |
[MonoViTMPViT]() | M+S | 75.39 | 90.39 | 0.098 | 0.104 | 0.245 | 0.122 | 0.213 | 0.215 | 0.131 | 0.179 | 0.184 | 0.161 | 0.168 | 0.112 | 0.147 | 0.277 | 0.257 | 0.242 | 0.260 | 0.147 | 0.144 |
[MonoViTMPViT+HR]() | M+S | 70.79 | 90.67 | 0.090 | 0.097 | 0.221 | 0.113 | 0.217 | 0.253 | 0.113 | 0.146 | 0.159 | 0.144 | 0.175 | 0.098 | 0.138 | 0.267 | 0.246 | 0.236 | 0.246 | 0.135 | 0.145 |
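
The mCE and mRR columns in the table above can be cross-checked from the per-corruption scores. The sketch below assumes that mCE is the mean, over the 18 corruption types, of the ratio between a model's score and the baseline's score, and that mRR is the mean ratio of (1 − corrupted score) to (1 − clean score). These definitions are assumptions rather than formulas quoted from this page, and the function names are hypothetical; under them, the Mono MonoDepth2R18 (baseline) and MonoDepth2R18+HR rows are reproduced to two decimals.

```python
# Per-corruption scores copied from the table above, in column order Bright ... JPEG.
BASELINE_CLEAN = 0.119  # MonoDepth2R18 (Mono), the :star: baseline
BASELINE = [0.130, 0.280, 0.155, 0.277, 0.511, 0.187, 0.244, 0.242, 0.216,
            0.201, 0.129, 0.193, 0.384, 0.389, 0.340, 0.388, 0.145, 0.196]
HR_CLEAN = 0.114        # MonoDepth2R18+HR (Mono)
HR = [0.129, 0.376, 0.155, 0.271, 0.582, 0.214, 0.393, 0.257, 0.230,
      0.232, 0.123, 0.215, 0.326, 0.352, 0.317, 0.344, 0.138, 0.198]


def mean_corruption_error(model, baseline):
    """Assumed mCE: average model/baseline score ratio, in percent."""
    ratios = [m / b for m, b in zip(model, baseline)]
    return 100.0 * sum(ratios) / len(ratios)


def mean_resilience_rate(model, clean):
    """Assumed mRR: average of (1 - corrupted) / (1 - clean), in percent."""
    rates = [(1.0 - m) / (1.0 - clean) for m in model]
    return 100.0 * sum(rates) / len(rates)


print(round(mean_corruption_error(HR, BASELINE), 2))             # 106.06
print(round(mean_resilience_rate(HR, HR_CLEAN), 2))              # 82.44
print(round(mean_resilience_rate(BASELINE, BASELINE_CLEAN), 2))  # 84.46
```

By construction the baseline's mCE is exactly 100%, which is why the :star: row reads 100.00.
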
Model | mCE (%) | mRR (%) | Clean | Bright | Dark | Contrast | Defocus | Glass | Motion | Zoom | Elastic | Quant | Gaussian | Impulse | Shot | ISO | Pixelate | JPEG |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
[BTSR50]() | 122.78 | 80.63 | 0.122 | 0.149 | 0.269 | 0.265 | 0.337 | 0.262 | 0.231 | 0.372 | 0.182 | 0.180 | 0.442 | 0.512 | 0.392 | 0.474 | 0.139 | 0.175 |
[AdaBinsR50]() | 134.69 | 81.62 | 0.158 | 0.179 | 0.293 | 0.289 | 0.339 | 0.280 | 0.245 | 0.390 | 0.204 | 0.216 | 0.458 | 0.519 | 0.401 | 0.481 | 0.186 | 0.211 |
[AdaBinsEfficientB5]():star: | 100.00 | 85.83 | 0.112 | 0.132 | 0.194 | 0.212 | 0.235 | 0.206 | 0.184 | 0.384 | 0.153 | 0.151 | 0.390 | 0.374 | 0.294 | 0.380 | 0.124 | 0.154 |
[DPTViT-B]() | 83.22 | 95.25 | 0.136 | 0.135 | 0.182 | 0.180 | 0.154 | 0.166 | 0.155 | 0.232 | 0.139 | 0.165 | 0.200 | 0.213 | 0.191 | 0.199 | 0.171 | 0.174 |
[SimIPUR50+no_pt]() | 200.17 | 92.52 | 0.372 | 0.388 | 0.427 | 0.448 | 0.416 | 0.401 | 0.400 | 0.433 | 0.381 | 0.391 | 0.465 | 0.471 | 0.450 | 0.461 | 0.375 | 0.378 |
[SimIPUR50+imagenet]() | 163.06 | 85.01 | 0.244 | 0.269 | 0.370 | 0.376 | 0.377 | 0.337 | 0.324 | 0.422 | 0.306 | 0.289 | 0.445 | 0.463 | 0.414 | 0.449 | 0.247 | 0.272 |
[SimIPUR50+kitti]() | 173.78 | 91.64 | 0.312 | 0.326 | 0.373 | 0.406 | 0.360 | 0.333 | 0.335 | 0.386 | 0.316 | 0.333 | 0.432 | 0.442 | 0.422 | 0.443 | 0.314 | 0.322 |
[SimIPUR50+waymo]() | 159.46 | 85.73 | 0.243 | 0.269 | 0.348 | 0.398 | 0.380 | 0.327 | 0.313 | 0.405 | 0.256 | 0.287 | 0.439 | 0.461 | 0.416 | 0.455 | 0.246 | 0.265 |
[DepthFormerSwinT_w7_1k]() | 106.34 | 87.25 | 0.125 | 0.147 | 0.279 | 0.235 | 0.220 | 0.260 | 0.191 | 0.300 | 0.175 | 0.192 | 0.294 | 0.321 | 0.289 | 0.305 | 0.161 | 0.179 |
[DepthFormerSwinT_w7_22k]() | 63.47 | 94.19 | 0.086 | 0.099 | 0.150 | 0.123 | 0.127 | 0.172 | 0.119 | 0.237 | 0.112 | 0.119 | 0.159 | 0.156 | 0.148 | 0.157 | 0.101 | 0.108 |
For more detailed benchmarking results and to access the pretrained weights used in robustness evaluation, kindly refer to RESULT.md.
You can create your own "RoboDepth" corruption sets by following the instructions in CREATE.md.
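
To give a feel for what a single corruption looks like, here is a minimal sketch of a Gaussian-noise corruption with discrete severity levels. It is not the procedure from CREATE.md: the function name `gaussian_noise`, the five severity levels, and the noise standard deviations are all assumptions chosen purely for illustration.

```python
import random

# Hypothetical noise standard deviations per severity level (illustrative values only).
SEVERITY_SIGMA = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.18, 5: 0.26}


def gaussian_noise(image, severity=1, seed=0):
    """Add zero-mean Gaussian noise to an image given as rows of floats in [0, 1].

    The result is clamped back to [0, 1]; a fixed seed makes the corruption repeatable.
    """
    rng = random.Random(seed)
    sigma = SEVERITY_SIGMA[severity]
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


# A 3x4 mid-gray toy image, corrupted at the assumed severity level 3.
clean = [[0.5] * 4 for _ in range(3)]
noisy = gaussian_noise(clean, severity=3, seed=0)
print(len(noisy), len(noisy[0]))  # 3 4
```

Seeding the generator keeps a corruption set reproducible, so the same corrupted images can be regenerated for every model being compared.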
If you find this work helpful, please kindly consider citing our papers:
@inproceedings{kong2023robodepth,
title = {RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions},
author = {Kong, Lingdong and Xie, Shaoyuan and Hu, Hanjiang and Ng, Lai Xing and Cottereau, Benoit R. and Ooi, Wei Tsang},
booktitle = {Advances in Neural Information Processing Systems},
year = {2023},
}
@article{kong2023robodepth_challenge,
title = {The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation},
author = {Kong, Lingdong and Niu, Yaru and Xie, Shaoyuan and Hu, Hanjiang and Ng, Lai Xing and Cottereau, Benoit and Zhao, Ding and Zhang, Liangjun and Wang, Hesheng and Ooi, Wei Tsang and Zhu, Ruijie and Song, Ziyang and Liu, Li and Zhang, Tianzhu and Yu, Jun and Jing, Mohan and Li, Pengwei and Qi, Xiaohua and Jin, Cheng and Chen, Yingfeng and Hou, Jie and Zhang, Jie and Kan, Zhen and Lin, Qiang and Peng, Liang and Li, Minglei and Xu, Di and Yang, Changpeng and Yao, Yuanqi and Wu, Gang and Kuai, Jian and Liu, Xianming and Jiang, Junjun and Huang, Jiamian and Li, Baojun and Chen, Jiale and Zhang, Shuang and Ao, Sun and Li, Zhenyu and Chen, Runze and Luo, Haiyong and Zhao, Fang and Yu, Jingze},
journal = {arXiv preprint arXiv:2307.15061},
year = {2023},
}
@misc{kong2023robodepth_benchmark,
title = {The RoboDepth Benchmark for Robust Out-of-Distribution Depth Estimation under Corruptions},
author = {Kong, Lingdong and Xie, Shaoyuan and Hu, Hanjiang and Cottereau, Benoit and Ng, Lai Xing and Ooi, Wei Tsang},
howpublished = {\url{https://github.com/ldkong1205/RoboDepth}},
year = {2023},
}
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
We thank Baidu Research for the support towards the RoboDepth Challenge.
This project is supported by DesCartes, a CNRS@CREATE program on Intelligent Modeling for Decision-Making in Critical Urban Systems.