Optimal image size and aspect ratio for metric depth estimation

Thank you all for the awesome code and research! I am interested in metric depth estimation on KITTI. The metric_depth/evaluate.py script drastically changes the aspect ratio of the KITTI input images: the output depth map is 518x392 pixels, whereas the input images are 1216x352. This behavior is enabled by passing mode="eval" to the get_config() call. Evaluating the model with the changed aspect ratio results in higher accuracy:

{'a1': 0.987, 'a2': 0.998, 'a3': 0.999, 'abs_rel': 0.04, 'rmse': 1.824, 'log_10': 0.017, 'rmse_log': 0.062, 'silog': 5.806, 'sq_rel': 0.107}

compared to running depth estimation on full resolution and original aspect ratio (mode="infer"):

{'a1': 0.982, 'a2': 0.997, 'a3': 0.999, 'abs_rel': 0.056, 'rmse': 2.153, 'log_10': 0.025, 'rmse_log': 0.078, 'silog': 6.881, 'sq_rel': 0.152}
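To make the gap between the two runs easier to read, here is a minimal sketch that computes the relative change of each metric, using only the numbers reported above. The names `eval_mode`, `infer_mode`, and `relative_change` are illustrative and do not come from the repository.

```python
# Metrics copied verbatim from the two runs reported above.
eval_mode = {'a1': 0.987, 'a2': 0.998, 'a3': 0.999, 'abs_rel': 0.04,
             'rmse': 1.824, 'log_10': 0.017, 'rmse_log': 0.062,
             'silog': 5.806, 'sq_rel': 0.107}
infer_mode = {'a1': 0.982, 'a2': 0.997, 'a3': 0.999, 'abs_rel': 0.056,
              'rmse': 2.153, 'log_10': 0.025, 'rmse_log': 0.078,
              'silog': 6.881, 'sq_rel': 0.152}

# a1/a2/a3 are threshold accuracies (higher is better);
# all remaining entries are error metrics (lower is better).
HIGHER_IS_BETTER = {'a1', 'a2', 'a3'}


def relative_change(resized, full_res):
    """Percent change of the full-resolution run relative to the resized run."""
    return {k: 100.0 * (full_res[k] - resized[k]) / resized[k] for k in resized}


changes = relative_change(eval_mode, infer_mode)
for name, pct in changes.items():
    direction = "higher is better" if name in HIGHER_IS_BETTER else "lower is better"
    print(f"{name:>9}: {pct:+7.2f}%  ({direction})")
```

With these numbers, every error metric gets worse in mode="infer" and no accuracy metric improves.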

Could you confirm what the optimal image size and aspect ratio are for metric depth estimation?
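To quantify how strongly the resize distorts the image, here is a small sketch that derives the aspect ratios and per-axis scale factors from the two sizes quoted above (1216x352 input, 518x392 output). It is plain arithmetic on those numbers and does not reproduce the repository's actual resize code; all names are illustrative.

```python
def aspect_ratio(width, height):
    """Width divided by height."""
    return width / height


# (width, height) pairs taken from the sizes quoted in this issue.
kitti_size = (1216, 352)    # original KITTI input
resized_size = (518, 392)   # depth map size in mode="eval"

# Per-axis scale factors applied by the resize.
scale_x = resized_size[0] / kitti_size[0]
scale_y = resized_size[1] / kitti_size[1]

# Ratio of vertical to horizontal scaling; 1.0 would mean no distortion.
anisotropy = scale_y / scale_x

print(f"original aspect ratio: {aspect_ratio(*kitti_size):.3f}")
print(f"resized aspect ratio:  {aspect_ratio(*resized_size):.3f}")
print(f"scale_x={scale_x:.3f}, scale_y={scale_y:.3f}, anisotropy={anisotropy:.3f}")
```

The image is shrunk horizontally and slightly enlarged vertically, so objects appear roughly 2.6 times taller relative to their width than in the original frame.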

Thanks, Eldar.

LiheYoung / Depth-Anything
