aim-uofa / AdelaiDepth

This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape'. They aim to solve monocular depth estimation and 3D scene reconstruction from a single image.
Creative Commons Zero v1.0 Universal

Question on regression loss #3

Closed mzy97 closed 3 years ago

mzy97 commented 3 years ago

What is the difference between the "image-level normalized regression loss" and the shift- and scale-invariant loss in MiDaS, which is based on the per-sample median? And what is the benefit of your proposed loss compared to MiDaS's?

YvanYin commented 3 years ago

Our image-level normalized regression loss only needs to normalize the ground-truth depth. By contrast, MiDaS has to process both the ground-truth and the predicted depth, either by normalization or by least-squares fitting. According to our ablation studies, our loss achieves better performance.
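To make the distinction concrete, here is a minimal NumPy sketch of the three variants mentioned above. It is an illustration under assumptions, not the code from either repository: the function names are hypothetical, the ground truth is normalized with a plain mean and standard deviation, and the MiDaS-style normalization uses the median and the mean absolute deviation from it. The actual implementations may trim outliers, mask invalid pixels, or add further terms.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero on constant depth maps


def normalize_mean_std(depth):
    # Assumed image-level normalization: zero mean, unit standard deviation.
    return (depth - depth.mean()) / (depth.std() + EPS)


def normalize_median_mad(depth):
    # Assumed MiDaS-style normalization: shift by the median,
    # scale by the mean absolute deviation from the median.
    shift = np.median(depth)
    scale = np.mean(np.abs(depth - shift)) + EPS
    return (depth - shift) / scale


def gt_only_normalized_loss(pred, gt):
    # Only the ground truth is normalized; the prediction is used as-is.
    return np.mean(np.abs(pred - normalize_mean_std(gt)))


def both_normalized_loss(pred, gt):
    # Both the prediction and the ground truth are normalized.
    return np.mean(np.abs(normalize_median_mad(pred) - normalize_median_mad(gt)))


def least_squares_aligned_loss(pred, gt):
    # Fit scale and shift so that scale * pred + shift best matches gt
    # in the least-squares sense, then measure the remaining error.
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return np.mean(np.abs(scale * pred + shift - gt))


rng = np.random.default_rng(0)
gt = rng.uniform(1.0, 10.0, size=(4, 4))   # synthetic ground-truth depth
pred = rng.normal(size=(4, 4))             # synthetic network output

print(gt_only_normalized_loss(pred, gt))
print(both_normalized_loss(pred, gt))
print(least_squares_aligned_loss(pred, gt))
```

In this sketch, all three losses are unchanged when the ground truth is rescaled or shifted. The two variants that also process the prediction are additionally invariant to rescaling or shifting the prediction, whereas the ground-truth-only variant is not, so the network output has to match the normalized range directly.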

mzy97 commented 3 years ago

OK, do you have any idea why your proposed loss is better than MiDaS's? The paper seems to give only a numerical comparison, without an explanation. Thank you.

YvanYin commented 3 years ago

The MiDaS loss has to adjust both the ground truth and the prediction to a similar numerical range. I conjecture that such an adjustment may not be that stable.