Conversion from Disparity to Depth in Monocular Training

zzzzzyh111 commented 1 year ago

Hi, thanks for sharing this splendid work! I ran into a question while doing monocular training. In the code below:

https://github.com/nianticlabs/monodepth2/blob/b676244e5a1ca55564eb5d16ab521a48f823af31/layers.py#L16-L25

Depth is set to the inverse of the scaled disparity. However, I think the relationship between depth and disparity should be $Disparity = \frac{Baseline \times Focal\ Length}{Depth}$, and there is no "Baseline" in monocular training.

Hence, my question is how this "Baseline" is determined in monocular training. In other words, why can the depth value in monocular training be obtained as $1 / scaled\ disparity$?

Thanks in advance!

JerryPW commented 1 year ago

waiting for answers, too!

daniyar-niantic commented 1 year ago

Hi @zzzzzyh111, the disp in monodepth2 is only loosely related to stereo disparity. A better term would be "rescaled inverse depth". The network is trained to predict depth, and we chose to let it predict a value that can be converted to depth through the disp_to_depth function, so no baseline or focal length enters the conversion.
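
To make the "rescaled inverse depth" reading concrete, here is a minimal sketch of the conversion on plain Python floats. It follows the linked disp_to_depth in layers.py, which operates on tensors. The values min_depth=0.1 and max_depth=100.0 in the usage loop are assumptions chosen for illustration, not something stated in this thread.

```python
def disp_to_depth(disp, min_depth, max_depth):
    """Map a network output in [0, 1] to a depth in [min_depth, max_depth].

    `disp` is treated as a rescaled inverse depth, not a stereo disparity,
    so no baseline or focal length appears anywhere in the conversion.
    """
    min_disp = 1.0 / max_depth  # inverse depth of the farthest allowed point
    max_disp = 1.0 / min_depth  # inverse depth of the nearest allowed point
    # Linearly stretch the [0, 1] output onto [min_disp, max_disp].
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth


# Illustrative range (assumed values, see the note above).
for disp in (0.0, 0.5, 1.0):
    scaled_disp, depth = disp_to_depth(disp, min_depth=0.1, max_depth=100.0)
    print(f"disp={disp:.1f} scaled_disp={scaled_disp:.4f} depth={depth:.4f}")
```

An output of 0 maps to max_depth and an output of 1 maps to min_depth, so the result is always a bounded depth. With monocular training alone, that depth is only defined up to an unknown scale factor.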

Conversion from Disparity to Depth in Monocular Training #465