Why use inverse depth instead of predicting depth directly?

blacksino commented 4 years ago

Thank you for your excellent work! I have a question about the prediction layer: Zhou applies a sigmoid to the output and then inverts it to compute the photometric loss. Why?

ClementPinard commented 4 years ago

The main reason is the close relationship between inverse depth and disparity.

In a stereo rig, disparity is proportional to inverse depth. And since disparity is a 1D optical flow map, the first methods that come to mind for estimating it are borrowed from optical flow.
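
To make the proportionality concrete, here is a minimal sketch of the standard rectified-stereo relation, disparity = focal length × baseline / depth. The function name and the numbers are hypothetical and only serve as an illustration; they are not taken from the repository.

```python
def disparity_from_depth(depth_m, focal_px, baseline_m):
    """Disparity (in pixels) of a point at depth_m for a rectified stereo pair.

    Assumes an ideal pinhole model: disparity = focal_px * baseline_m / depth_m,
    i.e. disparity is proportional to inverse depth.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


# Hypothetical rig: 500 px focal length, 10 cm baseline.
near = disparity_from_depth(5.0, focal_px=500.0, baseline_m=0.1)
far = disparity_from_depth(10.0, focal_px=500.0, baseline_m=0.1)
```

Doubling the depth halves the disparity, which is why a network trained to regress disparity is effectively regressing inverse depth up to a scale factor.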

In this case, the main basis is FlowNet, an optical flow neural network. The authors later extended their network to DispNet, to get depth from stereo.

Since they now had a well-performing network that effectively outputs inverse depth, it made sense for Zhou et al. to reuse the same network.

As such, Zhou's network outputs inverse depth. But this output is then inverted, because the photometric loss needs depth in the general case of an arbitrary camera displacement (stereo is just the special case of a pure lateral translation).
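
As a rough illustration of why depth shows up in the general case, here is a minimal pinhole reprojection sketch: back-project a pixel with its depth, apply a rigid motion, and project again. The function `reproject`, the intrinsics, and the sign convention of the motion are assumptions made for this example, not the repository's warping code.

```python
def reproject(u, v, depth, f, cx, cy, R, t):
    """Reproject pixel (u, v) with a given depth into a displaced camera.

    Assumes a pinhole camera with focal length f and principal point (cx, cy).
    R is a 3x3 rotation (nested lists) and t a 3-vector, both expressed so that
    a point X in the first camera frame becomes R @ X + t in the second one.
    """
    # Back-project the pixel to a 3D point using its depth.
    X = [depth * (u - cx) / f, depth * (v - cy) / f, depth]
    # Apply the rigid displacement.
    Xp = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Project back to pixel coordinates.
    return (f * Xp[0] / Xp[2] + cx, f * Xp[1] / Xp[2] + cy)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Pure lateral translation (the stereo case): the pixel shift is f * tx / depth.
u_lat, v_lat = reproject(320.0, 240.0, 5.0, 500.0, 320.0, 240.0, IDENTITY, [0.1, 0.0, 0.0])

# Forward translation: the shift is no longer simply proportional to 1 / depth.
u_fwd, v_fwd = reproject(420.0, 240.0, 5.0, 500.0, 320.0, 240.0, IDENTITY, [0.0, 0.0, 1.0])
```

With the lateral translation the pixel moves horizontally by f * tx / depth, which is exactly the disparity relation. With any other motion the point has to be lifted to 3D first, so the warping step works with depth explicitly.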

For my PhD defense I made several slides that explain this; you can get them here. The interesting slides start at slide 10 (sorry, they are in French, but you will get the math).

Bottom line: outputting inverse depth in this particular case has no real justification other than legacy. You can output depth directly if you want.
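
For illustration, the two options could look like the sketch below. It works on a single scalar activation in plain Python to stay self-contained. The scaling constants `alpha`, `beta`, `min_depth`, and `max_depth` and both function names are assumptions chosen for the example; check the repository's model code for the values it actually uses.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def depth_via_inverse_depth(x, alpha=10.0, beta=0.01):
    """Legacy-style head: sigmoid gives a bounded inverse depth, then invert.

    alpha and beta are assumed constants; beta keeps the inverse depth
    strictly positive so the inversion never divides by zero.
    """
    inv_depth = alpha * sigmoid(x) + beta
    return 1.0 / inv_depth


def depth_direct(x, min_depth=0.1, max_depth=100.0):
    """Alternative head: map the sigmoid straight to a bounded depth range."""
    return min_depth + (max_depth - min_depth) * sigmoid(x)


d_inv = depth_via_inverse_depth(0.0)
d_dir = depth_direct(0.0)
```

Both heads return a positive, bounded depth that can be fed to the photometric loss. They differ in how the raw activation is spread over the depth range: the inverse-depth head spends most of its resolution on nearby points, the direct head spreads it uniformly over depth.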

blacksino commented 4 years ago

That helps A LOT. Thank you for your explanation!

elenacliu commented 1 year ago

@ClementPinard Hey, it seems that the URL for your PhD defense slides is no longer valid.

ClementPinard commented 1 year ago

Indeed, I just discovered that my website is no longer hosted by my university. I updated the links :)

You can find the slides here: https://clementpinard.fr/pdf/PhdThesis/robust_depth_learning_defense.pdf

Project page (with the slides and the manuscript) here: https://clementpinard.fr/phd_thesis

ClementPinard / SfmLearner-Pytorch, issue #84