nianticlabs / manydepth

[CVPR 2021] Self-supervised depth estimation from short sequences

Moving object from left to right #34

Closed biggiantpigeon closed 2 years ago

biggiantpigeon commented 2 years ago

Thanks for the excellent work.

I see you use self-supervised training to deal with cost_volume overfitting, so the network can still predict well with multiple frames when an object is moving in front of the camera in the same direction as the camera, like a car ahead.

I also tested multi-frame prediction with your model on a case where an object moves from left to right in front of the camera. I thought it would give a wrong result, but the result is just fine, which I don't understand. I think in this case the cost volume will have two image regions (the regions covered by the moving object in each of the two frames) that cannot find a match at any depth. So why does it still predict fine? Can I trust this result?

Just to demonstrate, I am posting these two images, but they are not the ones I used to train/predict: image image

JamieWatson683 commented 2 years ago

Hi - sorry for the delay in responding!

If I understand the question: you would expect the predictions for an object moving left to right to be bad because there will be no good match in the cost volume?

Yes, you are right that the cost volume will not be able to find a good match - but the teacher network should help train the network to produce sensible depths in these areas. The consistency mask marks the pixels where the minimum of the cost volume disagrees with the prediction of the teacher network - this should happen both when an object is moving away from/towards the camera and when it is moving left to right. Of course, if the teacher network incorrectly predicts depth for the object, then we will obtain bad predictions.

I hope this helps!
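
To make the masking idea above concrete, here is a minimal NumPy sketch of how such a mask could be computed. It is an illustration only, not necessarily the repository's implementation: the function name `inconsistency_mask`, the array layout, and the relative-difference threshold `max_rel_diff` are all assumptions made for this example.

```python
import numpy as np

def inconsistency_mask(cost_volume, depth_bins, teacher_depth, max_rel_diff=1.0):
    """Return True where the cost-volume depth disagrees with the teacher depth.

    cost_volume:   (D, H, W) matching cost per candidate depth (lower is better)
    depth_bins:    (D,) candidate depth for each cost-volume slice
    teacher_depth: (H, W) depth predicted by the single-frame teacher network
    max_rel_diff:  hypothetical threshold on the relative depth difference
    """
    # Depth whose cost-volume slice has the lowest matching cost at each pixel.
    matching_depth = depth_bins[np.argmin(cost_volume, axis=0)]
    # Symmetric relative-difference test: both ratios must stay below the threshold.
    agree = (matching_depth - teacher_depth) / teacher_depth < max_rel_diff
    agree &= (teacher_depth - matching_depth) / matching_depth < max_rel_diff
    return ~agree

# Toy data: one image row with three pixels and five candidate depths.
depth_bins = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
teacher_depth = np.full((1, 3), 5.0)   # teacher says 5 m everywhere
cost_volume = np.ones((5, 1, 3))
cost_volume[2, 0, 0] = 0.0             # pixel 0: best match at 5 m
cost_volume[2, 0, 1] = 0.0             # pixel 1: best match at 5 m
cost_volume[4, 0, 2] = 0.0             # pixel 2: spurious best match at 20 m

mask = inconsistency_mask(cost_volume, depth_bins, teacher_depth)
print(mask)
```

In this toy case only the third pixel is flagged, because its lowest-cost depth (20 m) is far from the teacher's 5 m. The mask does not care why the two disagree, so an object moving left to right is flagged in the same way as one moving towards or away from the camera.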