wtpro closed this issue 1 year ago
Hi, see this line : https://github.com/ClementPinard/SfmLearner-Pytorch/blob/9640fbb2157be78e3eb195287ed76ac797282113/inverse_warp.py#L40
We multiply everything by depth afterward.
The key here is that we don't need depth until the very end, because perspective never changes with depth: if you only have one eye, you can't tell a real house from a smaller but closer miniature version of it. The only way to tell the difference is to know the magnitude of your displacement.
So we fix the depth to 1, and only at the end, when we have the 3D point (x, y, 1), do we multiply everything by depth to get (X, Y, Z) = (x * depth, y * depth, depth).
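To make that concrete, here is a minimal single-pixel sketch in plain Python. It is not the repository's code: the helper names (pixel_to_ray, ray_to_point, project) and the intrinsics values are made up for illustration, and it assumes a simple pinhole camera with no skew or distortion.

```python
def pixel_to_ray(u, v, fx, fy, cx, cy):
    """Back-project pixel (u, v) to the point (x, y, 1) on the plane Z = 1."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def ray_to_point(ray, depth):
    """Scale (x, y, 1) by depth to get (X, Y, Z) = (x * depth, y * depth, depth)."""
    x, y, one = ray
    return (x * depth, y * depth, one * depth)


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (X, Y, Z) back to pixel coordinates."""
    X, Y, Z = point
    return (fx * X / Z + cx, fy * Y / Z + cy)


# Hypothetical intrinsics and pixel, chosen only for this example.
fx, fy, cx, cy = 100.0, 100.0, 64.0, 48.0
u, v = 84.0, 28.0

ray = pixel_to_ray(u, v, fx, fy, cx, cy)  # depth is not needed here
near = ray_to_point(ray, 2.0)             # predicted depth used only now
far = ray_to_point(ray, 10.0)             # same ray, five times further away

# Both 3D points land on the same pixel: one view cannot tell them apart.
print(project(near, fx, fy, cx, cy))
print(project(far, fx, fy, cx, cy))
```

The two printed pixels should match the original (u, v), which is the "real house vs. miniature house" ambiguity: depth only fixes where along the ray the point sits.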
Thanks for the reply. Now I understand everything!
I am a little confused about the function set_id_grid: it makes pixel_coords contain only elements like (x, y, 1). Does this mean that the actual depth value from the predicted depth map is never used during the whole inverse warp process?
It looks to me that all the depth values are now 1 instead of the predicted value.