nianticlabs / monodepth2

[ICCV 2019] Monocular depth estimation from a single image

Confusion regarding transformation #435

Closed zshn25 closed 2 years ago

zshn25 commented 2 years ago

Do you transform the target points to source or vice versa?

In the code, you seem to do the former, but in the paper you claim to do the latter.

Isn't the transformation matrix T here the transformation from target to source?

https://github.com/nianticlabs/monodepth2/blob/b676244e5a1ca55564eb5d16ab521a48f823af31/layers.py#L183

xiat0616 commented 2 years ago

I am also confused by this. In fact, if we use the translation vector [0, 0, 0.1], the image will zoom in, but z in the camera coordinates will increase, which does not make sense to me.

daniyar-niantic commented 2 years ago

T here is the transformation applied to 3D points originating from the target camera, so that these 3D points can be projected into the source cameras.

The transformations are applied to estimate coordinates in the source images, so that colors can be fetched from the source images and compared against the colors in the target image.
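To make the direction of T concrete, here is a minimal NumPy sketch of the warping described above. It is not the repository's PyTorch code: the function name `project_target_to_source`, the intrinsics `K`, the pixel coordinates, and the depth values are all made up for illustration. It assumes a pinhole camera and a 4x4 matrix `T` that maps points from the target camera frame into the source camera frame.

```python
import numpy as np

def project_target_to_source(pix, depth, K, T):
    """Map target pixels to sampling coordinates in the source image.

    pix:   (N, 2) pixel coordinates in the target image
    depth: (N,)   depth of each target pixel
    K:     (3, 3) pinhole intrinsics (assumed shared by both cameras)
    T:     (4, 4) transform taking target-frame points to the source frame
    """
    n = pix.shape[0]
    pix_h = np.hstack([pix, np.ones((n, 1))])             # homogeneous pixels, (N, 3)
    cam_pts = (np.linalg.inv(K) @ pix_h.T) * depth        # back-project to 3D, (3, N)
    cam_pts_h = np.vstack([cam_pts, np.ones((1, n))])     # homogeneous 3D points, (4, N)
    src_pts = (T @ cam_pts_h)[:3]                         # same points in the source frame
    proj = K @ src_pts                                    # project with the intrinsics
    return (proj[:2] / proj[2]).T, src_pts[2]             # source pixels (N, 2), source z (N,)

# Hypothetical intrinsics with the principal point at (50, 50).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])

# Pure translation [0, 0, 0.1], as in the example raised earlier in the thread.
T = np.eye(4)
T[2, 3] = 0.1

pix = np.array([[90.0, 50.0], [50.0, 10.0]])  # two target pixels away from the centre
depth = np.array([1.0, 1.0])

src_pix, src_z = project_target_to_source(pix, depth, K, T)
print(src_pix)  # sampling locations in the source image
print(src_z)    # z of the points expressed in the source frame
```

Under these assumptions, z grows from 1.0 to 1.1 in the source frame, and the sampling coordinates move toward the principal point. Each target pixel is therefore filled with a colour taken from nearer the centre of the source image, so the warped result looks zoomed in even though z increased.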

zshn25 commented 2 years ago

That makes T the transformation from target to source. Am I right?