Brummi / BehindTheScenes

Official implementation of the paper: Behind the Scenes: Density Fields for Single View Reconstruction (CVPR 2023)
https://fwmb.github.io/bts/
BSD 2-Clause "Simplified" License

Why is the pose used for nerf calculation an identity matrix? #9

Closed: rockywind closed 1 year ago

rockywind commented 1 year ago

Hi, I am confused: the pose matrix is inverted and then multiplied by itself. Isn't the result just the identity matrix? [image]

Brummi commented 1 year ago

Hi, note that we only invert the pose of the first frame (the one from which we make the prediction). This inverted pose is then broadcast and multiplied with the poses of all frames. As a result, the relative poses between the different frames / views remain the same, but the input frame now sits at the origin (0, 0, 0), i.e. its pose becomes the identity matrix. This is not strictly necessary, but it makes querying the network easier, since you don't have to transform the points into the coordinate system of the input frame.
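Written out, the argument looks roughly like this. The symbols are notation introduced here for illustration, not taken from the repository: $T_v$ is the 4x4 pose of frame $v$, $T_0$ is the pose of the input frame, and $T'_v$ is the pose after normalization.

```latex
T'_v = T_0^{-1} T_v
\quad\Longrightarrow\quad
T'_0 = T_0^{-1} T_0 = I

(T'_i)^{-1} T'_j
  = (T_0^{-1} T_i)^{-1} (T_0^{-1} T_j)
  = T_i^{-1} T_0 T_0^{-1} T_j
  = T_i^{-1} T_j
```

Only the first frame ends up with the identity pose. The relative transform between any two frames $i$ and $j$ is unchanged because the $T_0$ factors cancel.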

(Side note: poses has shape (N, V, 4, 4), where N is the batch size and V is the number of frames / views per sample.)
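The following is a minimal NumPy sketch of this normalization, not the repository's actual code (which operates on PyTorch tensors). The helper `random_pose`, the variable names, and the sizes N = 2 and V = 3 are assumptions made for illustration. It checks that only frame 0 becomes the identity and that relative poses between other frames are preserved.

```python
import numpy as np


def random_pose(rng):
    """Hypothetical helper: build a random rigid 4x4 pose (rotation + translation)."""
    # The QR decomposition of a random matrix gives an orthonormal matrix.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1  # flip one axis so the matrix is a proper rotation
    pose = np.eye(4)
    pose[:3, :3] = q
    pose[:3, 3] = rng.normal(size=3)
    return pose


rng = np.random.default_rng(0)
N, V = 2, 3  # assumed batch size and number of frames / views per sample
poses = np.stack(
    [np.stack([random_pose(rng) for _ in range(V)]) for _ in range(N)]
)  # shape (N, V, 4, 4)

# Invert only the first frame's pose; slicing with :1 keeps the view axis,
# so the result has shape (N, 1, 4, 4) and broadcasts over all V frames.
first_inv = np.linalg.inv(poses[:, :1])
normalized = first_inv @ poses  # shape (N, V, 4, 4)

# The input frame now has the identity pose ...
assert np.allclose(normalized[:, 0], np.eye(4))

# ... while the relative pose between frames 1 and 2 is unchanged.
rel_before = np.linalg.inv(poses[:, 1]) @ poses[:, 2]
rel_after = np.linalg.inv(normalized[:, 1]) @ normalized[:, 2]
assert np.allclose(rel_before, rel_after)
```

The key detail is the `poses[:, :1]` slice: indexing with `poses[:, 0]` instead would drop the view axis, and the inverted pose could no longer be broadcast against all V frames.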