Why is the pose used for nerf calculation an identity matrix?

Brummi / BehindTheScenes

Official implementation of the paper: Behind the Scenes: Density Fields for Single View Reconstruction (CVPR 2023)

BSD 2-Clause "Simplified" License


Hi, note that we only invert the pose of the first frame (the one from which we make the prediction). This inverted pose is then broadcast and multiplied with the poses of all frames, including the first one itself. As a result, the relative poses between the different frames / views remain the same, but the input frame sits at the origin (0, 0, 0) with no rotation, i.e. its pose becomes the identity matrix. This is not strictly necessary, but it makes querying the network directly easier, as you don't have to transform the points into the coordinate system of the input frame.

(Side note: poses have shape (N, V, 4, 4), where N is the batch size and V is the number of frames / views per sample.)
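To make the normalization concrete, here is a minimal pure-Python sketch of the idea, not the repository's actual code. It assumes the poses are rigid camera-to-world transforms and that the inverted first pose is multiplied from the left; the helper names (`matmul4`, `invert_rigid`, `normalize_to_first_view`, `yaw_pose`) are hypothetical and exist only for this illustration. With tensors, the explicit loop would typically be replaced by a broadcast batched matrix product.

```python
import math

def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(pose):
    """Invert a rigid pose [R | t] as [R^T | -R^T t] (assumes R is a rotation)."""
    rot_t = [[pose[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * pose[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def normalize_to_first_view(poses):
    """Express every pose relative to the first view of its sample.

    poses is a nested list of shape (N, V, 4, 4). Only the first pose of each
    sample is inverted; that inverse is applied to all V poses of the sample.
    """
    normalized = []
    for views in poses:
        inv_first = invert_rigid(views[0])
        normalized.append([matmul4(inv_first, pose) for pose in views])
    return normalized

def yaw_pose(angle, tx, ty, tz):
    """Hypothetical test pose: rotation about the z axis plus a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# One sample (N = 1) with two views (V = 2).
poses = [[yaw_pose(0.3, 1.0, 2.0, 0.5), yaw_pose(0.5, 2.0, 2.5, 0.5)]]
normalized = normalize_to_first_view(poses)

# The first view is now (numerically) the identity matrix.
for row in normalized[0][0]:
    print([round(value, 6) for value in row])
```

After this step, the pose of the input frame is the identity up to floating-point error, while the relative transform between the first and second view is the same before and after normalization. Points sampled in the input frame's coordinate system can therefore be passed to the network without an extra transformation.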


Why is the pose used for nerf calculation an identity matrix? #9