network and latent details

facebookresearch / Neural_3D_Video

The repository for CVPR 2022 Paper "Neural 3D Video Synthesis"

Hi, I may simply have overlooked them, but I could not find some details of the model in the paper.

Is the model architecture almost identical to the original NeRF (except for the activation difference)?
How are the latents input to the network? In Nerfies / NR-NeRF, there is a deformation field that takes xyz and a latent code. I didn't see anything similar in this paper. Is the latent directly concatenated with xyz as input to the main MLP?

Additionally, there are two other points that confuse me a bit:

DyNeRF only needs the reconstruction loss, unlike other papers, which need many regularizations and/or priors. Is that because the cameras are fixed and no deformation field is used?
DyNeRF seems to handle complex scenarios such as topological changes naturally. I wonder why; I suspect it is also related to how the latent is input to the network.

Thank you for this impressive work. I look forward to your reply.

Sorry for the late reply; I did not notice this issue earlier.

Is the model architecture almost identical to the original NeRF?

For the coordinate MLP, yes.

How are the latents input to the network?

The latent code is concatenated with the 5D input (position and viewing direction) and fed into the MLP.
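To make the concatenation concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the latent size, frame count, network width, frequency counts, and all function names are hypothetical, the weights are random rather than trained, and feeding the viewing direction at the first layer is a simplification. It only shows that each frame has its own latent code, appended to the encoded position and direction before the MLP.

```python
import math
import random

LATENT_DIM = 8   # hypothetical; kept small for illustration
NUM_FRAMES = 3   # hypothetical number of video frames
HIDDEN = 16      # hypothetical hidden width


def positional_encoding(values, num_freqs):
    """NeRF-style encoding: raw values plus sin/cos at doubling frequencies."""
    out = list(values)
    for k in range(num_freqs):
        for v in values:
            out.append(math.sin((2.0 ** k) * v))
            out.append(math.cos((2.0 ** k) * v))
    return out


def build_input(xyz, view_dir, latent, pos_freqs=4, dir_freqs=2):
    """Concatenate encoded position, encoded direction, and the frame latent."""
    return (positional_encoding(xyz, pos_freqs)
            + positional_encoding(view_dir, dir_freqs)
            + list(latent))


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases (stand-ins for trained parameters)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp(x, layers):
    """Fully connected layers with ReLU on every layer except the last."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


rng = random.Random(0)
# One latent code per frame; in training these would be optimized jointly
# with the MLP weights (assumption for this sketch).
frame_latents = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
                 for _ in range(NUM_FRAMES)]

xyz = [0.1, -0.2, 0.3]
view_dir = [0.0, 0.0, 1.0]
in_dim = len(build_input(xyz, view_dir, frame_latents[0]))
layers = [init_layer(rng, in_dim, HIDDEN), init_layer(rng, HIDDEN, 4)]


def query(frame_index):
    """Raw (r, g, b, density) outputs for the same point at a given frame."""
    return mlp(build_input(xyz, view_dir, frame_latents[frame_index]), layers)
```

Calling `query(0)` and `query(1)` evaluates the same 3D point and viewing direction at two different frames; only the latent part of the input changes.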

DyNeRF only needs the reconstruction loss, unlike other papers, which need many regularizations and/or priors. Is that because the cameras are fixed and no deformation field is used?

We did not find additional regularization to be necessary. In our earlier experiments, a deformation field also helped only marginally. I don't want to make a hard claim that these techniques are not useful, but at least they are not necessary to produce our results.

DyNeRF seems to handle complex scenarios such as topological changes naturally. I wonder why; I suspect it is also related to how the latent is input to the network.

This is one of the insights explained in the paper. The latent code is general enough to capture any time-variant information, including topological changes.

network and latent details #4