ken2576 / vision-nerf

Official PyTorch Implementation of paper "Vision Transformer for NeRF-Based View Synthesis from a Single Input Image", WACV 2023.

Confusion about generalization of NeRF #14

Closed xuyaojian123 closed 4 months ago

xuyaojian123 commented 10 months ago

Thanks for your great work! I have some confusion about NeRF's generalizability.

Your paper title says that only a single input image is needed to synthesize novel views. What is the function of the pretrained weights you provide, and how did you obtain them?

Are the pretrained weights used to extract the global and local features of the single input image, which the NeRF MLP then uses to render the target view?

The original NeRF needs dozens to hundreds of images of a scene as input, and after training it can render that scene from any novel viewpoint. Although you input only a single image, you train a network on an image dataset to extract global and local features. What is the difference between this and feeding many images to the original NeRF?

Sorry, I don't fully understand the generalizability of NeRF. I'd appreciate your reply, thanks!

ken2576 commented 9 months ago

Hi,

Thanks for the question. Vanilla NeRF takes xyz coordinates and viewing directions as input to the MLP. In our case, it is more similar to PixelNeRF, where the MLP takes per-pixel image features in addition to the xyz coordinates and viewing directions. You can think of the image features as additional embeddings that tell the NeRF MLP what each 3D point should look like. Because the feature extractor is trained across many scenes, the model can render a new scene from a single image without per-scene optimization, which is what "generalizable NeRF" refers to. Please let me know if you still have other questions.
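For illustration, here is a minimal PyTorch sketch (not the repository's actual code; the class names, layer sizes, and feature dimension are hypothetical, and positional encoding is omitted) contrasting the vanilla NeRF MLP with a PixelNeRF-style MLP conditioned on per-pixel image features:

```python
import torch
import torch.nn as nn

class VanillaNeRFMLP(nn.Module):
    """Vanilla NeRF: the MLP sees only a 3D point and a view direction,
    so the weights themselves must memorize one specific scene."""
    def __init__(self, pos_dim=3, dir_dim=3, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pos_dim + dir_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, xyz, view_dir):
        return self.net(torch.cat([xyz, view_dir], dim=-1))


class ConditionedNeRFMLP(nn.Module):
    """PixelNeRF-style: per-pixel image features are concatenated with
    xyz and view direction, acting as embeddings that tell the MLP what
    each 3D point should look like in this particular scene."""
    def __init__(self, pos_dim=3, dir_dim=3, feat_dim=512, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pos_dim + dir_dim + feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, xyz, view_dir, img_feat):
        # img_feat would be obtained by projecting xyz into the input image
        # and sampling the encoder's feature map there (e.g., F.grid_sample).
        return self.net(torch.cat([xyz, view_dir, img_feat], dim=-1))


# Hypothetical usage on a batch of 1024 sampled points:
mlp = ConditionedNeRFMLP()
xyz = torch.randn(1024, 3)
view_dir = torch.randn(1024, 3)
img_feat = torch.randn(1024, 512)   # features from a pretrained image encoder
rgb_sigma = mlp(xyz, view_dir, img_feat)  # (1024, 4)
```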