ken2576 / vision-nerf

Official PyTorch Implementation of paper "Vision Transformer for NeRF-Based View Synthesis from a Single Input Image", WACV 2023.
MIT License

Question about the number of input images #8

Closed by Misaliet 1 year ago

Misaliet commented 1 year ago

Hi,

In your paper, you say the input is a single image. However, after reading your code, it seems that you feed every image except the target view into the transformer, in both the training phase and the generation phase. This is not consistent with what you describe in the paper. Could you explain why?

Misaliet commented 1 year ago

Sorry, I forgot to mention that I was reading your "eval.py" and "train.py" code for the ShapeNet car data.

Misaliet commented 1 year ago

I figured it out after running your code. It is quite confusing if you only read the loop code here: https://github.com/ken2576/vision-nerf/blob/c184501fc5609382ba79937ffbcd479a16a624e3/data/srn.py#L154

ken2576 commented 1 year ago

Hi,

Sorry, the name of the ray sampler can be confusing (https://github.com/ken2576/vision-nerf/blob/main/models/sample_ray.py#L206). It is actually doing a batched operation: we train on multiple objects at once, but each object has only 1 input view and 1 target view.
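
To illustrate the batching described above, here is a minimal sketch. It is not the code in `data/srn.py` or `models/sample_ray.py`. The function name `sample_view_pairs`, its parameters, and the view count of 50 are all hypothetical. It also assumes the input view and the target view of an object are always different. The point is only that the batch dimension runs over objects, not over input views of one object.

```python
import random


def sample_view_pairs(num_views_per_object, batch_size, seed=None):
    """Pick one input view and one target view for each object in a batch.

    Hypothetical helper for illustration only: the batch dimension indexes
    objects, and every object contributes exactly one input/target pair.
    """
    rng = random.Random(seed)
    pairs = []
    for obj_idx in range(batch_size):
        # Sampling without replacement guarantees input_view != target_view
        # (an assumption of this sketch).
        input_view, target_view = rng.sample(range(num_views_per_object), 2)
        pairs.append(
            {"object": obj_idx, "input_view": input_view, "target_view": target_view}
        )
    return pairs


if __name__ == "__main__":
    # 4 objects in the batch, each with a single input and a single target view.
    for pair in sample_view_pairs(num_views_per_object=50, batch_size=4, seed=0):
        print(pair)
```

Read this way, a batch of 4 means 4 different objects with one input image each, not 4 input images of the same object.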