EastbeanZhang / Gaussian-Wild

Official implementation of the paper "Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections"

Details about the experiment #5

Open Kidleyh opened 3 months ago

Kidleyh commented 3 months ago

Hello, thank you very much for your work! Could you provide the specific train/test split of views for each scene?

Kidleyh commented 3 months ago

I also have a question: how do you obtain the dynamic appearance features when rendering novel-view images?

EastbeanZhang commented 3 months ago

For the Brandenburg Gate, Trevi Fountain, and Sacre Coeur scenes, you can download the train/test split from NeRF-W (https://nerf-w.github.io/). When rendering novel views, the dynamic appearance features still use the same features extracted from the input image.
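For reference, a minimal sketch of loading such a split, assuming NeRF-W-style per-scene `.tsv` files with `filename` and `split` columns (the file name and column names here are assumptions, not this repo's actual loader):

```python
import pandas as pd

# Hypothetical file name; NeRF-W distributes one split file per scene.
split_df = pd.read_csv("brandenburg.tsv", sep="\t")
split_df = split_df.dropna(subset=["filename"])  # drop rows without an image

train_names = split_df.loc[split_df["split"] == "train", "filename"].tolist()
test_names = split_df.loc[split_df["split"] == "test", "filename"].tolist()
print(f"{len(train_names)} train / {len(test_names)} test images")
```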

Kidleyh commented 3 months ago

Sorry, I don't quite understand what the input image refers to. According to the paper, do the dynamic appearance features need to be learned from reference images?

EastbeanZhang commented 3 months ago

Yes, for each appearance, dynamic appearance features are extracted from a reference image.

Kidleyh commented 3 months ago

Do you mean that when rendering novel views, we also need to input a reference image for dynamic appearance feature extraction?

EastbeanZhang commented 3 months ago

The features are extracted from the reference image once, and those extracted features are then reused when rendering novel views.
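In other words, the feature is computed once and cached. A sketch of that extract-once/reuse pattern, with `appearance_encoder` and `render` as hypothetical stand-ins for the model's components:

```python
import torch

@torch.no_grad()
def render_novel_views(appearance_encoder, render, reference_image, novel_cameras):
    """Extract the dynamic appearance feature from one reference image,
    then reuse it for every novel viewpoint."""
    # Extracted exactly once from the reference (e.g. a test) image.
    appearance_feature = appearance_encoder(reference_image)

    renders = []
    for camera in novel_cameras:
        # The same cached feature conditions every rendered view, so all
        # novel views share the reference image's appearance style.
        renders.append(render(camera, appearance_feature))
    return renders
```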

Kidleyh commented 3 months ago

OK, thanks for your answer.

yt2639 commented 3 months ago

Hi @EastbeanZhang , thanks for your great work. I have the same question and I am not sure if I understand your answer above.

My question is: how do you do evaluation, i.e. novel view synthesis? Your model requires extracting a feature map from the reference image (the GT image). However, during novel view synthesis, there isn't a reference image available.

I read your answer above:

> When rendering novel views, the dynamic appearance features still use the same features extracted from the input image.

Do you mean: when evaluating your model on novel views, you use the average of all training views' feature map as THE feature map for the novel views?

Thank you for your assistance.

EastbeanZhang commented 3 months ago

In the dataset, each image is captured with different camera parameters and under a different environment, so the appearance style of the buildings differs from image to image. When evaluating on the test set, the appearance features of each test image (the reference image) need to be extracted (as in Ha-NeRF and CR-NeRF) and applied to the appearance of the Gaussian points for the corresponding adaptation; the scene is then rendered from the same viewpoint as the test image for metric calculation.

However, once you extract the appearance features from a reference image at a certain viewpoint and apply them to the Gaussian points, you can render any other viewpoint. The appearance presented in these viewpoints will conform to the appearance style of the reference image.
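A minimal sketch of that evaluation loop, assuming images normalized to [0, 1]; `appearance_encoder`, `render`, and `test_set` are hypothetical stand-ins, not the repo's actual API:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate(appearance_encoder, render, test_set):
    """For each test image: extract its appearance feature, render from the
    same viewpoint, and accumulate PSNR against the ground truth."""
    psnrs = []
    for gt_image, camera in test_set:
        feature = appearance_encoder(gt_image)  # reference = the test image itself
        pred = render(camera, feature)          # same viewpoint as the test image
        mse = F.mse_loss(pred, gt_image)
        psnrs.append(-10.0 * torch.log10(mse))  # PSNR for images in [0, 1]
    return torch.stack(psnrs).mean().item()
```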

yt2639 commented 3 months ago

Hi @EastbeanZhang , thank you very much for your response. Can I understand your answer as follows: you did use the test images as the reference images to extract appearance features, the same as Ha-NeRF and CR-NeRF do, and then you used those extracted appearance features along with the other features to compute the rendered color $\mathbf{c}_i$?

Thanks!

EastbeanZhang commented 3 months ago

Yes. Best regards!