nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0

Is the evaluation dataset used for validation while training nerfacto? #2379

Closed · mylee95 closed 1 year ago

mylee95 commented 1 year ago

Hi, I want to train nerfacto on custom data and compare its test results with other models. I processed my custom dataset with ns-process-data and got images together with a transforms.json. When I train nerfacto with the default settings, it uses 90% of the data as the training dataset and 10% as the evaluation dataset.
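For reference, my workflow looks roughly like this (a sketch: I assume the --train-split-fraction flag on the nerfstudio-data dataparser is what controls the 90/10 split, but the exact flag name may differ across nerfstudio versions, so check ns-train nerfacto nerfstudio-data --help):

```bash
# Convert raw captures into images + transforms.json.
ns-process-data images --data raw_images/ --output-dir data/my_scene

# Train nerfacto; the dataparser's train/eval split defaults to 90% train.
# Flag name assumed from recent nerfstudio releases; verify with --help.
ns-train nerfacto --data data/my_scene nerfstudio-data --train-split-fraction 0.9
```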

My question is: does nerfacto use the evaluation dataset for validation while training? If it does, is there a way to create a test dataset that is not used during training at all, and use it only for evaluation after training is done?

maturk commented 1 year ago

The eval dataset is not used at all during training; it is only used to report your eval metrics (you can inspect them more closely using the --vis wandb or --vis tensorboard flag).
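Roughly, that workflow looks like this (a sketch: the config path below is a placeholder, ns-train prints the real one, and ns-eval computes metrics such as PSNR/SSIM/LPIPS on the eval images only):

```bash
# Log train/eval metrics to Weights & Biases during training.
ns-train nerfacto --data data/my_scene --vis wandb

# After training, compute metrics on the held-out eval split.
# The config path is a placeholder; use the one ns-train prints.
ns-eval --load-config outputs/my_scene/nerfacto/<timestamp>/config.yml \
        --output-path eval_metrics.json
```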

mylee95 commented 1 year ago

Thank you for the quick response.

Just out of curiosity, is there a reason why there is no validation step in nerfacto's training procedure? Is it just a convention of NeRF models, or is it because a wrong choice of validation set can lead to choosing the wrong model when the dataset is small?

maturk commented 1 year ago

I think it is difficult to do hyperparameter tuning for NeRFs based on validation-set performance, because things like novel view synthesis and correct reflection modelling are highly dependent on the training dataset (and thus on your choice of validation set). The validation set is therefore not independent of the training dataset, and this is bad, right? So the validation metrics are really just a proxy for generalization to novel views, and it might be difficult to do any real hyperparameter tuning based on val-set performance.

Validation sets are good for comparing wholly different architectures, however, like nerfacto vs. Zip-NeRF vs. Mip-NeRF, but this is also quite dependent on the specific validation split and should be standardised across papers. I also think that there is no difference between eval and test sets in NeRFs... but I may be wrong. Let me know if my understanding is correct and if this resolves your doubts.
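On the practical side of your original question: if you want a held-out test split that is pinned explicitly (and thus easy to standardise across comparisons), I believe recent nerfstudio versions let you list the split filenames directly in transforms.json; double-check this against your version's nerfstudio-data dataparser. A sketch of the extra keys:

```json
{
  "frames": ["... existing frame entries from ns-process-data ..."],
  "train_filenames": ["images/frame_00001.png", "images/frame_00002.png"],
  "test_filenames": ["images/frame_00099.png", "images/frame_00100.png"]
}
```

Frames listed under test_filenames would then never enter the training split.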

mylee95 commented 1 year ago

Yeah, I agree with your understanding of validation-set performance for novel view synthesis methods. Thank you for sharing your thoughts :)