Introduce FPSampler for train cameras

jb-ye commented 4 months ago

Currently, the train cameras are sampled uniformly at random without replacement in full_images_manager. There are cases where one wants a camera sampler other than the uniform random one. This PR implements one such alternative: a sampler that uses farthest point sampling (FPS) to avoid oversampling cameras that are very close to each other. I observed that oversampling train cameras from very similar views can potentially create unwanted floaters.
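To illustrate the idea, here is a minimal sketch of farthest point sampling over camera positions. It is not the code in this PR: the helper name `farthest_point_order`, its `seed` parameter, and the use of plain Euclidean distance between camera centers are all assumptions made for the example.

```python
import math
import random


def farthest_point_order(positions, seed=0):
    """Order camera indices so each pick is far from all earlier picks.

    Hypothetical helper for illustration only. `positions` is a list of
    camera centers, e.g. [(x, y, z), ...].
    """
    n = len(positions)
    if n == 0:
        return []
    rng = random.Random(seed)
    first = rng.randrange(n)  # random starting camera
    order = [first]
    # min_dist[i]: distance from camera i to its nearest already-picked camera
    min_dist = [math.dist(p, positions[first]) for p in positions]
    min_dist[first] = -1.0  # mark as picked so it is never chosen again
    for _ in range(n - 1):
        # Pick the camera whose nearest picked neighbour is farthest away.
        nxt = max(range(n), key=min_dist.__getitem__)
        order.append(nxt)
        min_dist = [
            d if d < 0 else min(d, math.dist(p, positions[nxt]))
            for d, p in zip(min_dist, positions)
        ]
        min_dist[nxt] = -1.0
    return order


# Cameras 0 and 1 are nearly identical views; 2 and 3 are far away.
cams = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
order = farthest_point_order(cams, seed=0)
```

With this ordering, the two near-duplicate cameras are never picked back to back at the start, which is the behavior the sampler is after.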

Choosing an optimal distribution over a given set of train cameras is a non-trivial problem that has been discussed in the literature (e.g. https://arxiv.org/pdf/2109.02369 ), but that topic is beyond the scope of this PR.

This PR also fixes the seed of the train camera sampler to improve the reproducibility of experiments. A different seed can be set via a command-line argument.
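As a rough sketch of what a seeded sampler buys you (again, not the PR's actual code; the class name `SeededCameraSampler` and its default seed are hypothetical), uniform sampling without replacement with a fixed seed could look like this:

```python
import random


class SeededCameraSampler:
    """Uniform sampling without replacement, reproducible via a fixed seed.

    Hypothetical class for illustration only.
    """

    def __init__(self, num_cameras, seed=42):
        self.num_cameras = num_cameras
        self.rng = random.Random(seed)  # private RNG, independent of global state
        self.unseen = []

    def next_index(self):
        # Refill and reshuffle once every camera has been drawn.
        if not self.unseen:
            self.unseen = list(range(self.num_cameras))
            self.rng.shuffle(self.unseen)
        return self.unseen.pop()


a = SeededCameraSampler(5, seed=1)
b = SeededCameraSampler(5, seed=1)
seq_a = [a.next_index() for _ in range(10)]
seq_b = [b.next_index() for _ in range(10)]
```

Two samplers built with the same seed yield the same camera order, so runs can be compared without sampling noise.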

maturk commented 4 months ago

@jb-ye, very cool idea! Do you have examples showing what kind of differences one could expect between the random and FPS methods?

jb-ye commented 4 months ago

@maturk It depends heavily on the distribution of train cameras. On most standard benchmark datasets you probably won't notice any difference, but on many of our internal datasets (acquired from a scan machine) you see improvements. Here is a nice example of what you asked for:

https://github.com/nerfstudio-project/nerfstudio/assets/132313008/cce95487-52ca-44b5-b71a-10460852d5ab

nerfstudio-project / nerfstudio

Introduce FPSampler for train cameras #3177