andrewsonga opened this issue 2 years ago
Thank you for your kind words!

1. Yes, min_bound and max_bound are the same across all cameras. They are in world units. For multi-view, there is no heuristic implemented to determine good values; instead, you'd have to come up with your own heuristic (e.g. for a spherical camera setup, the maximum distance between any two cameras might be a good first guess).

2. You can first try to just set factor=4, for example, and the code at https://github.com/facebookresearch/nonrigid_nerf/blob/main/train.py#L1354 will take care of adjusting the calibration (namely the focal length and center of the intrinsics). Extrinsics don't need to be adjusted. If that doesn't work, store the correct (downsampled) values for focal and center in calibration.json and use factor=1.

Hope that helps!
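In case it helps, here is a rough NumPy sketch of both points. It is not code from the repo, and the calibration.json field names used below ("cameras", "translation", "focal", "center_x", "center_y") are assumptions about your file layout, so adapt them to whatever you actually store:

```python
import itertools
import json

import numpy as np

# Field names below are assumptions about the calibration.json layout;
# adapt them to the actual structure of your file.
with open("calibration.json") as f:
    calib = json.load(f)

# 1. Bounds heuristic for a roughly spherical multi-view setup: use the maximum
#    pairwise distance between camera centers (in world units) as a first guess
#    for max_bound, and a small fraction of it as min_bound.
centers = [np.asarray(cam["translation"], dtype=np.float64) for cam in calib["cameras"]]
max_bound = max(np.linalg.norm(a - b) for a, b in itertools.combinations(centers, 2))
min_bound = 0.05 * max_bound  # small but non-zero near bound
print("min_bound:", min_bound, "max_bound:", max_bound)

# 2. If you downsample the images yourself and keep factor=1, scale the focal
#    length and the principal point by the same factor; extrinsics (and the
#    bounds, which live in world units) stay untouched.
factor = 4
for cam in calib["cameras"]:
    cam["focal"] /= factor
    cam["center_x"] /= factor
    cam["center_y"] /= factor

with open("calibration_downsampled.json", "w") as f:
    json.dump(calib, f, indent=2)
```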
Thank you for the swift response! I have just a few more follow-up questions:
1. Do we have to adjust min_bound and max_bound according to the downsampling factor?

2. Do you think using the min_bounds and max_bounds in the poses_bounds.npy file generated by running COLMAP as described here (https://colmap.github.io/faq.html#reconstruct-sparse-dense-model-from-known-camera-poses) constitutes a good heuristic for multi-view? That is, taking the min_bound and max_bound for each camera, the shared min_bound and max_bound would then become the minimum and maximum, respectively, across all cameras (see the sketch after this list).

3. Where are min_bound and max_bound used? Are they used as the integration bounds for volume rendering?

4. If so, what is the harm of heuristically setting min_bound to 0 and max_bound to a very large number?
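To make question 2 concrete, here is roughly what I have in mind (my own sketch, assuming the standard LLFF poses_bounds.npy layout of 17 values per row, where the last two entries are the per-image near/far depth bounds):

```python
import numpy as np

# LLFF-style poses_bounds.npy: one row per image, 15 pose values
# followed by the near and far depth bounds for that image.
poses_bounds = np.load("poses_bounds.npy")
near_far = poses_bounds[:, -2:]  # shape (num_images, 2)

# Shared bounds across all cameras: smallest near bound and
# largest far bound over the whole multi-view capture.
min_bound = float(near_far[:, 0].min())
max_bound = float(near_far[:, 1].max())
print("min_bound:", min_bound, "max_bound:", max_bound)
```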
Thank you for the detailed response! I heeded your instructions carefully, but my renderings are coming out super weirdly and I can't seem to figure out why. The following are the first five renderings for --camera_path spiral:
The first frame of my multi-view video looks like this:

Are there any modifications I need to make to free_viewpoint_rendering.py in order to make it work for multi-view datasets? For instance, do we have to change load_llff_data to load_llff_data_multi_view in free_viewpoint_rendering.py as well as in train.py?
I have never tried running the multi-view code with rendering. The spiral code might be too sensitive; you could try the static or input reconstruction rendering instead. Changing to load_llff_data_multi_view sounds reasonable, but again, I have not tried that part.
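If you do try the swap, it would probably look something like the following in free_viewpoint_rendering.py. This is untested, and I am only assuming that load_llff_data_multi_view takes the same kind of arguments as load_llff_data, so mirror whatever train.py actually passes to it:

```python
# free_viewpoint_rendering.py (sketch, untested): use the multi-view loader
# instead of the single-view one. The argument and return structure is only
# assumed to mirror load_llff_data; check load_llff.py / train.py for the
# real signature before copying this. `args` is the script's parsed CLI args.
from load_llff import load_llff_data_multi_view  # was: load_llff_data

data = load_llff_data_multi_view(args.datadir, factor=args.factor)
```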
First of all, thank you for releasing your impactful work! I'm trying to train NRNeRF on multi-view data from 8 synchronized cameras with known intrinsics and extrinsics, and I ran into a couple questions regarding the bounds and the downsampling factor.
1. Are the parameters min_bound and max_bound defined as the minimum and maximum across all cameras? I noticed in the README.md that there is a single min_bound and max_bound shared between all cameras when specifying calibration.json, as opposed to one for each camera.

2. When using load_llff_data_multi_view, if our training images are downsampled from their original resolution by a certain factor, are there any parts of calibration.json (i.e. the camera intrinsics/extrinsics) we have to adjust accordingly to account for the downsampling factor? I'm asking because downsampling images by a factor is not implemented in load_llff_data_multi_view, but load_llff_data appears to use factor in a couple of places (https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/load_llff.py#L76, https://github.com/yenchenlin/nerf-pytorch/blob/a15fd7cb363e93f933012fd1f1ad5395302f63a4/load_llff.py#L103).

Thank you in advance for reading this long question. I look forward to reading your response.