bmild / nerf

Code release for NeRF (Neural Radiance Fields)
http://tancik.com/nerf
MIT License

Questions about using a custom dataset #136

Open · andrewsonga opened this issue 2 years ago

andrewsonga commented 2 years ago

Hello, I'm also trying to train NeRF on a 360° captured scene (the AIST++ dataset, https://aistdancedb.ongaaccel.jp),

[image: basic_dance-2]

and I have some questions about formatting the dataset and setting the configuration:

  1. For real, 360° captured scenes like the AIST++ dataset, should I be using the "LLFF" or the "blender" dataset format?

  2. If one uses the "LLFF" dataset format and the --spherify flag (as recommended in README.md), how can one generate render_poses for a 360° free-viewpoint video such that they lie in the same plane as the original 8 cameras?

    • the render_poses output by spherify_poses don't lie in the same plane as my 8 cameras do, as shown in the visualization of the camera poses below (made with pytransform3d)
[image: pytransform3d visualization of render_poses and the 8 camera poses]
  3. When using the --no_ndc flag, how should we set the near and far bounds?
    • I tried using COLMAP to estimate them, but the AIST++ dataset only has 8 views, so COLMAP doesn't pick up enough keypoints to reconstruct the scene
    • would it be OK to set near and far to a small enough and a large enough value, respectively (e.g. 0. and a value larger than the distance between opposite cameras)? See the sketch after this list.
    • can near and far be understood as the closest and furthest distances along the camera axis (z-axis) at which there is scene content?
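To make the last two bullets concrete, here is a rough sketch of what I mean (not code from this repo; the helper name estimate_bounds and the margin factor are just for illustration, and poses is assumed to be the (N, 3, 5) camera array loaded by load_llff_data):

```python
import numpy as np

def estimate_bounds(poses, margin=1.1):
    """Loose near/far guess: near ~ 0 and far a bit beyond the largest
    camera-to-camera distance, so every ray can reach content seen by the
    opposite cameras. Purely illustrative, not the repo's logic."""
    cam_centers = poses[:, :3, 3]        # (N, 3) camera positions
    dists = np.linalg.norm(cam_centers[:, None] - cam_centers[None], axis=-1)
    near = 0.
    far = margin * dists.max()
    return near, far
```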

Thank you in advance for reading this long post!

bishengderen commented 2 years ago

I also encountered the same problem, and I don't know how to solve it.

gkouros commented 2 years ago

I also had the same issue. My solution was to change the calculation of the up vector in the spherify_poses function: instead of averaging the camera-to-center vectors to find the up vector, I compute the normal of the plane fitted to the camera positions. This works only for poses that lie roughly in a plane, not for poses on a hemisphere as in most 360° scenes. See the images below: the left plots show a scene captured along a hemispherical trajectory and the right ones a scene captured along a circular trajectory. The render_poses are in blue and the transformed camera poses are in green.

With the original up vector calculation: [screenshot]

With the proposed up vector calculation: [screenshot]

I calculated the new up vector as follows (this replaces the camera-to-center averaging in spherify_poses):

# SVD of the centered camera positions (transposed to 3 x N); the left singular
# vector with the smallest singular value is the normal of the best-fit plane.
svd = np.linalg.svd((poses[:, :3, 3] - center).T)
up = svd[0][:, -1]

You can also switch between the two up-vector calculations automatically by checking the standard deviation of the singular values (svd[1]), which is higher for circular captures and lower for hemispherical ones. I found that a threshold of 8 works for the datasets I have tested.
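As a rough sketch of that switch (the function name compute_up_vector and the std_threshold argument are mine, not from the repo; poses and center are the variables already available inside spherify_poses):

```python
import numpy as np

def compute_up_vector(poses, center, std_threshold=8.0):
    """Pick the up vector for spherify_poses: the best-fit plane normal for
    roughly planar (circular) captures, otherwise the original mean
    camera-to-center direction. Sketch only; tune std_threshold per dataset."""
    centered = poses[:, :3, 3] - center      # (N, 3) centered camera positions
    u, s, _ = np.linalg.svd(centered.T)      # s holds the singular values
    if np.std(s) > std_threshold:            # large spread -> circular capture
        up = u[:, -1]                        # plane normal (smallest singular value)
    else:                                    # hemispherical capture
        up = centered.mean(0)                # original calculation in the repo
    return up
```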

I hope that helps. :)

WeiLi-THU commented 2 years ago

These plots are very useful for clarification and visualization; it would be nice if you could give more detail on the plotting method.

gkouros commented 2 years ago

@WeiLi-THU For the visualization, I used a library from GitHub called extrinsic2pyramid.
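If you just want a quick look without that library, here is a minimal matplotlib sketch (not the actual code behind my screenshots above); it assumes poses and render_poses are arrays of camera-to-world matrices with at least 3x4 entries, like those returned by load_llff_data:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_poses(poses, render_poses):
    """Scatter the camera centers and draw their viewing (z-axis) directions:
    input cameras in green, render_poses in blue."""
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    for c2w, color, label in [(np.asarray(poses), 'g', 'cameras'),
                              (np.asarray(render_poses), 'b', 'render_poses')]:
        origins = c2w[:, :3, 3]                  # camera centers
        view_dirs = c2w[:, :3, 2]                # camera z-axes
        ax.scatter(*origins.T, c=color, label=label)
        ax.quiver(*origins.T, *view_dirs.T, length=0.3, color=color)
    ax.legend()
    plt.show()
```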

vanshilshah97 commented 1 year ago

Has anybody solved this issue regarding the near and far bounds?