google / nerfies

This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.
https://nerfies.github.io
Apache License 2.0

Camera coordinate and bds #39

Open dogyoonlee opened 2 years ago

dogyoonlee commented 2 years ago

Hello!

I really appreciate your great work!

However, I have some questions about the camera coordinates in the Nerfies dataset you released.

I want to use a NeRF codebase implemented in the PyTorch framework, which provides an LLFF dataset dataloader.

As far as I know, the LLFF dataset saves camera poses in the [-u, r, -t] format,

and the loader rotates them to [r, u, -t] with the following code:

[screenshot: LLFF pose-loading code]
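
For reference, the standard LLFF loader (load_llff.py in the original NeRF repo) performs that rotation roughly like this; a minimal sketch of what the screenshot showed, with the file path as a placeholder:

```python
import numpy as np

# poses_bounds.npy stores one row per image: a flattened 3x5 pose matrix
# (rotation + translation + [H, W, focal]) followed by two depth bounds.
poses_arr = np.load('poses_bounds.npy')                             # shape (N, 17)
poses = poses_arr[:, :15].reshape([-1, 3, 5]).transpose([1, 2, 0])  # 3 x 5 x N

# LLFF saves rotation columns as [-u, r, -t]; rotate them to [r, u, -t]:
# new col 0 = old col 1 (r), new col 1 = -(old col 0) (u), col 2 stays (-t).
poses = np.concatenate(
    [poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], axis=1)
```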

Since I want to use your dataset (Nerfies) and directory format, I loaded the saved poses from the camera directories:

[screenshot: Nerfies pose-loading code]
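
Roughly, my loading code looks like the following. This is a sketch; I'm assuming the 'orientation' and 'position' fields of the per-frame camera JSON hold the world-to-camera rotation and the camera center, as in the Nerfies camera class:

```python
import json
import numpy as np

def load_nerfies_camera(path):
    """Load one Nerfies camera JSON into a 3x4 camera-to-world matrix.

    Assumes 'orientation' is the 3x3 world-to-camera rotation and
    'position' is the camera center in world coordinates.
    """
    with open(path) as f:
        cam = json.load(f)
    orientation = np.asarray(cam['orientation'])   # 3x3 world-to-camera
    position = np.asarray(cam['position'])         # camera center in world
    c2w = np.concatenate([orientation.T, position[:, None]], axis=1)
    return c2w, cam
```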

As far as I know, you use the OpenCV camera model, whose coordinate axes are [r, -u, t].

So after loading the saved poses from your dataset, I rotate them with the following code to convert them to the LLFF on-disk format [-u, r, -t]. (I load the poses into a 3 x 5 x frame_number matrix, where each 3x5 entry consists of the 3x4 pose matrix plus a 3x1 column of height, width, and focal length, as in the LLFF format.)

[screenshot: OpenCV-to-LLFF pose conversion code]
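
Concretely, the column shuffle I apply per pose is something like this (the function name is mine):

```python
import numpy as np

def opencv_to_llff_saved(c2w):
    """Permute a 3x4 camera-to-world pose from OpenCV axes [r, -u, t]
    to the LLFF on-disk convention [-u, r, -t]:
    new col 0 = old col 1 (-u), new col 1 = old col 0 (r),
    new col 2 = -(old col 2) (-t); the camera center (col 3) is unchanged.
    """
    return np.concatenate(
        [c2w[:, 1:2], c2w[:, 0:1], -c2w[:, 2:3], c2w[:, 3:4]], axis=1)
```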

Finally, I generate rays to train the network, applying the undistortion process from the Nerfies code, and I think that part works,

but training doesn't go well.

I suspect the loaded camera poses are wrong, because the network trains well when I use the broom dataset provided in LLFF format on the NSFF authors' GitHub (though they provide only training data, no validation set).

I really wonder what the problem is when loading the cameras.

Is the saved camera coordinate system [r, -u, t] correct?

If it is, what else should I do to get correct camera poses?

I also applied the scene_scale, scene_center, principal_point, and undistortion as in the Nerfies code.
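
For concreteness, this is roughly what I mean by applying scene_center and scene_scale; I'm assuming the 'center' and 'scale' fields of scene.json, and the position value is a placeholder:

```python
import json
import numpy as np

with open('scene.json') as f:              # per-scene metadata file
    scene = json.load(f)
scene_center = np.asarray(scene['center'])
scene_scale = scene['scale']

# Re-center and re-scale a camera position (my understanding of the
# Nerfies convention; please correct me if this is wrong).
position = np.asarray([0.1, 0.2, 1.5])     # placeholder camera center
position = (position - scene_center) * scene_scale
```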

It would be really helpful if you could answer these questions!

Thank you.

keunhong commented 2 years ago

Hi Dogyoon.

  1. We don't use NDC coordinates as NeRF does for some of the LLFF scenes. We use normal Euclidean coordinates, in which case the specific coordinate convention doesn't matter as long as it's consistent within the dataset. That being said, we use the OpenCV convention.
  2. NSFF and the original NeRF paper pre-undistort the images with COLMAP. We instead do the undistortion on the fly with our camera class (see the sketch below for the general idea). This has some benefits, one of which is that if you pre-undistort you have to throw away some of the edge pixels.
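
Roughly, "undistortion on the fly" means inverting the distortion model per pixel when generating rays. Here is a radial-only fixed-point sketch; the real camera class also handles tangential distortion and uses a Newton solve, so treat this as illustrative only:

```python
import numpy as np

def undistort_radial(xd, yd, k1, k2, k3, iters=10):
    """Invert OpenCV-style radial distortion by fixed-point iteration.

    Given distorted normalized coordinates (xd, yd), solve for (x, y)
    such that (x, y) * (1 + k1*r^2 + k2*r^4 + k3*r^6) = (xd, yd),
    then cast the ray through the undistorted coordinates.
    """
    x, y = xd, yd
    for _ in range(iters):
        r2 = x * x + y * y
        d = 1.0 + r2 * (k1 + r2 * (k2 + r2 * k3))
        x, y = xd / d, yd / d
    return x, y
```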

If it's not training well there might be a bug in how the rays are generated in your code. I'd create a simple notebook and see if the generated rays are consistent between the JAX version and your PyTorch version.
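
For example, a quick check along those lines (file names and array keys are hypothetical):

```python
import numpy as np

# Compare ray origins/directions dumped from the JAX reference code and
# the PyTorch port for the same camera and pixel grid.
jax_rays = np.load('rays_jax.npz')      # expects 'origins' and 'directions'
torch_rays = np.load('rays_torch.npz')

for key in ('origins', 'directions'):
    ok = np.allclose(jax_rays[key], torch_rays[key], atol=1e-5)
    print(key, 'match:', ok)
    if not ok:
        print('  max abs diff:', np.abs(jax_rays[key] - torch_rays[key]).max())
```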

You could also work around it by pre-undistorting the datasets with COLMAP and setting the distortion to zero, but you'd have to write some code to load in the cameras from the other formats.
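
A minimal sketch of that pre-undistortion step with OpenCV (the intrinsics, coefficients, and file names are placeholders; in practice you'd fill K and dist from the camera JSON):

```python
import cv2
import numpy as np

# Placeholder intrinsics: focal length and principal point.
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
dist = np.array([0.01, -0.005, 0.0, 0.0, 0.0])   # k1, k2, p1, p2, k3

# Undistort one frame, then treat the camera as distortion-free.
img = cv2.imread('frame_0001.png')               # hypothetical frame
undistorted = cv2.undistort(img, K, dist)
cv2.imwrite('frame_0001_undistorted.png', undistorted)
# After this, set the distortion coefficients to zero in the camera model.
```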