google-research / sparf

This is the official code release for SPARF: Neural Radiance Fields from Sparse and Noisy Poses [CVPR 2023-Highlight]
https://prunetruong.com/sparf.github.io/
Apache License 2.0

Question regarding the LLFF data preprocessing #2

Closed xiexh20 closed 1 year ago

xiexh20 commented 1 year ago

Hi,

Thanks a lot for the amazing work and for releasing the code.

I am a bit confused about the data preprocessing steps you applied to the LLFF data.

  1. If I understand correctly, the code snippet below swaps the camera pose axes to change from [down right back] to [right up back] (see the reference in the NeRF repo). However, why is the last axis swapped rather than the second axis, as in the original code? https://github.com/google-research/sparf/blob/3fd450eae888724d88da89f0b4bf220529a48ad6/source/datasets/llff.py#L95-L98

  2. Why is a pose flip applied to the raw camera-to-world pose? https://github.com/google-research/sparf/blob/3fd450eae888724d88da89f0b4bf220529a48ad6/source/datasets/llff.py#L182-L187
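For reference, the axis swap in item 1 can be sketched as below. This is only a minimal illustration, assuming the poses are stored as [N, 3, 4] here and as [3, 4, N] in the NeRF-style loader; under that assumption, "last axis" and "second axis" both index the columns of the pose matrix, so the two swaps would be the same operation. The function names and the random test data are hypothetical and not taken from either repo.

```python
import numpy as np


def swap_columns_n34(poses):
    """Hypothetical helper. poses: [N, 3, 4] camera-to-world matrices whose
    columns are [down, right, back, translation].
    Returns columns [right, up, back, translation]."""
    return np.concatenate(
        [poses[..., 1:2], -poses[..., 0:1], poses[..., 2:]], axis=-1
    )


def swap_columns_34n(poses):
    """Hypothetical helper. Same swap for a [3, 4, N] layout, where the
    columns live on axis 1 instead of the last axis."""
    return np.concatenate(
        [poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], axis=1
    )


# Made-up poses, only used to compare the two layouts.
rng = np.random.default_rng(0)
poses_n34 = rng.normal(size=(5, 3, 4))
poses_34n = np.transpose(poses_n34, (1, 2, 0))  # [3, 4, N]

out_a = swap_columns_n34(poses_n34)
# Bring the [3, 4, N] result back to [N, 3, 4] for comparison.
out_b = np.transpose(swap_columns_34n(poses_34n), (2, 0, 1))

same = np.allclose(out_a, out_b)
```

If the assumed layouts are right, `same` is true, and the only difference between the two snippets is which array axis holds the matrix columns.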

Thank you so much for your time and explanation!

PruneTruong commented 1 year ago

Hi,

For the LLFF dataset, I borrowed the dataloader used in BARF. I don't have all the details in mind anymore, but I think they flip all the poses so that they face the +z direction, whereas in the original dataset they face the -z direction. This is indicated in https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/camera.py#L254. I believe that explains why there is an additional flip in the dataloader. I also remember comparing against other dataloaders, and the poses were flipped so that they faced +z instead of -z.

The output of parse_raw_camera should be in W2C OpenCV format (like in BARF). Hope that helps.
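To make the target convention concrete, here is a rough sketch of a generic conversion from a camera-to-world pose with [right up back] axes (camera looking along -z) to a world-to-camera pose in OpenCV convention (camera looking along +z, y pointing down). It is not a copy of parse_raw_camera; the function name `c2w_opengl_to_w2c_opencv` and the example pose are assumptions made for illustration.

```python
import numpy as np


def c2w_opengl_to_w2c_opencv(c2w):
    """Hypothetical helper. c2w: [3, 4] or [4, 4] camera-to-world pose with
    rotation columns [right, up, back]. Returns a [3, 4] world-to-camera
    pose with camera axes [right, down, forward]."""
    R = c2w[:3, :3]
    t = c2w[:3, 3]
    # Negate the y and z camera axes: up -> down, back -> forward.
    flip = np.diag([1.0, -1.0, -1.0])
    R_c2w_cv = R @ flip
    # Invert the rigid transform: R^T and -R^T t.
    R_w2c = R_c2w_cv.T
    t_w2c = -R_w2c @ t
    return np.concatenate([R_w2c, t_w2c[:, None]], axis=1)


# Made-up example: camera at (0, 0, 2), axis-aligned, looking along world -z.
c2w = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
])
w2c = c2w_opengl_to_w2c_opencv(c2w)

# The world origin lies 2 units in front of this camera.
point_world = np.array([0.0, 0.0, 0.0])
point_cam = w2c[:, :3] @ point_world + w2c[:, 3]
```

In this example the world origin ends up at a positive z in camera coordinates, which is what "facing +z" means for a W2C OpenCV pose.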

xiexh20 commented 1 year ago

Thank you so much for your explanation. That helps a lot! Now I understand that the key point is that the output of parse_raw_camera is the W2C pose in OpenCV format.