Closed t2kasa closed 2 years ago
Hi @t2kasa, parse_raw_camera() aims to convert the camera information to an extrinsic camera matrix. parse_cameras_and_bounds() was taken from the data loading function of the original NeRF, so yes, the coordinate system of this function's output would be [right, up, backwards] as you said. The differences are:
diag(1,-1,-1) flips the coordinate system to the conventional form of [right, down, forwards].
Therefore, throughout the codebase we perform all image <--> camera <--> world transformations using the camera projection equation with intrinsics K and extrinsics [R|t]: u = K(Rx+t) (see these camera helper functions).
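As a rough illustration of the two ideas above (the diag(1,-1,-1) flip and the projection u = K(Rx+t)), here is a minimal pure-Python sketch. It is not the code from this repository: the helper names `matvec` and `project`, the intrinsics values, and the sample point are all made up for the example, and the camera is assumed to sit at the world origin with identity rotation.

```python
def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# diag(1, -1, -1): negates the y and z axes, mapping coordinates expressed in
# [right, up, backwards] to coordinates expressed in [right, down, forwards].
FLIP = [[1.0, 0.0, 0.0],
        [0.0, -1.0, 0.0],
        [0.0, 0.0, -1.0]]

def project(K, R, t, x):
    """Project point x with u = K(Rx + t), then divide by depth to get pixels."""
    x_cam = [a + b for a, b in zip(matvec(R, x), t)]
    u = matvec(K, x_cam)
    return [u[0] / u[2], u[1] / u[2]]

# Hypothetical intrinsics: focal length 100 px, principal point (50, 50).
K = [[100.0, 0.0, 50.0],
     [0.0, 100.0, 50.0],
     [0.0, 0.0, 1.0]]
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# A point in [right, up, backwards]: to the right, below the axis, in front.
p_up_back = [0.5, -1.0, -2.0]
# The same point in [right, down, forwards]: depth is now positive.
p_down_fwd = matvec(FLIP, p_up_back)

uv = project(K, IDENTITY, [0.0, 0.0, 0.0], p_down_fwd)
print(p_down_fwd, uv)
```

In the [right, down, forwards] convention a point in front of the camera has positive depth, so the division by `u[2]` behaves as expected, and a point below the optical axis lands below the principal point in the image.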
Hope this helps!
Thank you for your help!
I understand that L106-L107 transform the coordinate system from [right, up, backwards] to [right, down, forwards], and then convert the camera-to-world matrix to a world-to-camera matrix. https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L106-L107
So, does L108 bring the coordinate system back to [right, up, backwards] as a world-to-camera matrix? https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L108
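For readers following along, the generic camera-to-world to world-to-camera conversion mentioned here can be sketched as below. This is only an illustration of inverting a rigid pose [R|t], not a reproduction of what llff.py does on those lines; the helper names (`transpose`, `matvec`, `invert_pose`) and the example pose are assumptions made for the sketch.

```python
def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def invert_pose(R, t):
    """Invert the rigid transform x' = R x + t.

    Because R is a rotation, its inverse is its transpose, so the inverse
    transform is x = R^T x' - R^T t, i.e. the pose (R^T, -R^T t).
    """
    R_inv = transpose(R)
    t_inv = [-v for v in matvec(R_inv, t)]
    return R_inv, t_inv

# Hypothetical camera-to-world pose: 90 degree rotation about z, plus a shift.
R_c2w = [[0.0, -1.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]
t_c2w = [1.0, 2.0, 3.0]

R_w2c, t_w2c = invert_pose(R_c2w, t_c2w)

# The camera center in world coordinates is t_c2w; applying the
# world-to-camera pose should send it to the camera origin.
center_in_cam = [a + b for a, b in zip(matvec(R_w2c, t_c2w), t_w2c)]
print(R_w2c, t_w2c, center_in_cam)
```

A quick sanity check like the last step (the camera center mapping to the origin) is one way to confirm which direction a given pose matrix goes.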
Hi @t2kasa, I may need to dig back into the original NeRF repo to double-check their pose format and how I preprocessed it back then, but I don't really have the cycles for it at this moment. However, what I can tell you is that the output of the parse_raw_camera() function is guaranteed to be in the standard extrinsic matrix format of [right, down, forwards].
Sorry for the confusion!
Thank you.
It is helpful and enough to know that the output of parse_raw_camera() is the extrinsic matrix in the [right, down, forwards] convention. I will try to track the transformations again later.
Thank you again!
Hi, Chen-Hsuan Lin. Thank you for sharing the great work!
I have been reading the code, and I did not fully understand the camera pose transformation performed when the __getitem__ method is called for the LLFF dataset: https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L104
In my understanding, the camera pose returned from parse_cameras_and_bounds is a camera-to-world matrix, and its coordinate system is [right, up, backwards]. https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L42
Then, the camera pose is transformed by parse_raw_camera when calling __getitem__, but I could not follow what the transformation does: https://github.com/chenhsuanlin/bundle-adjusting-NeRF/blob/main/data/llff.py#L104 Could you please let me know?