google-research / multinerf

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
Apache License 2.0

Question about flipping coordinate system #28

Closed raywzy closed 1 year ago

raywzy commented 1 year ago

In https://github.com/google-research/multinerf/blob/247fe77c2933edc70d8a222ca10808413b387797/internal/camera_utils.py#L217, after aligning the coordinate system with the principal components of the poses, there is a flipping operation. What is the motivation for this step? I'm quite confused about it.

bmild commented 1 year ago

This is a bit of a subtle point. Running PCA on the camera positions will align the two largest principal component directions with the world space x and y axes -- typically, this will make the xy plane parallel to the ground, if you are walking around capturing pictures (since the variation in camera height will be much less than the variation resulting from you walking).

So ideally, we end up with the world xy plane as the ground and the z axis perpendicular to that. But of course, PCA only determines each axis up to sign, so the z axis might be pointing either straight up or straight down -- the point of this flip is to catch the case where z ends up pointing down and correct it by negating the world y and z axes, essentially rotating the world by 180 degrees around the x axis.

The reason we check the z component of the camera pose's y axis in particular is that the camera coordinate convention in this code has the y axis of a camera rotation matrix pointing up in the image plane, so that's the camera rotation vector we want aligned with positive world space z.
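
For illustration, here is a minimal NumPy sketch of the steps described above. It is not the repository's code: the function name `align_poses_pca_sketch` is hypothetical, and it assumes `poses` is an `(N, 3, 4)` array of camera-to-world matrices `[R | t]` in which the second column of `R` is the camera's up (y) axis.

```python
import numpy as np


def align_poses_pca_sketch(poses):
    """Hypothetical helper: PCA-align camera poses, then make world z point up.

    Assumes poses has shape (N, 3, 4), each entry a camera-to-world [R | t].
    """
    t = poses[:, :3, 3]
    t_mean = t.mean(axis=0)
    t_centered = t - t_mean

    # Eigenvectors of the 3x3 scatter matrix are the principal directions.
    eigval, eigvec = np.linalg.eigh(t_centered.T @ t_centered)
    order = np.argsort(eigval)[::-1]  # largest variance first
    rot = eigvec[:, order].T          # rows become the new x, y, z axes

    # Keep a proper rotation (det = +1) rather than a reflection.
    if np.linalg.det(rot) < 0:
        rot = np.diag([1.0, 1.0, -1.0]) @ rot

    # Apply the recentering and rotation to every pose.
    transform = np.concatenate([rot, rot @ -t_mean[:, None]], axis=-1)  # (3, 4)
    bottom = np.broadcast_to(np.array([0.0, 0.0, 0.0, 1.0]),
                             (poses.shape[0], 1, 4))
    poses_h = np.concatenate([poses, bottom], axis=1)  # (N, 4, 4)
    aligned = transform @ poses_h                      # (N, 3, 4)

    # aligned[:, 2, 1] is the world-z component of each camera's y (up) axis.
    # If it is negative on average, z points down: negate world y and z,
    # which is a 180 degree rotation around the world x axis.
    if aligned[:, 2, 1].mean() < 0:
        aligned = np.diag([1.0, -1.0, -1.0]) @ aligned
    return aligned
```

Note that `np.diag([1, -1, -1])` has determinant +1, so the flip is a rotation and does not change the handedness of the coordinate system; negating z alone would be a reflection.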

raywzy commented 1 year ago

@bmild Thanks for your comprehensive explanation!