nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0
9.45k stars 1.29k forks source link

Why is the world coord system rotated in colmap_utils.py? #1504

Open jkulhanek opened 1 year ago

jkulhanek commented 1 year ago

What is the reason the world coord system is rotated here? https://github.com/nerfstudio-project/nerfstudio/blob/079f419e544b8d2aa6aad6f1a31decf0e06cb88c/nerfstudio/process_data/colmap_utils.py#L620-L621

Could we keep the coordinates as they are in COLMAP?

pwais commented 1 year ago

COLMAP uses OpenCV camera conventions, but nerfstudio (and many other nerf impls, maybe all of them?) use OpenGL coordinate system. The code here is a bit obfuscated but it provides a precise translation.

FMI see discussions:

tancik commented 1 year ago

More info here - https://docs.nerf.studio/en/latest/quickstart/data_conventions.html#coordinate-conventions

jkulhanek commented 1 year ago

The two lines don’t change the camera coordinate system though, there is a line above the two which flips z,y to the from opencv to opengl. I thought that in colmap the orientation of world coordinate system is more or less arbitrary. Even when you run something like model_aligner the orientation is not defined. The orientation is overridden in the dataparser anyway, so it doesn’t really matter. I just thought it would be good to have a way to preserve the poses from colmap (when disabling the transformation code in the dataparser).

to the best of my knowledge, in multinerf (first referenced link) they don’t do it. They just switch from opencv to opengl here https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/datasets.py#L109

pwais commented 1 year ago

you're right, the cited code in nerfstudio is not equivalent to the cited multinerf code.. at least not the precise cited lines. The cited conversion appears to do a handedness change but then also flips z:

array([[ 0, 1,  0,  0],
       [ 1, 0,  0,  0],
       [ 0, 0, -1,  0],
       [ 0, 0,  0,  1]])

(note I'm going to ignore re-scale / re-center)

So, COLMAP is right-handed, OpenGL camera model is right-handed, but OpenGL NDC is left-handed, no? np.diag([1, -1, -1, 1]) should account for the change in camera frames, but the nerfstudio code is doing a handedness too? this is very confusing because the viewer world frame indicator is definitely right-handed, at least assuming the RGB axes all indicate positive xyz

I believe the original LLFF code used in original nerf does a handedness change as well:

Note that the original nerf code also had a special "NDC" mode for forward-facing scenes, that's a little different than what we're discussing here I think:

And instant-ngp has it too:

SDFStudio in contrast flips the camera frame back to OpenCV:

One way to break all this down is into three frames: world, camera (logical), and image plane ("sensor frame" or "screen frame"). If expected ray termination is supposed to be positive, and the screen / "sensor" frame has +x right and +y up, then that's a left-handed system, while the world and camera frames are right-handed.

So I think what we're seeing is that the Nerf model code is "leaking the abstraction of the screen space" into the data encoding? In multinerf it seems the screen space stuff is absorbed into the ray casting step: https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/camera_utils.py#L646

Now I'm curious if the camera paths that can be specified in the viewer for rendering... are those poses in a right-handed or left-handed system? ha

jkulhanek commented 1 year ago

The cited code does not change the handedness. It only rotates the world coordinate system, right?

pwais commented 1 year ago

but in opengl camera, -z is forward / look at. yet cast rays have positive lengths. So isn’t that transform a handedness change, then a flip of z?

if you delete the code, can you get it to train?

On Tue, Feb 28, 2023 at 02:22 Jonáš Kulhánek @.***> wrote:

The cited code does not change the handedness. It only rotates the world coordinate system, right? I believe we can safely remove it without any issues?

— Reply to this email directly, view it on GitHub https://github.com/nerfstudio-project/nerfstudio/issues/1504#issuecomment-1447925156, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABVIIPNTMK3RYNMTYTZIY3WZXGXFANCNFSM6AAAAAAVIAJRME . You are receiving this because you commented.Message ID: @.***>

jkulhanek commented 1 year ago

If I delete the code, it trains fine. From what I understand, the only problem would be with forward-facing scenes?

ShkarupaDC commented 1 year ago

@pwais, hi! I am sorry for the stupid question, but could you explain again:

  1. Why do we need this conversion after the OpenCV -> OpenGL coordinate system conversion?
  2. What problem with forward-facing scenes does it solve, and why?
qq456cvb commented 1 year ago

I don't think there is a handedness change in these two lines, the first line changes the handedness and the next line changes it back. You can also verify this by calculating the determinant of the matrix array([[ 0, 1, 0, 0], [ 1, 0, 0, 0], [ 0, 0, -1, 0], [ 0, 0, 0, 1]]) (det=1 for sure) So it is merely a rotation of world frame, which I think has no influence on training.

panchagil commented 1 year ago

I've faced the same problem since I wanted to visualize the camera positions along with colmap point-cloud.

The difference between COLMAP and Nerfstudio format is just on the sign.

Nerfstudio: +X is right, +Y is up, and +Z is pointing back and away from the camera. -Z is the look-at direction.

Colmap: The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (ref: colmap-doc)

I'm using this code instead

c2w = np.linalg.inv(w2c)
c2w[0:3, 1:3] *= -1
# that is it!
jb-ye commented 9 months ago

@jkulhanek If you want to keep the world coordinate, this is the option

--assume_colmap_world_coordinate_convention=False \
--orientation_method=none \
--center_method=none \
--auto-scale-poses=False \

which expects this behavior in colmap parser.

pwais commented 7 months ago

as i originally instigated this issue, i vote that https://github.com/nerfstudio-project/nerfstudio/pull/2793 effectively closes this one. @jkulhanek any thoughts?

pwais commented 7 months ago

Actually I think @jb-ye https://github.com/nerfstudio-project/nerfstudio/pull/2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163

BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388

That function above will only get the auto-orient-and-scale transform I think, eh?

jb-ye commented 7 months ago

Actually I think @jb-ye #2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163

BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388

That function above will only get the auto-orient-and-scale transform I think, eh?

This is the intended behavior. When the flag is turned on, an additional transform will be applied to world coordinates. Camera poses will be converted to OpenGL convention regardless of this flag.

pwais commented 7 months ago

@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of https://github.com/nerfstudio-project/nerfstudio/pull/2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

jb-ye commented 7 months ago

@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of #2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

just to clarify, the additional transform enabled by assume_colmap_world_coordinate_convention is always there from the very beginning. My PR just wrap it around a flag so we can optional turn it off.

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file. By default, nerfstudio will first apply a fixed transform to colmap world before re-orienting it using orientation method. The only way to disable this behavior is set both option to false/none.

The 3d point clouds are transformed in the same way as we did for cameras regardless whether those options are on/off. If you believe they are inconsistent, could you tell a bit more details about why you think so?

pwais commented 7 months ago

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file

Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

jb-ye commented 7 months ago

Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file

Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

unfortunately the splatfacto model can not be exported to a different world frame because one can not rotate the spherical harmonics parameters. See https://github.com/nerfstudio-project/nerfstudio/pull/2951 for similar discussion.

In other words, if one wants to recover the model in original world coordinate, one has to choose between (1) disable any nerfstudio specific transforms during loading data or (2) export a metadata file along side ply exports that recording those transforms.

pwais commented 7 months ago

@jb-ye The current issue with frames etc (and especially the really poor naming e.g. "transformation_matrix" that hides the actual frame name) has lead to a lot of confusion on Discord and I've seen is a top reason why people just use the original Inria impl. Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer, really it should be the viewer that has an option to find "logical gravity vector" and then nobody has to worry about e.g. spherical harmonics being "in the wrong frame."

jkulhanek commented 7 months ago

Spherical harmonics can easily be rotated using Winger matrices. I have a np code somewhere if you want it

pfxuan commented 7 months ago

PlayCavas's implementation seems working great:

jb-ye commented 7 months ago

@pwais Thanks for the explanation. Just want to summarize your proposal and confirm my understanding:

(1) You want to remove assume_colmap_world_coordinate_convention flag and somehow put this transform in viewer. As such, when a user specify orientation_method=none, the exported GS is truly the original input world coordinates. The current implementation requires the user to additionally specify assume_colmap_world_coordinate_convention=False when parsing colmap data. (2) The default setting of nerfstudio is orient, center, and scale the scene (The current splatfacto requires the input scene to be scaled to 1 in order to work properly). As such, when exporting GS ply in the original world coordinate, one has to figure out a way to rotate spherical harmonics (e.g. one suggested by @jkulhanek ) to respect the original world coordinate.

In my opinion, if we implemented feature (2), (1) becomes irrelevant anyway. We may prioritize https://github.com/nerfstudio-project/nerfstudio/pull/2951 .

jb-ye commented 7 months ago

Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer

Not entirely, nerfstudio doc suggested that the world coordinate's Up direction is +Z, while it is often not the case for a colmap project (where Up direction is -Y). But I agree this is really a convenience hack for ns-viewer. We can completely remove the assume_colmap_world_coordinate_convention and have options in ns-viewer to select gravity direction.