Open jkulhanek opened 1 year ago
COLMAP uses OpenCV camera conventions, but nerfstudio (and many other nerf impls, maybe all of them?) use OpenGL coordinate system. The code here is a bit obfuscated but it provides a precise translation.
FMI see discussions:
The two lines don’t change the camera coordinate system though, there is a line above the two which flips z,y to the from opencv to opengl. I thought that in colmap the orientation of world coordinate system is more or less arbitrary. Even when you run something like model_aligner the orientation is not defined. The orientation is overridden in the dataparser anyway, so it doesn’t really matter. I just thought it would be good to have a way to preserve the poses from colmap (when disabling the transformation code in the dataparser).
to the best of my knowledge, in multinerf (first referenced link) they don’t do it. They just switch from opencv to opengl here https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/datasets.py#L109
you're right, the cited code in nerfstudio is not equivalent to the cited multinerf code.. at least not the precise cited lines. The cited conversion appears to do a handedness change but then also flips z:
array([[ 0, 1, 0, 0],
[ 1, 0, 0, 0],
[ 0, 0, -1, 0],
[ 0, 0, 0, 1]])
(note I'm going to ignore re-scale / re-center)
So, COLMAP is right-handed, OpenGL camera model is right-handed, but OpenGL NDC is left-handed, no? np.diag([1, -1, -1, 1])
should account for the change in camera frames, but the nerfstudio code is doing a handedness too? this is very confusing because the viewer world frame indicator is definitely right-handed, at least assuming the RGB axes all indicate positive xyz
I believe the original LLFF code used in original nerf does a handedness change as well:
Note that the original nerf code also had a special "NDC" mode for forward-facing scenes, that's a little different than what we're discussing here I think:
And instant-ngp has it too:
SDFStudio in contrast flips the camera frame back to OpenCV:
One way to break all this down is into three frames: world, camera (logical), and image plane ("sensor frame" or "screen frame"). If expected ray termination is supposed to be positive, and the screen / "sensor" frame has +x right and +y up, then that's a left-handed system, while the world and camera frames are right-handed.
So I think what we're seeing is that the Nerf model code is "leaking the abstraction of the screen space" into the data encoding? In multinerf it seems the screen space stuff is absorbed into the ray casting step: https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/camera_utils.py#L646
Now I'm curious if the camera paths that can be specified in the viewer for rendering... are those poses in a right-handed or left-handed system? ha
The cited code does not change the handedness. It only rotates the world coordinate system, right?
but in opengl camera, -z is forward / look at. yet cast rays have positive lengths. So isn’t that transform a handedness change, then a flip of z?
if you delete the code, can you get it to train?
On Tue, Feb 28, 2023 at 02:22 Jonáš Kulhánek @.***> wrote:
The cited code does not change the handedness. It only rotates the world coordinate system, right? I believe we can safely remove it without any issues?
— Reply to this email directly, view it on GitHub https://github.com/nerfstudio-project/nerfstudio/issues/1504#issuecomment-1447925156, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABVIIPNTMK3RYNMTYTZIY3WZXGXFANCNFSM6AAAAAAVIAJRME . You are receiving this because you commented.Message ID: @.***>
If I delete the code, it trains fine. From what I understand, the only problem would be with forward-facing scenes?
@pwais, hi! I am sorry for the stupid question, but could you explain again:
I don't think there is a handedness change in these two lines, the first line changes the handedness and the next line changes it back. You can also verify this by calculating the determinant of the matrix
array([[ 0, 1, 0, 0], [ 1, 0, 0, 0], [ 0, 0, -1, 0], [ 0, 0, 0, 1]])
(det=1 for sure)
So it is merely a rotation of world frame, which I think has no influence on training.
I've faced the same problem since I wanted to visualize the camera positions along with colmap point-cloud.
The difference between COLMAP and Nerfstudio format is just on the sign.
Nerfstudio: +X is right, +Y is up, and +Z is pointing back and away from the camera. -Z is the look-at direction.
Colmap: The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (ref: colmap-doc)
I'm using this code instead
c2w = np.linalg.inv(w2c)
c2w[0:3, 1:3] *= -1
# that is it!
@jkulhanek If you want to keep the world coordinate, this is the option
--assume_colmap_world_coordinate_convention=False \
--orientation_method=none \
--center_method=none \
--auto-scale-poses=False \
which expects this behavior in colmap parser.
as i originally instigated this issue, i vote that https://github.com/nerfstudio-project/nerfstudio/pull/2793 effectively closes this one. @jkulhanek any thoughts?
Actually I think @jb-ye https://github.com/nerfstudio-project/nerfstudio/pull/2793 might have a bug? when assume_colmap_world_coordinate_convention
is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163
BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388
That function above will only get the auto-orient-and-scale transform I think, eh?
Actually I think @jb-ye #2793 might have a bug? when
assume_colmap_world_coordinate_convention
is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388
That function above will only get the auto-orient-and-scale transform I think, eh?
This is the intended behavior. When the flag is turned on, an additional transform will be applied to world coordinates. Camera poses will be converted to OpenGL convention regardless of this flag.
@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False
is for despite the comment in the code. Even re-reading the summary of https://github.com/nerfstudio-project/nerfstudio/pull/2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False
and orientation_method=none
and some others were confused too.
@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what
assume_colmap_world_coordinate_convention=False
is for despite the comment in the code. Even re-reading the summary of #2793 I don't understand why / when to useassume_colmap_world_coordinate_convention=False
andorientation_method=none
and some others were confused too.
just to clarify, the additional transform enabled by assume_colmap_world_coordinate_convention is always there from the very beginning. My PR just wrap it around a flag so we can optional turn it off.
Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file. By default, nerfstudio will first apply a fixed transform to colmap world before re-orienting it using orientation method. The only way to disable this behavior is set both option to false/none.
The 3d point clouds are transformed in the same way as we did for cameras regardless whether those options are on/off. If you believe they are inconsistent, could you tell a bit more details about why you think so?
Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file
Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention
on input doesn't suggest anything about exporting ...
Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file
Maybe the flag and functionality should just move moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name
assume_colmap_world_coordinate_convention
on input doesn't suggest anything about exporting ...
unfortunately the splatfacto model can not be exported to a different world frame because one can not rotate the spherical harmonics parameters. See https://github.com/nerfstudio-project/nerfstudio/pull/2951 for similar discussion.
In other words, if one wants to recover the model in original world coordinate, one has to choose between (1) disable any nerfstudio specific transforms during loading data or (2) export a metadata file along side ply exports that recording those transforms.
@jb-ye The current issue with frames etc (and especially the really poor naming e.g. "transformation_matrix" that hides the actual frame name) has lead to a lot of confusion on Discord and I've seen is a top reason why people just use the original Inria impl. Based on this discussion it seems assume_colmap_world_coordinate_convention
is indeed just for the viewer, really it should be the viewer that has an option to find "logical gravity vector" and then nobody has to worry about e.g. spherical harmonics being "in the wrong frame."
Spherical harmonics can easily be rotated using Winger matrices. I have a np code somewhere if you want it
PlayCavas's implementation seems working great:
@pwais Thanks for the explanation. Just want to summarize your proposal and confirm my understanding:
(1) You want to remove assume_colmap_world_coordinate_convention
flag and somehow put this transform in viewer. As such, when a user specify orientation_method=none, the exported GS is truly the original input world coordinates. The current implementation requires the user to additionally specify assume_colmap_world_coordinate_convention=False
when parsing colmap data.
(2) The default setting of nerfstudio is orient, center, and scale the scene (The current splatfacto requires the input scene to be scaled to 1 in order to work properly). As such, when exporting GS ply in the original world coordinate, one has to figure out a way to rotate spherical harmonics (e.g. one suggested by @jkulhanek ) to respect the original world coordinate.
In my opinion, if we implemented feature (2), (1) becomes irrelevant anyway. We may prioritize https://github.com/nerfstudio-project/nerfstudio/pull/2951 .
Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer
Not entirely, nerfstudio doc suggested that the world coordinate's Up direction is +Z, while it is often not the case for a colmap project (where Up direction is -Y). But I agree this is really a convenience hack for ns-viewer. We can completely remove the assume_colmap_world_coordinate_convention
and have options in ns-viewer to select gravity direction.
What is the reason the world coord system is rotated here? https://github.com/nerfstudio-project/nerfstudio/blob/079f419e544b8d2aa6aad6f1a31decf0e06cb88c/nerfstudio/process_data/colmap_utils.py#L620-L621
Could we keep the coordinates as they are in COLMAP?