Open initialneil opened 4 months ago
I find this part weird too. It sorts the columns yet takes a row as the normal. But I still can't understand your modification. I think you should sort the columns instead of the rows. According to linear algebra, the columns are the eigenvectors, which represent the axis directions after the rotation. So R_sorted[:,0,:] should be changed to R_sorted[:,:,0], as you first mentioned. GaussianPro also takes the column, which matches my understanding:
rotations_mat = build_rotation(rotations)
scales = pc.get_scaling
min_scales = torch.argmin(scales, dim=1)
indices = torch.arange(min_scales.shape[0])
normal = rotations_mat[indices, :, min_scales]
Is there something wrong with my understanding?
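The claim that the columns of R are the eigenvectors of the covariance can be checked numerically. The following is a minimal sketch in NumPy, not the repository's code: quat_to_rotmat is a hypothetical stand-in that is assumed to mirror build_rotation (standard (w, x, y, z) quaternion formula), and the rotations and scales are random test data.

```python
import numpy as np

def quat_to_rotmat(q):
    """Batch of (w, x, y, z) quaternions -> (N, 3, 3) rotation matrices.
    Hypothetical stand-in, assumed to follow the same convention as build_rotation."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q[:, 0], q[:, 1], q[:, 2], q[:, 3]
    R = np.empty((q.shape[0], 3, 3))
    R[:, 0, 0] = 1 - 2 * (y * y + z * z)
    R[:, 0, 1] = 2 * (x * y - w * z)
    R[:, 0, 2] = 2 * (x * z + w * y)
    R[:, 1, 0] = 2 * (x * y + w * z)
    R[:, 1, 1] = 1 - 2 * (x * x + z * z)
    R[:, 1, 2] = 2 * (y * z - w * x)
    R[:, 2, 0] = 2 * (x * z - w * y)
    R[:, 2, 1] = 2 * (y * z + w * x)
    R[:, 2, 2] = 1 - 2 * (x * x + y * y)
    return R

rng = np.random.default_rng(0)
N = 5
rotations_mat = quat_to_rotmat(rng.normal(size=(N, 4)))
scales = rng.uniform(0.1, 2.0, size=(N, 3))

# Covariance Sigma = R S S^T R^T with S = diag(scales).
M = rotations_mat * scales[:, None, :]          # same as R @ diag(scales)
Sigma = M @ M.transpose(0, 2, 1)

min_scales = np.argmin(scales, axis=1)
indices = np.arange(N)
normal = rotations_mat[indices, :, min_scales]    # one column of R per Gaussian
row_pick = rotations_mat[indices, min_scales, :]  # one row of R per Gaussian

# An eigenvector v with eigenvalue s_min^2 satisfies Sigma v - s_min^2 v = 0.
lam_min = scales[indices, min_scales] ** 2
col_residual = np.einsum("nij,nj->ni", Sigma, normal) - lam_min[:, None] * normal
row_residual = np.einsum("nij,nj->ni", Sigma, row_pick) - lam_min[:, None] * row_pick

print("max |residual| for columns:", np.abs(col_residual).max())
print("max |residual| for rows:   ", np.abs(row_residual).max())
```

Under these assumptions the column residual vanishes up to floating-point error, while the row residual does not.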
@nyy618 I had a case where I had to warp these R matrices by motion, and I came to the conclusion that these R matrices are w2c rotations for the gauss, i.e. they transform from world coordinates to the gauss' local coordinates. For w2c rotations, the rows are the axis vectors viewed in world coordinates.
My case is like this: applying a rotation R' to the model gives R <- R * inv(R'). I tried selecting columns, and couldn't make it work.
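The update rule R <- R * inv(R') can be related to the column convention with a small NumPy sketch. It assumes two hypothetical representations of the same Gaussian: L (local-to-world, columns are the axes) and W = L^T (world-to-local, rows are the axes). The angles are arbitrary test values, and nothing here says which convention the repository actually stores.

```python
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

L = rot_z(0.7) @ rot_x(-0.4)    # local-to-world: columns are the axes in world coords
W = L.T                         # world-to-local: rows are the same axes
M = rot_x(1.1) @ rot_z(0.3)     # the motion R' applied to the model
S2 = np.diag([1.5, 0.8, 0.05]) ** 2   # squared scales (hypothetical values)

L_new = M @ L                   # column convention: the axes rotate with the model
W_new = W @ np.linalg.inv(M)    # w2c convention: R <- R * inv(R')

# Both updates describe the same rotated Gaussian.
print(np.allclose(W_new, L_new.T))

# The covariance transforms as Sigma' = M Sigma M^T in either case.
Sigma = L @ S2 @ L.T
Sigma_new = L_new @ S2 @ L_new.T
print(np.allclose(Sigma_new, M @ Sigma @ M.T))
```

So the rule R <- R * inv(R') is what one expects for a world-to-local matrix, and it is equivalent to left-multiplying the local-to-world matrix by R'. The open question in the thread is only which of the two the stored R is.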
@initialneil Thank you for the clarification. I think the key point is the difference between rotating the world coordinates and rotating the Gaussian within the world coordinates. Since R is orthogonal, its inverse equals its transpose. R is the rotation of the Gaussian in world coordinates, so the columns of R express the axes of the Gaussian ellipsoid in world coordinates. However, if you want to transform from world coordinates to the Gaussian's local coordinates, you have to apply the inverse of R, namely its transpose. In theory, the transition matrix from the world basis to the Gaussian basis is R, which expresses the Gaussian basis in terms of the world basis. Let the world basis be e and the Gaussian basis be e', so that e' = e R.
If you want to represent the world basis in terms of the Gaussian basis, you have to apply the inverse of R. BTW, the getWorld2View2 function also takes the transpose of Camera.R as the rotation of the w2c matrix.
def getWorld2View2(R, t, translate=np.array([.0, .0, .0]), scale=1.0):
    Rt = np.zeros((4, 4))
    Rt[:3, :3] = R.transpose()
    Rt[:3, 3] = t
    Rt[3, 3] = 1.0
    C2W = np.linalg.inv(Rt)
    cam_center = C2W[:3, 3]
    cam_center = (cam_center + translate) * scale
    C2W[:3, 3] = cam_center
    Rt = np.linalg.inv(C2W)
    return np.float32(Rt)
Still, I am not sure about my conclusion, so I will ask others for help. I hope you can point out any misunderstanding.
@nyy618 In the definition of camera projection, R is w2c: P_cam = R * P_world + t. So the original Camera.R should be w2c. But because of the use of glm in the CUDA code, the author of GS specifically stored the camera's R transposed:
https://github.com/graphdeco-inria/gaussian-splatting/blob/472689c0dc70417448fb451bf529ae532d32c095/scene/dataset_readers.py#L196-L197
# get the world-to-camera transform and set R, T
w2c = np.linalg.inv(c2w)
R = np.transpose(w2c[:3,:3]) # R is stored transposed due to 'glm' in CUDA code
For the R of the gauss, it seems that it's stored directly as w2c, so the axes should be rows instead of columns.
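The camera-side convention described above can be checked with a short round trip. This sketch reuses getWorld2View2 as quoted in the thread; the camera pose (c2w) and the test point are hypothetical values chosen only for illustration.

```python
import numpy as np

def getWorld2View2(R, t, translate=np.array([.0, .0, .0]), scale=1.0):
    # Copied from the snippet quoted in the thread.
    Rt = np.zeros((4, 4))
    Rt[:3, :3] = R.transpose()
    Rt[:3, 3] = t
    Rt[3, 3] = 1.0
    C2W = np.linalg.inv(Rt)
    cam_center = C2W[:3, 3]
    cam_center = (cam_center + translate) * scale
    C2W[:3, 3] = cam_center
    Rt = np.linalg.inv(C2W)
    return np.float32(Rt)

# Hypothetical camera-to-world pose.
a, b = 0.6, -0.3
rot = np.array([[np.cos(a), -np.sin(a), 0.0],
                [np.sin(a),  np.cos(a), 0.0],
                [0.0, 0.0, 1.0]]) @ np.array([[1.0, 0.0, 0.0],
                                              [0.0, np.cos(b), -np.sin(b)],
                                              [0.0, np.sin(b),  np.cos(b)]])
c2w = np.eye(4)
c2w[:3, :3] = rot
c2w[:3, 3] = [1.0, 2.0, 3.0]

# Same two steps as the dataset reader lines quoted above.
w2c = np.linalg.inv(c2w)
R = np.transpose(w2c[:3, :3])   # stored transposed, so R equals the c2w rotation
T = w2c[:3, 3]

view = getWorld2View2(R, T)     # transposes R back, recovering w2c

P_world = np.array([0.5, -1.0, 2.0])
P_cam = w2c[:3, :3] @ P_world + T   # P_cam = R_w2c * P_world + t

print(np.allclose(view, w2c, atol=1e-5))
print(np.allclose(R, c2w[:3, :3]))
```

With the stored Camera.R, the columns are the camera axes in world coordinates, and getWorld2View2 undoes the transpose before building the w2c matrix. Whether the Gaussian's R follows the same or the opposite convention is a separate question that this sketch does not settle.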
@initialneil Thank you for your correction. I made a wrong example. Let's reduce the problem to a 2D Gaussian. According to the paper, the covariance matrix is Σ = R S S^T R^T. For a particular example, as you can see, the direction of the long axis is the first column of R and the direction of the short axis is the second. I think you should apply the transpose of R to rotate the coordinates. Is there something I missed?
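A 2D case of this kind can be reproduced numerically. The values below (a 30 degree rotation and scales 3.0 and 0.5) are hypothetical and are not the numbers from the original example; the sketch only illustrates Σ = R S S^T R^T.

```python
import numpy as np

theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag([3.0, 0.5])          # long axis first, short axis second

Sigma = R @ S @ S.T @ R.T

# eigh returns eigenvalues in ascending order with matching eigenvector columns.
eigvals, eigvecs = np.linalg.eigh(Sigma)
short_dir = eigvecs[:, 0]
long_dir = eigvecs[:, 1]

# |cosine| between the principal directions and the columns / rows of R.
print("long axis  vs column 0:", abs(long_dir @ R[:, 0]))
print("short axis vs column 1:", abs(short_dir @ R[:, 1]))
print("long axis  vs row 0:   ", abs(long_dir @ R[0, :]))
```

The principal directions line up with the columns of R (up to sign). The first row of R is (cos 30°, -sin 30°), which makes a 60 degree angle with the long axis, so it is not a principal direction.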
@nyy618 I finally got some time to settle this question. I did some experiments and I think your math is correct. The normal is a column of R, not a row.
One GS as seen by a camera with an identity rotation matrix.
Setting one of the scalings to a very small value makes the GS shrink.
If we set the last scaling to be small, the shrunken edge is the last column of R.
Hi,
I also agree with @initialneil.
When I read this line, from my understanding, R's column space is a transformation from the Gaussian to the world system, and the shortest axis should be the first column, like below:
x_axis = R_sorted[:,0,:] # normalized by default
should be --->
x_axis = R_sorted[:,:,0] # normalized by default
I also did an experiment on horse_blender. The PSNR does not change much, so I assume normal and normal_2 have the major effect in regressing the correct normal.
[ITER 30000] Evaluating test: L1 0.016133079305291176 PSNR 26.573974609375 [21/05 19:52:34]
[ITER 30000] Evaluating train: L1 0.010304585099220276 PSNR 29.289199829101562 [21/05 19:52:39]
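The difference between R_sorted[:,0,:] and R_sorted[:,:,0] can be illustrated in NumPy. This is a sketch under stated assumptions: the sorting step is assumed to be a gather along the last dimension that reorders the columns of R by ascending scale, np.take_along_axis stands in for torch.gather, and random_rotation is a hypothetical helper producing test data.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4

def random_rotation(rng):
    """Random 3x3 rotation from a QR decomposition (illustrative helper)."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return Q * np.sign(np.linalg.det(Q))   # flip the sign if needed so det = +1

R = np.stack([random_rotation(rng) for _ in range(N)])
scales = rng.uniform(0.1, 2.0, size=(N, 3))

# Reorder the columns of each R by ascending scale.
order = np.argsort(scales, axis=1)                  # (N, 3)
idx = np.repeat(order[:, None, :], 3, axis=1)       # (N, 3, 3)
R_sorted = np.take_along_axis(R, idx, axis=2)       # R_sorted[n, i, j] = R[n, i, order[n, j]]

col_pick = R_sorted[:, :, 0]   # column of R with the smallest scale
row_pick = R_sorted[:, 0, :]   # first row of R with its entries reordered

# The column pick matches the argmin-based indexing shown earlier in the thread.
argmin_pick = R[np.arange(N), :, np.argmin(scales, axis=1)]
print(np.allclose(col_pick, argmin_pick))

# The row pick is only a permutation of the first row of R.
print(np.allclose(np.sort(row_pick, axis=1), np.sort(R[:, 0, :], axis=1)))
```

Under this assumed sorting step, R_sorted[:,:,0] is the shortest axis, while R_sorted[:,0,:] is the first row of R with shuffled entries and depends on the scales only through that shuffle.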
@yinyunie Agreed. For the static scenes here, R is like a black box of parameters anyway. But it becomes important when extending to dynamic scenarios, so it had better be fixed.
I've got an example here:
R_sorted[:,0,:] selects the first row. So should R_sorted[:,0,:] be changed to R_sorted[:,:,0]?
Why sort the columns of R instead of the rows?
After playing with the math, I initially believed that gather should work on rows instead of columns.
After the discussion in this thread, we (@nyy618 and I) think that the normal is obtained by selecting the column.
Pull request updated accordingly.