graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

about projection_matrix #671

Open windkiss5 opened 4 months ago

windkiss5 commented 4 months ago

Hello author, thank you very much for your work. The code is very concise and elegant. I have a small question about projection matrices. Most of the camera intrinsic matrices I come across look like this (unlike OpenGL's): [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]]. [image]

If I manually specify a 'zfar' and follow your code at https://github.com/graphdeco-inria/gaussian-splatting/blob/d9fad7b3450bf4bd29316315032d57157e23a515/utils/graphics_utils.py#L51

can I directly convert it to

[2.0 * znear/W, 0.0, 0.0, 0.0],
[0.0, 2.0 * znear/H, 0.0, 0.0],
[0.0, 0.0, f/(f - n), -f * n/(f - n)]

?

That is, I interpret 'znear' as the focal length. Is this correct?

ShaohuaL commented 4 months ago

getProjectionMatrix is used to project a 3D point from camera view space to NDC space. Maybe you can refer to https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix.html to learn more.
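To make the view-space-to-NDC step concrete, here is a minimal pure-Python sketch, assuming a symmetric frustum, a z-forward camera, and depth mapped to [0, 1]. The helper names `symmetric_projection` and `project` are hypothetical and are not taken from the repository. It suggests that znear cancels out of the first diagonal entry, which ends up equal to 2 * fx / W, so znear does not play the role of the focal length.

```python
import math

def symmetric_projection(znear, zfar, fov_x, fov_y):
    """Hypothetical helper: 4x4 perspective matrix for a symmetric frustum,
    z-forward camera, depth mapped to [0, 1]."""
    right = znear * math.tan(fov_x / 2)  # half-width of the near plane
    top = znear * math.tan(fov_y / 2)    # half-height of the near plane
    P = [[0.0] * 4 for _ in range(4)]
    P[0][0] = 2.0 * znear / (2.0 * right)  # znear cancels: 1 / tan(fov_x / 2)
    P[1][1] = 2.0 * znear / (2.0 * top)
    P[2][2] = zfar / (zfar - znear)
    P[2][3] = -(zfar * znear) / (zfar - znear)
    P[3][2] = 1.0  # clip-space w receives the view-space depth
    return P

def project(P, point):
    """Apply P to a view-space point, then do the perspective divide."""
    x, y, z = point
    v = (x, y, z, 1.0)
    clip = [sum(P[r][c] * v[c] for c in range(4)) for r in range(4)]
    return [clip[i] / clip[3] for i in range(3)]

# Assumed example values: image width in pixels and focal length in pixels.
W, fx = 1000, 800.0
fov_x = 2 * math.atan(W / (2 * fx))

# Two very different near planes give the same P[0][0], equal to 2 * fx / W
# up to floating-point rounding.
P_a = symmetric_projection(0.01, 100.0, fov_x, fov_x)
P_b = symmetric_projection(0.5, 100.0, fov_x, fov_x)
print(P_a[0][0], P_b[0][0], 2 * fx / W)
```

With these assumptions, a point on the near plane projects to depth 0 and a point on the far plane to depth 1.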

fzhiheng commented 2 months ago

I have one question: why is the formula used here to compute the projection matrix K different from the matrix given by OpenGL? [Image 1] [Image 2] The left image is from getProjectionMatrix, and the right image is from OpenGL. Looking forward to your response!

ShaohuaL commented 2 months ago

Hello, that is because they did not follow OpenGL's conventions. Maybe you can refer to https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix.html
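One concrete difference between the two conventions is the depth range. As a rough sketch (the function names are hypothetical, not from the repository or from OpenGL): a z-forward matrix whose depth row is zfar / (zfar - znear) and -(zfar * znear) / (zfar - znear) maps depth to [0, 1], whereas the classic OpenGL frustum matrix assumes a camera looking down -z and maps depth to [-1, 1].

```python
def depth_ndc_zero_to_one(z_view, znear, zfar):
    """z-forward camera (z_view > 0), depth mapped to [0, 1]."""
    clip_z = zfar / (zfar - znear) * z_view - (zfar * znear) / (zfar - znear)
    clip_w = z_view
    return clip_z / clip_w

def depth_ndc_opengl(z_eye, znear, zfar):
    """Classic OpenGL convention: camera looks down -z (z_eye < 0),
    depth mapped to [-1, 1]."""
    clip_z = (-(zfar + znear) / (zfar - znear) * z_eye
              - 2.0 * zfar * znear / (zfar - znear))
    clip_w = -z_eye
    return clip_z / clip_w

# Assumed example planes.
znear, zfar = 0.1, 50.0
print(depth_ndc_zero_to_one(znear, znear, zfar),
      depth_ndc_zero_to_one(zfar, znear, zfar))
print(depth_ndc_opengl(-znear, znear, zfar),
      depth_ndc_opengl(-zfar, znear, zfar))
```

The near and far planes land on the two ends of each interval, so the x and y rows can agree while the z rows still look different.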

fzhiheng commented 2 months ago

@ShaohuaL Thanks for your response! I have read the link you provided. The main difference is the interval that z is mapped to. I followed the link and used the 0-1 mapping, but my final result still differs in some terms. Here is my result: [image] I tested my projection matrix by rendering a cropped image. It works, while the matrix in the code fails.

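import torch

def getProjectionMatrix2(znear, zfar, K, W, H):
    # Intrinsics: focal lengths and principal point, in pixels.
    fx = K[0, 0]
    fy = K[1, 1]
    cx = K[0, 2]
    cy = K[1, 2]
    # Frustum extents on the near plane; asymmetric when (cx, cy) is off-center.
    top = znear * cy / fy
    bottom = -znear * (H - cy) / fy
    right = znear * (W - cx) / fx
    left = -znear * cx / fx

    P = torch.zeros(4, 4)
    z_sign = 1.0  # camera looks along +z

    P[0, 0] = 2.0 * znear / (right - left)
    P[1, 1] = 2.0 * znear / (top - bottom)
    # Off-center terms; both vanish when the frustum is symmetric.
    P[0, 2] = -(right + left) / (right - left)
    P[1, 2] = (top + bottom) / (top - bottom)
    P[3, 2] = z_sign
    # Depth is mapped to [0, 1] between znear and zfar.
    P[2, 2] = z_sign * zfar / (zfar - znear)
    P[2, 3] = -(zfar * znear) / (zfar - znear)

    return P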

The source code has no problem when the principal point is centered, because left = -right makes right + left equal 0.
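As a sanity check on the off-center terms, here is a small pure-Python sketch. It mirrors the x and y entries of the function above without torch; the helper names and the numeric values are made up for illustration. Under the x-right, y-down, z-forward assumption, the resulting NDC coordinates should match 2u/W - 1 and 2v/H - 1, where (u, v) is the pinhole pixel position.

```python
def offcenter_terms(fx, fy, cx, cy, W, H, znear):
    """Hypothetical helper: the x/y entries of the matrix above."""
    top = znear * cy / fy
    bottom = -znear * (H - cy) / fy
    right = znear * (W - cx) / fx
    left = -znear * cx / fx
    p00 = 2.0 * znear / (right - left)
    p11 = 2.0 * znear / (top - bottom)
    p02 = -(right + left) / (right - left)
    p12 = (top + bottom) / (top - bottom)
    return p00, p11, p02, p12

def ndc_vs_pixels(point, fx, fy, cx, cy, W, H, znear=0.01):
    """Return (NDC from the matrix entries, NDC expected from pixel coords)."""
    x, y, z = point
    p00, p11, p02, p12 = offcenter_terms(fx, fy, cx, cy, W, H, znear)
    # Matrix rows 0 and 1 followed by the divide by w = z.
    ndc = ((p00 * x + p02 * z) / z, (p11 * y + p12 * z) / z)
    # Pinhole projection to pixels, then rescale [0, W] x [0, H] to [-1, 1].
    u = fx * x / z + cx
    v = fy * y / z + cy
    expected = (2.0 * u / W - 1.0, 2.0 * v / H - 1.0)
    return ndc, expected

# Off-center principal point, as after cropping (made-up numbers).
ndc, expected = ndc_vs_pixels((0.3, -0.2, 2.0), 800.0, 820.0, 300.0, 260.0, 1000, 600)
print(ndc, expected)

# Centered principal point: the off-center terms p02 and p12 vanish.
centered = offcenter_terms(800.0, 800.0, 500.0, 300.0, 1000, 600, 0.01)
print(centered[2], centered[3])
```

In the centered case the two extra terms are zero, which is consistent with a symmetric-frustum matrix being sufficient for uncropped images.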

LiuJF1226 commented 2 months ago

@fzhiheng Hi! I think in your result, the element in (row 2, col 3) should also add a negative sign, i.e. -(t+b)/(t-b). Is that a typo?

fzhiheng commented 2 months ago

@LiuJF1226 That's what it looks like. Note that the camera coordinate system used in the code is x-right, y-down, z-forward.

LiuJF1226 commented 2 months ago

@fzhiheng You are right. I didn't notice that in your code, top = znear * cy / fy and bottom = -znear * (H - cy) / fy. My derivation works directly in the RDF (right-down-forward) camera coordinate system, where I set bottom = -znear * cy / fy and top = znear * (H - cy) / fy. Under that convention, the element in (row 2, col 3) should be -(t+b)/(t-b). Actually, both formulations are right.
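The equivalence of the two formulations can be checked numerically. This is a minimal sketch with hypothetical function names and made-up intrinsics; (row 2, col 3) in the 1-indexed wording above corresponds to P[1, 2] in the code. Both conventions should reduce to 2 * cy / H - 1.

```python
def p12_convention_a(fy, cy, H, znear):
    """top/bottom as in getProjectionMatrix2, entry is +(t + b) / (t - b)."""
    top = znear * cy / fy
    bottom = -znear * (H - cy) / fy
    return (top + bottom) / (top - bottom)

def p12_convention_b(fy, cy, H, znear):
    """top/bottom swapped as in the RDF derivation, entry is -(t + b) / (t - b)."""
    bottom = -znear * cy / fy
    top = znear * (H - cy) / fy
    return -(top + bottom) / (top - bottom)

# Made-up intrinsics with an off-center principal point.
fy, cy, H, znear = 820.0, 260.0, 600, 0.01
a = p12_convention_a(fy, cy, H, znear)
b = p12_convention_b(fy, cy, H, znear)
print(a, b, 2.0 * cy / H - 1.0)
```

The sign flip in the formula compensates for the swapped definitions of top and bottom, so the matrix entry itself is unchanged.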