szymanowiczs / splatter-image

Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
https://szymanowiczs.github.io/splatter-image
BSD 3-Clause "New" or "Revised" License

About the correctness of getProjectionMatrix in utils.graphics_utils.py #17

Closed. thucz closed this issue 6 months ago

thucz commented 7 months ago

Hi! I'd like to know which way of computing the projection matrix is correct.

The current code is:

import math

import torch


def getProjectionMatrix(znear, zfar, fovX, fovY):
    # Half-extents of the near plane, derived from the fields of view (radians).
    tanHalfFovY = math.tan(fovY / 2)
    tanHalfFovX = math.tan(fovX / 2)

    top = tanHalfFovY * znear
    bottom = -top
    right = tanHalfFovX * znear
    left = -right

    P = torch.zeros(4, 4)

    z_sign = 1.0

    P[0, 0] = 2.0 * znear / (right - left)
    P[1, 1] = 2.0 * znear / (top - bottom)
    P[0, 2] = (right + left) / (right - left)
    P[1, 2] = (top + bottom) / (top - bottom)
    P[3, 2] = z_sign
    # Depth terms: after the divide by w, z = znear maps to 0 and z = zfar maps to 1.
    P[2, 2] = z_sign * zfar / (zfar - znear)
    P[2, 3] = -(zfar * znear) / (zfar - znear)
    return P

According to the materials I found (https://stackoverflow.com/questions/22064084/how-to-create-perspective-projection-matrix-given-focal-points-and-camera-princ, https://github.com/graphdeco-inria/gaussian-splatting/issues/399, https://gist.github.com/astraw/1341472?permalink_comment_id=61468#file-calib_test_utils-py-L67), it should be:

def getProjectionMatrix(znear, zfar, fovX, fovY):
    tanHalfFovY = math.tan(fovY / 2)
    tanHalfFovX = math.tan(fovX / 2)

    top = tanHalfFovY * znear
    bottom = -top
    right = tanHalfFovX * znear
    left = -right

    P = torch.zeros(4, 4)

    z_sign = 1.0

    P[0, 0] = 2.0 * znear / (right - left)
    P[1, 1] = 2.0 * znear / (top - bottom)
    P[0, 2] = (right + left) / (right - left)
    P[1, 2] = (top + bottom) / (top - bottom)
    P[3, 2] = z_sign
    # Only these two depth terms differ from the current code:
    # after the divide by w, z = znear maps to -1 and z = zfar maps to 1.
    P[2, 2] = z_sign * (zfar + znear) / (zfar - znear)
    P[2, 3] = -2.0 * zfar * znear / (zfar - znear)
    return P

thucz commented 7 months ago

Another question: why is z_sign 1.0 instead of -1.0?

szymanowiczs commented 6 months ago

z_sign is 1.0 instead of -1.0 because we use a camera convention in which the z-axis points away from the camera, along the viewing direction (unlike OpenGL, where the camera looks down the negative z-axis).
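To make the sign convention concrete, here is a minimal sketch in plain Python. It is not code from the repository, and the helper name `clip_w` is made up for illustration. It relies only on the fact that the last row of the matrix is `[0, 0, z_sign, 0]`, so the clip-space `w` of a camera-space point is `z_sign * z`.

```python
def clip_w(z_cam, z_sign):
    """Clip-space w given by the matrix row P[3, :] = [0, 0, z_sign, 0]."""
    return z_sign * z_cam


# Convention used here: the camera looks along +z, so visible points have z > 0.
w_forward = clip_w(2.0, z_sign=1.0)

# OpenGL-style convention: the camera looks along -z, so visible points have z < 0.
w_opengl = clip_w(-2.0, z_sign=-1.0)

# Mixing the two (z > 0 in front of the camera, but z_sign = -1.0) gives a negative w.
w_mismatched = clip_w(2.0, z_sign=-1.0)

print(w_forward, w_opengl, w_mismatched)  # 2.0 2.0 -2.0
```

In both consistent conventions a point in front of the camera ends up with a positive `w`; the sign of `z_sign` only has to match the direction the camera looks along.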

The calibration matrix was taken directly from the Gaussian Splatting repo to ensure compatibility with their rasteriser (see below). https://github.com/graphdeco-inria/gaussian-splatting/blob/d9fad7b3450bf4bd29316315032d57157e23a515/utils/graphics_utils.py#L51
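As a side note on the two variants in the question: a quick check suggests they differ only in the depth range they produce after the perspective divide. The sketch below is plain Python rather than code from either repository. The function names `ndc_depth_current` and `ndc_depth_alternative` and the `znear`/`zfar` values are assumptions chosen for illustration, and `z_sign = 1.0` is assumed in both cases.

```python
def ndc_depth_current(z, znear, zfar):
    """Depth after the divide by w, using the depth terms of the current code."""
    clip_z = zfar / (zfar - znear) * z - (zfar * znear) / (zfar - znear)
    clip_w = z  # from P[3, 2] = z_sign = 1.0
    return clip_z / clip_w


def ndc_depth_alternative(z, znear, zfar):
    """Depth after the divide by w, using the depth terms proposed in the question."""
    clip_z = (zfar + znear) / (zfar - znear) * z - 2.0 * zfar * znear / (zfar - znear)
    clip_w = z  # from P[3, 2] = z_sign = 1.0
    return clip_z / clip_w


znear, zfar = 0.5, 100.0  # example values, for illustration only

for z in (znear, zfar):
    print(z, ndc_depth_current(z, znear, zfar), ndc_depth_alternative(z, znear, zfar))
```

Up to floating-point rounding, the current terms map `[znear, zfar]` to `[0, 1]`, while the alternative maps it to `[-1, 1]` as in the classic OpenGL frustum. Both are valid perspective projections, so the choice comes down to which depth range the downstream rasteriser expects.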

Hope this clarifies it!

thucz commented 6 months ago

Thanks