pointrix-project / msplat

A modular differential gaussian rasterization library.

Easy to get needle-like Gaussians #6

Open jiahaolu97 opened 1 month ago

jiahaolu97 commented 1 month ago

I transferred the whole optimization and dataset/camera reading pipeline into this repo, and I found that with the same hyperparameter settings as the original 3DGS repo defaults, it easily produces needle-like Gaussians:

[image: 289bc80c7071a24624376da3d3a4a79]

Could you check whether there are any errors in the kernel computation (perhaps in the backward pass for scale)? I can share my port of the other components (optimization pipeline, 3D dataset readers) here if needed (probably some time after the NeurIPS deadline).
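To make "needle-like" measurable when comparing runs, one option is to look at how much the longest axis of each Gaussian dominates the second longest. The sketch below is a hypothetical diagnostic, not part of msplat or the 3DGS repo: the function names, the threshold of 10, and the sample scales are all assumptions chosen for illustration. It expects activated (positive) scales, for example the values returned by `pc.get_scaling` converted to a list of tuples.

```python
# Hypothetical diagnostic; names and threshold are illustrative assumptions.

def needle_ratio(scale):
    """Ratio of the longest axis to the second-longest axis of one Gaussian."""
    shortest, middle, longest = sorted(scale)
    return longest / middle


def needle_fraction(scales, ratio_threshold=10.0):
    """Fraction of Gaussians whose longest axis dominates the other two."""
    if not scales:
        return 0.0
    needles = sum(1 for s in scales if needle_ratio(s) > ratio_threshold)
    return needles / len(scales)


# Made-up activated scales (sx, sy, sz), for illustration only.
example_scales = [
    (0.010, 0.011, 0.009),  # roughly isotropic blob
    (0.500, 0.004, 0.003),  # needle: one axis dominates
    (0.020, 0.018, 0.001),  # flat disk: two long axes, so not a needle
    (0.030, 0.020, 0.010),  # mildly anisotropic
]

fraction = needle_fraction(example_scales)
print(f"needle fraction: {fraction:.2f}")
```

Tracking this fraction over training iterations for both renderers would show whether the two runs diverge early or only after densification starts.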

yGaoJiany commented 1 month ago

🤔 It's a little strange. I've also tried training NeRF Lego with msplat, but haven't found this problem.

jiahaolu97 commented 1 month ago

Hmm, that's really weird. I carefully ported the code from the original 3DGS repo, so the hyperparameter settings are the same...

Did you change any hyperparameters in your case?

From my observation, using the default hyperparameters here results in far fewer Gaussians (original: ~600K Gaussians, here: ~5K Gaussians). It seems that only a few Gaussians reached the densification threshold and were densified.
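When very few Gaussians pass the densification test, one thing that may be worth checking is the units of the gradient being compared against the threshold. The sketch below is an illustration under assumptions, not a confirmed diagnosis of this issue: it assumes pixel and NDC coordinates are related by `u = (x_ndc + 1) * width / 2` up to a constant offset, and the gradient values are made up. The threshold `0.0002` is the default `densify_grad_threshold` in the original 3DGS repo; the helper names are hypothetical.

```python
import math

# Default densify_grad_threshold in the original 3DGS training arguments.
DENSIFY_GRAD_THRESHOLD = 0.0002


def pixel_grad_to_ndc_grad(grad_uv, width, height):
    """Chain rule for u = (x_ndc + 1) * width / 2: dL/dx_ndc = dL/du * width / 2."""
    grad_u, grad_v = grad_uv
    return (grad_u * 0.5 * width, grad_v * 0.5 * height)


def count_above_threshold(grads, threshold=DENSIFY_GRAD_THRESHOLD):
    """Count Gaussians whose 2D gradient norm reaches the threshold."""
    return sum(1 for g in grads if math.hypot(*g) >= threshold)


# Made-up per-Gaussian gradients in pixel units, for illustration only.
pixel_grads = [(1e-6, 0.0), (5e-7, 5e-7), (2e-8, 1e-8)]
ndc_grads = [pixel_grad_to_ndc_grad(g, 800, 800) for g in pixel_grads]

n_pixel = count_above_threshold(pixel_grads)
n_ndc = count_above_threshold(ndc_grads)
print(f"densified (pixel units): {n_pixel}, densified (NDC units): {n_ndc}")
```

The same underlying gradients give very different counts depending on the coordinate space, so a port that accumulates gradients in a different space than the threshold was tuned for could densify far fewer points.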

yGaoJiany commented 1 month ago

Well, I will check this 🏗️. Please keep this issue open!

jiahaolu97 commented 1 month ago

Thank you! Glad to discuss more on this and provide potential assistance.

yGaoJiany commented 1 month ago

Well, I did a quick replacement in the original Gaussian splatting repo (https://github.com/graphdeco-inria/gaussian-splatting), and msplat seems to work just fine. Here is what I get at iteration 7000:

[image]

yGaoJiany commented 1 month ago

I replaced the function named render in gaussian_renderer/__init__.py with this one:

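import torch
import msplat as ms

# These two imports come from the original 3DGS repo.
from scene.gaussian_model import GaussianModel
from utils.graphics_utils import fov2focal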

def render(viewpoint_camera, pc : GaussianModel, pipe, bg_color : torch.Tensor, scaling_modifier = 1.0, override_color = None):
    """
    Render the scene with msplat.

    Background tensor (bg_color) must be on GPU!
    """

    # print("Hello, msplat.")

    # convert 3DGS inputs to msplat conventions
    position = pc.get_xyz
    fovx = viewpoint_camera.FoVx
    fovy = viewpoint_camera.FoVy
    width = int(viewpoint_camera.image_width)
    height = int(viewpoint_camera.image_height)

    fx = fov2focal(fovx, width)
    fy = fov2focal(fovy, height)
    cx = float(width) / 2
    cy = float(height) / 2

    intrinsic_params = torch.tensor([fx, fy, cx, cy]).cuda().float()
    extrinsic_matrix = viewpoint_camera.world_view_transform.transpose(0, 1)
    extrinsic_matrix = extrinsic_matrix[:3, :]
    camera_center = viewpoint_camera.camera_center

    opacity = pc.get_opacity
    shs = pc.get_features.permute(0, 2, 1)
    scaling = pc.get_scaling
    rotation = pc.get_rotation

    # project points and perform culling
    (uv, depth) = ms.project_point(
        position,
        intrinsic_params,
        extrinsic_matrix,
        width, height)

    visible = depth != 0

    # compute sh if not None
    direction = (position -
                camera_center.repeat(position.shape[0], 1))
    direction = direction / direction.norm(dim=1, keepdim=True)

    sh2rgb = ms.compute_sh(shs, direction, visible)
    rgb = torch.clamp_min(sh2rgb + 0.5, 0.0)

    # compute cov3d
    cov3d = ms.compute_cov3d(scaling, rotation, visible)

    # ewa project
    (conic, radius, tiles_touched) = ms.ewa_project(
        position,
        cov3d,
        intrinsic_params,
        extrinsic_matrix,
        uv,
        width,
        height,
        visible
    )

    # sort
    (gaussian_ids_sorted, tile_range) = ms.sort_gaussian(
        uv, depth, width, height, radius, tiles_touched
    )

    # render
    ndc = torch.zeros_like(uv, requires_grad=True)
    try:
        ndc.retain_grad()
    except RuntimeError as err:
        raise ValueError("ndc does not have grad") from err

    # alpha blending
    render = ms.alpha_blending(
        uv, conic, opacity, rgb,
        gaussian_ids_sorted, tile_range, bg_color[0].item(), width, height, ndc
    )

    return {"render": render,
            "viewspace_points": ndc,
            "visibility_filter" : radius > 0,
            "radii": radius}
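
The intrinsics passed to msplat in the function above are derived from the camera's field of view. As a standalone illustration of that step, here is a minimal sketch using only the standard library. `fov2focal` mirrors the pinhole relation used by the helper of the same name in the 3DGS repo; `pinhole_intrinsics` is a hypothetical wrapper, and placing the principal point at the image center follows what the function above does.

```python
import math


def fov2focal(fov, pixels):
    """Focal length in pixels for a field of view given in radians."""
    return pixels / (2.0 * math.tan(fov / 2.0))


def pinhole_intrinsics(fovx, fovy, width, height):
    """Return (fx, fy, cx, cy) with the principal point at the image center."""
    fx = fov2focal(fovx, width)
    fy = fov2focal(fovy, height)
    cx = width / 2.0
    cy = height / 2.0
    return (fx, fy, cx, cy)


# A 90-degree horizontal and vertical field of view on an 800x800 image.
fx, fy, cx, cy = pinhole_intrinsics(math.pi / 2, math.pi / 2, 800, 800)
print(fx, fy, cx, cy)
```

A 90-degree field of view gives a focal length of half the image size, which makes a quick sanity check when porting camera loaders between code bases.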

yGaoJiany commented 1 month ago

Here's what I got with msplat

[ITER 30000] Evaluating test: L1 0.004826885849470273 PSNR 36.27218983650208 [17/05 16:02:35]
[ITER 30000] Evaluating train: L1 0.0029831654857844117 PSNR 40.72242660522461 [17/05 16:02:35]

The number of points is 345,685.

yGaoJiany commented 1 month ago

Here is the report for original 3DGS.

[ITER 30000] Evaluating test: L1 0.004817731926450506 PSNR 36.26937942504883 [17/05 16:18:35]
[ITER 30000] Evaluating train: L1 0.0029933738056570295 PSNR 40.61189193725586 [17/05 16:18:35]

The number of points is 343,412.

msplat seems to be a little slower. For Lego, msplat takes 10 min 12 s for 30,000 iterations, while the original kernel takes 7 min 13 s.
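To put the two wall-clock times on a common scale, the arithmetic can be sketched as follows. The only inputs are the figures reported above (10 min 12 s and 7 min 13 s for 30,000 iterations); the variable and function names are illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to total seconds."""
    return 60 * minutes + seconds


ITERATIONS = 30000
msplat_s = to_seconds(10, 12)    # 612 s
original_s = to_seconds(7, 13)   # 433 s

slowdown = msplat_s / original_s          # about 1.41
msplat_it_s = ITERATIONS / msplat_s       # about 49.0 iterations per second
original_it_s = ITERATIONS / original_s   # about 69.3 iterations per second

print(f"slowdown: {slowdown:.2f}x")
print(f"msplat: {msplat_it_s:.1f} it/s, original: {original_it_s:.1f} it/s")
```

On these numbers msplat takes roughly 41% longer than the original kernel for the same number of iterations.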

jiahaolu97 commented 1 month ago

Thank you @yGaoJiany for providing your results!

Let me check my code. I will get back to this issue after some time to report where my bugs are; I hope it can help other users.

jiahaolu97 commented 1 month ago

@Yuhuoo I don't think we have the same problem, since I tested the same data with both msplat and the Inria repo... Using either 100 or 300 camera views produces needle-like Gaussians in my implementation with msplat, whereas in the Inria repo I didn't run into this problem (with either 100 or 300 images).

Given the information from @yGaoJiany, I assume there may be some bugs in my code rather than in their CUDA kernel. I need some time to check this, though.

yGaoJiany commented 1 month ago

@Yuhuoo Maybe you can open a new issue about overfitting with fewer images, and we can discuss that topic there.