pointrix-project / msplat

A modular differential gaussian rasterization library.

Easy to get needle-like Gaussians #6

Closed jiahaolu97 closed 4 months ago

jiahaolu97 commented 6 months ago

I transferred the whole optimization and dataset/camera-reading pipeline into this repo, and I found that with the same hyperparameter settings as the original 3DGS repo defaults, it easily produces needle-like Gaussians:

[screenshot: rendering showing needle-like Gaussians]

Could you check whether there are any errors in the kernel computation (most likely the backward pass for scale)? I can share my port of the other components (optimization pipeline, 3D dataset readers) here if needed, probably some time after the NeurIPS deadline.
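
If it helps isolate this, one way to sanity-check the scale gradient of the covariance kernel on its own is to compare it against a pure-PyTorch reference of Sigma = R S S^T R^T. The sketch below is only illustrative; it assumes ms.compute_cov3d returns the six unique upper-triangular covariance entries per point (as the original 3DGS kernel does) and a (w, x, y, z) quaternion convention.

import torch
import msplat as ms

def quat_to_rotmat(q):
    # (w, x, y, z) quaternion -> batched 3x3 rotation matrices (3DGS convention)
    w, x, y, z = torch.nn.functional.normalize(q, dim=-1).unbind(-1)
    return torch.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y),
        2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y),
    ], dim=-1).reshape(-1, 3, 3)

def cov3d_reference(scaling, rotation):
    # Sigma = R S S^T R^T, returned as full 3x3 covariance matrices
    R = quat_to_rotmat(rotation)
    M = R * scaling.unsqueeze(1)  # same as R @ diag(scale)
    return M @ M.transpose(1, 2)

N = 16
scaling = torch.rand(N, 3, device="cuda", requires_grad=True)
rotation = torch.nn.functional.normalize(torch.randn(N, 4, device="cuda"), dim=-1)
visible = torch.ones(N, dtype=torch.bool, device="cuda")

# gradient of the upper-triangular sum w.r.t. scaling, reference vs. kernel
grad_ref, = torch.autograd.grad(cov3d_reference(scaling, rotation).triu().sum(), scaling)
grad_ms, = torch.autograd.grad(ms.compute_cov3d(scaling, rotation, visible).sum(), scaling)
print("max abs diff:", (grad_ref - grad_ms).abs().max().item())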

yGaoJiany commented 6 months ago

🤔 It's a little strange. I've also tried training NeRF Lego with msplat, but haven't found this problem.

jiahaolu97 commented 6 months ago

Hmm, that's really weird. I carefully transferred the code from the original 3DGS repo, so the hyperparameter settings are the same...

May I ask whether you changed any hyperparameters in your case?

From my observation, using the default hyperparameters here results in far fewer Gaussians (original: ~600K Gaussians, here: ~5K Gaussians); it seems only a few Gaussians reach the densification threshold and get densified.
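
For context, in the original 3DGS training loop densification is driven by the accumulated screen-space gradient of each Gaussian's projected mean, and only points whose average gradient exceeds densify_grad_threshold (0.0002 by default) are cloned or split. A simplified sketch of that criterion, not the exact repo code:

import torch

def densification_candidates(viewspace_points, xyz_gradient_accum, denom,
                             update_filter, grad_threshold=2e-4):
    """Simplified sketch of the 3DGS densification criterion.

    viewspace_points : the "viewspace_points" tensor returned by render();
                       after loss.backward() its .grad holds dL/d(projected mean).
    update_filter    : boolean mask of Gaussians visible in the current view.
    Returns a boolean mask of Gaussians that would be cloned or split.
    """
    # accumulate the screen-space gradient norm for Gaussians touched by this view
    xyz_gradient_accum[update_filter] += torch.norm(
        viewspace_points.grad[update_filter, :2], dim=-1, keepdim=True)
    denom[update_filter] += 1

    # average gradient since the last densification step, compared to the threshold
    grads = xyz_gradient_accum / denom
    grads[grads.isnan()] = 0.0
    return grads.squeeze(-1) >= grad_threshold

If the gradients reaching the 2D means were scaled or accumulated differently from the original kernel's, far fewer points would cross this threshold, so that seems worth ruling out when the point count stays this low.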

yGaoJiany commented 6 months ago

Well, I will check this 🏗️. Please keep this issue open!

jiahaolu97 commented 6 months ago

Thank you! Glad to discuss more on this and provide potential assistance.

yGaoJiany commented 6 months ago

Well, I did a quick replacement based on the original Gaussian Splatting repo (https://github.com/graphdeco-inria/gaussian-splatting), and it seems that msplat works just fine. Here is what I get at 7,000 iterations:

[screenshot: rendering at iteration 7,000]

yGaoJiany commented 6 months ago

I replaced the function named render in gaussian_renderer/__init__.py with this one:

import torch

import msplat as ms

from scene.gaussian_model import GaussianModel
from utils.graphics_utils import fov2focal


def render(viewpoint_camera, pc : GaussianModel, pipe, bg_color : torch.Tensor, scaling_modifier = 1.0, override_color = None):
    """
    Render the scene with msplat.

    Background tensor (bg_color) must be on GPU!
    """

    # transform the 3DGS inputs into msplat's expected format
    position = pc.get_xyz
    fovx = viewpoint_camera.FoVx
    fovy = viewpoint_camera.FoVy
    width = int(viewpoint_camera.image_width)
    height = int(viewpoint_camera.image_height)

    # pinhole intrinsics from the field of view
    fx = fov2focal(fovx, width)
    fy = fov2focal(fovy, height)
    cx = float(width) / 2
    cy = float(height) / 2

    intrinsic_params = torch.tensor([fx, fy, cx, cy]).cuda().float()
    extrinsic_matrix = viewpoint_camera.world_view_transform.transpose(0, 1)
    extrinsic_matrix = extrinsic_matrix[:3, :]
    camera_center = viewpoint_camera.camera_center

    opacity = pc.get_opacity
    shs = pc.get_features.permute(0, 2, 1)
    scaling = pc.get_scaling
    rotation = pc.get_rotation

    # project points and perform culling
    (uv, depth) = ms.project_point(
        position,
        intrinsic_params,
        extrinsic_matrix,
        width, height)

    visible = depth != 0

    # evaluate spherical harmonics along the viewing direction
    direction = (position -
                 camera_center.repeat(position.shape[0], 1))
    direction = direction / direction.norm(dim=1, keepdim=True)

    sh2rgb = ms.compute_sh(shs, direction, visible)
    rgb = torch.clamp_min(sh2rgb + 0.5, 0.0)

    # compute cov3d
    cov3d = ms.compute_cov3d(scaling, rotation, visible)

    # EWA projection of the 3D covariance to screen space
    (conic, radius, tiles_touched) = ms.ewa_project(
        position,
        cov3d,
        intrinsic_params,
        extrinsic_matrix,
        uv,
        width,
        height,
        visible
    )

    # sort Gaussians by tile and depth
    (gaussian_ids_sorted, tile_range) = ms.sort_gaussian(
        uv, depth, width, height, radius, tiles_touched
    )

    # dummy screen-space tensor whose gradient is retained for densification
    ndc = torch.zeros_like(uv, requires_grad=True)
    try:
        ndc.retain_grad()
    except:
        raise ValueError("ndc does not have grad")

    # alpha blending
    rendered_image = ms.alpha_blending(
        uv, conic, opacity, rgb,
        gaussian_ids_sorted, tile_range, bg_color[0].item(), width, height, ndc
    )

    return {"render": rendered_image,
            "viewspace_points": ndc,
            "visibility_filter": radius > 0,
            "radii": radius}
yGaoJiany commented 6 months ago

Here's what I got with msplat

[ITER 30000] Evaluating test: L1 0.004826885849470273 PSNR 36.27218983650208 [17/05 16:02:35]
[ITER 30000] Evaluating train: L1 0.0029831654857844117 PSNR 40.72242660522461 [17/05 16:02:35]

The number of points is 345,685.

yGaoJiany commented 6 months ago

Here is the report for the original 3DGS.

[ITER 30000] Evaluating test: L1 0.004817731926450506 PSNR 36.26937942504883 [17/05 16:18:35]
[ITER 30000] Evaluating train: L1 0.0029933738056570295 PSNR 40.61189193725586 [17/05 16:18:35]

The number of points is 343,412.

It seems that msplat is a little slower: for Lego, msplat takes 10 min 12 s for 30,000 iterations while the original kernel takes 7 min 13 s.

jiahaolu97 commented 6 months ago

Thank you @yGaoJiany for providing your results!

Let me check my code. I will get back to this issue after some time to report where my bugs were; hopefully that can help other users.

jiahaolu97 commented 6 months ago

@Yuhuoo I don't think we have the same problem, since I tested on the same data with both msplat and the Inria repo... Using either 100 or 300 camera views produces needle-like Gaussians in my msplat implementation, whereas with the Inria repo I didn't hit this problem (with either 100 or 300 images).

Given the information from @yGaoJiany, I assume there are probably bugs in my code rather than in their CUDA kernel. I will need some time to check on this, though.

yGaoJiany commented 6 months ago

@Yuhuoo Maybe you can open a new issue about overfitting in the few-image setting, and we can discuss that topic there.