nerfstudio-project / gsplat

CUDA accelerated rasterization of gaussian splatting
https://docs.gsplat.studio/
Apache License 2.0

Issue with ProjectGaussians? #87

Closed saswat0 closed 10 months ago

saswat0 commented 10 months ago

I tried to plot the xys (2D gaussian centers) returned by ProjectGaussians and noticed a discrepancy. As the training progresses, the density of points at (0, 0) increases, and towards the end all the gaussians are concentrated at (0, 0) and none of them are spread across the image. Is this expected behaviour?

[image: means2d]

maturk commented 10 months ago

@saswat0 this behaviour comes from the fact that the projected xys are initialised as a tensor of torch.zeros(num_points, 2) here and during projection if the gaussian center lies beyond the bounds of the image, the pixel location (and the 0,0 initialisation) is never updated for this particular gaussian here. So out of bounds gaussians end up being stored as (0,0) xys. This is my understanding. Let me know if you have any thoughts or concerns about the code.
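To see this concretely, here is a toy sketch of that initialisation pattern (my own illustration with made-up numbers, not the gsplat kernel):

```python
import torch

# Toy illustration: xys is pre-allocated with zeros, and only gaussians whose
# projected center lands inside the image ever get written.
H, W = 4, 4
num_points = 5
# hypothetical projected centers; rows 1 and 3 fall outside the 4x4 image
proj = torch.tensor([[1.5, 2.0], [-3.0, 1.0], [0.5, 0.5], [9.0, 9.0], [3.2, 1.1]])

xys = torch.zeros(num_points, 2)  # same zero initialisation as in projection
in_bounds = (proj[:, 0] >= 0) & (proj[:, 0] < W) & (proj[:, 1] >= 0) & (proj[:, 1] < H)
xys[in_bounds] = proj[in_bounds]  # out-of-bounds rows keep their (0, 0) init
```

So every out-of-bounds gaussian shows up as a point at the origin in a scatter plot of xys.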

I assume you are testing with the SimpleTrainer script? That is a cool gif/illustration. Btw, there is no reason that the xys should become uniformly distributed over the image; it highly depends on the image you are trying to fit.

saswat0 commented 10 months ago

@maturk Thanks for the clarification. Does that mean more and more gaussians get out of bounds as the number of iterations increases? The yellow colour means that a huge number of centres are concentrated at (0, 0).

Yes, I'm using the simple_trainer script. That's a really helpful representation of the entire workflow, which was indeed a nightmare to understand. I'm attaching the image here. Should the gaussians not concentrate around the edges and high-loss regions towards the end?

[image]

I'm trying to find the pixels which a particular gaussian contributes to. Is this the correct way of doing that?

maturk commented 10 months ago

@saswat0 I have noticed that during optimisation on a single frame, the algorithm tends to discard a lot of the initialised gaussians by moving them out of bounds so they don't affect the final color. Also, the SimpleTrainer really is just that, a very simple trainer: the learning rates and scheduling of the optimisable gaussian attributes have not been fine-tuned for best performance. I am sure you have noticed that with the script the end result might still be a bit blurry no matter how long you train. If you want better performance, you can consider adding some optimisation tricks like the following in def train(...):

```python
def train(self, iterations: int = 1000, lr: float = 0.01, save_imgs: bool = False):
    optimizer = optim.Adam(
        [
            {"params": self.means, "lr": lr},
            {"params": self.quats, "lr": lr},
            {"params": self.scales, "lr": lr},
            {"params": self.opacities, "lr": 2 * lr},
            {"params": self.rgbs, "lr": 2 * lr},
        ]
    )

    scheduler = torch.optim.lr_scheduler.ChainedScheduler(
        [
            # short linear warmup from 1% of the base lr
            torch.optim.lr_scheduler.LinearLR(
                optimizer, start_factor=0.01, total_iters=10
            ),
            # step decay at 1/2 and 3/4 of training
            torch.optim.lr_scheduler.MultiStepLR(
                optimizer,
                milestones=[
                    iterations // 2,
                    iterations * 3 // 4,
                ],
                gamma=0.33,
            ),
        ]
    )
```
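A runnable toy version of how this optimizer/scheduler pair behaves over a training loop (my own dummy parameter and loss, not SimpleTrainer itself): the lr warms up linearly for 10 iterations, then decays by 0.33 at the halfway and three-quarter marks.

```python
import torch
from torch import optim

iterations, lr = 100, 0.01
param = torch.nn.Parameter(torch.zeros(3))  # stand-in for the gaussian attributes
optimizer = optim.Adam([{"params": [param], "lr": lr}])
scheduler = torch.optim.lr_scheduler.ChainedScheduler(
    [
        torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.01, total_iters=10),
        torch.optim.lr_scheduler.MultiStepLR(
            optimizer, milestones=[iterations // 2, iterations * 3 // 4], gamma=0.33
        ),
    ]
)

lrs = []
for _ in range(iterations):
    optimizer.zero_grad()
    (param ** 2).sum().backward()  # dummy loss
    optimizer.step()
    scheduler.step()
    lrs.append(optimizer.param_groups[0]["lr"])
```

After the warmup, `lrs` sits at the base lr until the first milestone, then steps down twice.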

Yes, xys are directly the uv coordinates of the projected centers of the gaussians. However, since each gaussian also has a scale, it influences the nearby pixels surrounding its center as well.

saswat0 commented 10 months ago

@maturk Thanks a lot for this! So is it correct to assume that a gaussian has the uv coordinates as its center and its radii entry as its radius, and that all the pixels inside this circle are affected by that gaussian?

maturk commented 10 months ago

@saswat0 yes
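A small sketch of that picture (my own toy code, not gsplat's rasteriser): collect the integer pixels inside the circle defined by a gaussian's uv center and its screen-space radius.

```python
import torch

def pixels_in_gaussian(center, radius, H, W):
    # integer pixel grid; ys indexes rows, xs indexes columns
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    dist2 = (xs - center[0]) ** 2 + (ys - center[1]) ** 2
    mask = dist2 <= radius ** 2
    return torch.nonzero(mask)  # (N, 2) tensor of (row, col) pixel indices

# a gaussian centered at uv = (2, 2) with radius 1 in a 5x5 image
pix = pixels_in_gaussian(torch.tensor([2.0, 2.0]), 1.0, 5, 5)
```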

saswat0 commented 10 months ago

@maturk Thank you. I have one last question: will this reflect in the authors' implementation as well?

maturk commented 10 months ago

@saswat0 what do you mean by that? I did not fully understand the question.

saswat0 commented 10 months ago

@maturk If I use the same constructs in the authors' implementation (returning point_xy_image and radii), would they mean the same thing (for finding the pixels affected by a gaussian)?

maturk commented 10 months ago

@saswat0 yes, it looks like those correspond to the same variables as in gsplat. Remember that directly outputting the xy centers won't tell you enough about the final pixel color, because you need to know the z_depth for alpha-compositing overlapping gaussians correctly (the frontmost gaussian has the biggest impact on the pixel color, for example). You should look into the sorting and binning code, which sorts the gaussians by depth into tiles (decomposing the input image into a grid of 16 by 16 pixel tiles); indexing that structure is relatively fast for each pixel (i, j) coordinate.
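An illustrative, much-simplified Python version of that sort-and-bin idea (the real code is a CUDA kernel, and it also assigns a gaussian to every tile its radius overlaps, not just the tile of its center):

```python
import torch

def sort_and_bin(xys, depths, H, W, tile=16):
    order = torch.argsort(depths)               # front-to-back by z_depth
    xys = xys[order]
    # tile coordinates of each (sorted) gaussian center
    tx = (xys[:, 0] // tile).long().clamp(0, (W - 1) // tile)
    ty = (xys[:, 1] // tile).long().clamp(0, (H - 1) // tile)
    tile_id = ty * ((W + tile - 1) // tile) + tx
    bins = {}
    for g, t in enumerate(tile_id.tolist()):
        bins.setdefault(t, []).append(g)        # indices into the sorted arrays
    return bins, order

xys = torch.tensor([[5.0, 5.0], [20.0, 5.0], [6.0, 6.0]])
depths = torch.tensor([2.0, 1.0, 0.5])
bins, order = sort_and_bin(xys, depths, H=32, W=32)
```

Each pixel (i, j) then only composites the depth-ordered gaussians in bin (i // 16, j // 16).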

saswat0 commented 10 months ago

@maturk I see. But if I follow it as is, the results seem unjustifiable (i.e., the splats are concentrated in an extremely small space around the origin). This holds across different cameras and throughout training.

[image: mine]

saswat0 commented 10 months ago

This is what I get if I directly plot the 3D gaussians (omitting the z coordinate). This looks convincing, but the previous one doesn't. Also, the coordinate axes seem a bit off in both: they range from -300 to 200 for the 3D gaussians but from -6000 to 6000 for their splats. The point density of the plots is also quite different: it is very high in the splats plot but not in the 3D gaussian plot.

[image: real]

maturk commented 10 months ago

@saswat0 are you now using nerfstudio or inria to train on multiple images with camera poses? Remember that in real scenes with moving cameras, most of the gaussians do not project onto the (H, W) image because they are unseen from the camera. So for a scene with many millions of gaussians, the plots are perhaps justifiable.

saswat0 commented 10 months ago

This is while using inria

saswat0 commented 10 months ago

@maturk I didn't get this exactly. How is it that all of the gaussians occupy a small region around (0, 0)? What does it have to do with being or not being seen from the camera?

maturk commented 10 months ago

@saswat0, projecting 3D coordinates to pixel coordinates uses the projection matrix. So if some 3D coordinates range over [-200, 200] in world xyz coordinates and they are projected onto a single image in pixel coordinates, it is very much possible that the pixel-coordinate bounds are much larger.
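A toy pinhole projection makes this concrete (the focal lengths and principal point below are made-up values, not gsplat's exact camera model):

```python
import torch

fx = fy = 500.0        # hypothetical focal lengths, in pixels
cx, cy = 320.0, 240.0  # hypothetical principal point for a 640x480 image

def project(p_cam):
    # perspective projection of a camera-space point to pixel coordinates
    x, y, z = p_cam
    return torch.tensor([fx * x / z + cx, fy * y / z + cy])

# a point at x = -200 but only depth 10 lands far off-screen:
uv = project(torch.tensor([-200.0, 0.0, 10.0]))  # u = 500 * -200 / 10 + 320 = -9680
```

So world coordinates in [-200, 200] can easily produce pixel coordinates in the thousands, consistent with the [-6000, 6000] axis range above.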

maturk commented 10 months ago

@saswat0 the xy coordinates are pixel coordinates on an image plane! They are not world coordinates.

saswat0 commented 10 months ago

@maturk Yes. Exactly. So the pixel coordinates of the 3D gaussian splats are concentrated only on very small regions near (0,0)

maturk commented 10 months ago

@saswat0, like I mentioned in my previous comment: at the start of projection, the xy coordinates of all gaussians that are projected to a frame are initialised as a tensor of (0,0)s. This is because we need to allocate CUDA memory prior to projection. Then, inside the projection function, the xy coordinates are filled in based on the perspective projection of the 3D coordinates into 2D pixel coordinates. If some 3D coordinates fall out of bounds of the image, their (0,0) locations are never updated and they remain stored as (0,0). In your code, you should mask out xys != (0,0), keep xys < (W, H), and only look at valid projections.

maturk commented 10 months ago

@saswat0 just mask your xys accordingly; something like this could work (I do not guarantee this is correct, just some code I slapped together):

```python
# flooring converts a pixel projection like [0.5, 0.5] to the correct [0, 0]
# uv coordinate in the ideal case
xy_to_pix = torch.floor(xys).long()
# strict > 0 also drops the (0, 0) sentinel of never-projected gaussians;
# note that > 0.0 values give valid depths
valid_indices = (
    (xy_to_pix[:, 0] > 0)
    & (xy_to_pix[:, 0] < W)
    & (xy_to_pix[:, 1] > 0)
    & (xy_to_pix[:, 1] < H)
)
xy_to_pix = xy_to_pix[valid_indices]
```

Now plot these xy_to_pix values, which are integer pixel values, not floats like xys.
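From these integer coordinates you can also count hits per pixel directly, e.g. via bincount on flattened indices (a quick sketch with made-up points):

```python
import torch

H, W = 4, 4
xy_to_pix = torch.tensor([[1, 2], [1, 2], [3, 0], [0, 3]])   # (x, y) integer pairs
lin = xy_to_pix[:, 1] * W + xy_to_pix[:, 0]                  # row-major linear index
counts = torch.bincount(lin, minlength=H * W).reshape(H, W)  # hits per pixel
```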

maturk commented 10 months ago

> @maturk Yes. Exactly. So the pixel coordinates of the 3D gaussian splats are concentrated only on very small regions near (0,0)

Yes because they are initialised as (0,0) and never updated!

saswat0 commented 10 months ago

@maturk That did the trick. I was filtering on (H, W) but not on (0, 0). Thanks a ton! You've been a HUGE help!

Btw, the 2D splats plot has a lot of white space. Does that mean what it seems to, that no gaussians are actually attending to those regions?

saswat0 commented 10 months ago

Also, is there any way to determine the limits of (H, W) for the 2D splats? They don't seem to be the same as the image and have negative coordinates as well. Should I assume (0, 0) is the image center?

maturk commented 10 months ago

@saswat0, you are only plotting the centers of the gaussians. In the real rendered image, the gaussians have a scale and the white spaces are filled with the correct color. The (H, W) limits are the height and width of your image... (0, 0) is the top-left corner, because these are pixel coordinates. The xys that are returned are in pixel coordinates, but they are floats instead of ints. Refer to this for the code.

fkcptlst commented 7 months ago


Hi, could you please share the code for the density calculation? I'm having trouble computing a histogram over point sets since it's too slow. Is there a nice way to implement this? Thanks!

saswat0 commented 7 months ago

@fkcptlst I use mpl_scatter_density for density plotting
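If mpl_scatter_density is not an option, plain `np.histogram2d` is also fast for large point sets, since it bins everything in one vectorized call (a sketch; the bin count is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
xy = rng.normal(size=(100_000, 2))  # stand-in for the projected xys
# one vectorized call bins all points; show the result with plt.imshow
density, xedges, yedges = np.histogram2d(xy[:, 0], xy[:, 1], bins=256)
```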

fkcptlst commented 7 months ago

> mpl_scatter_density

Thanks!