nerfstudio-project / gsplat

CUDA accelerated rasterization of gaussian splatting
https://docs.gsplat.studio/
Apache License 2.0
2.23k stars 284 forks

[Bug] Sparse Tensors leading to incorrect gradients post densification. #426

Closed by rahul-goel 1 month ago

rahul-goel commented 1 month ago

I think that when packed=True is passed to the trainer, the gradient is wrong immediately after densification. Densification duplicates and prunes gaussians, which changes both the order and the count of the gaussians.

In this line, a sparse tensor is created with the densified gaussians, but the gaussian_ids were computed for the pre-densification configuration.

This can also be verified by printing the maximum value of gaussian_ids and the shape of the gaussian tensors post-densification. The maximum value of gaussian_ids is greater than the first dimension (the number of gaussians) of the post-densification tensors, so at least some ids no longer refer to a valid row.

(Pdb) print(self.splats[k].size())
torch.Size([52346, 3])
(Pdb) print(gaussian_ids.max())
tensor(54274, device='cuda:0')
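
To make the mismatch concrete, here is a minimal pure-Python sketch of the same check. It is a toy model, not gsplat code: find_stale_ids and the sample id list are hypothetical names and values chosen for illustration, and only the two numbers 52346 and 54274 come from the Pdb output above.

```python
def find_stale_ids(gaussian_ids, num_gaussians):
    """Return the ids that cannot index a tensor with num_gaussians rows."""
    return [i for i in gaussian_ids if not 0 <= i < num_gaussians]


# Hypothetical ids recorded during the forward pass, before densification.
# Only the last value is taken from the Pdb session above.
ids_before = [10, 40000, 54274]

# Row count after densification, as reported by self.splats[k].size().
num_after = 52346

stale = find_stale_ids(ids_before, num_after)
print(stale)
```

An out-of-range id is only the visible symptom. Because pruning and duplication also reorder the rows, an id that is still in range may point at a different gaussian than the one the gradient was computed for.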

Can someone also verify my thought process?

liruilong940607 commented 1 month ago

Ah, good catch. I think an easy fix for this might be to just move the strategy.step_post_backward call to after optimizer.step?

https://github.com/nerfstudio-project/gsplat/blob/05829ef9ce4780a82e75314950f567be7ef857ad/examples/simple_trainer.py#L727-L746

to

https://github.com/nerfstudio-project/gsplat/blob/05829ef9ce4780a82e75314950f567be7ef857ad/examples/simple_trainer.py#L778
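
Why the ordering matters can be sketched with a toy model in plain Python. This is an illustration under stated assumptions, not the trainer's code: apply_sparse_grad and densify are hypothetical stand-ins for the packed optimizer step and for strategy.step_post_backward, and the parameter values are made up.

```python
def apply_sparse_grad(params, ids, grads, lr=0.5):
    """Toy optimizer step: update only the rows named in ids."""
    out = list(params)
    for i, g in zip(ids, grads):
        out[i] -= lr * g
    return out


def densify(params, keep, duplicate):
    """Toy densification: keep selected rows, then append copies of some rows."""
    kept = [params[i] for i in keep]
    return kept + [params[i] for i in duplicate]


params = [1.0, 2.0, 3.0, 4.0]
ids, grads = [1, 3], [2.0, 4.0]  # packed gradient for rows 1 and 3

# Proposed order: optimizer step first, densification second.
stepped = apply_sparse_grad(params, ids, grads)
good = densify(stepped, keep=[1, 2, 3], duplicate=[3])

# Problematic order: densify first, then apply the now-stale ids.
densified = densify(params, keep=[1, 2, 3], duplicate=[3])
bad = apply_sparse_grad(densified, ids, grads)

print(good)
print(bad)
```

In the problematic order, the update meant for the original row 1 lands on what used to be row 2, because pruning row 0 shifted every row up by one. Applying the optimizer step first consumes the ids while they still match the layout they were computed for.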