Some questions related to the paper

JIANG-CX commented 2 months ago

Impressive work! I came across some questions while reading your paper. The paper states that the large Gaussians are observable from multiple viewpoints, but that many of them are covered only partially, near their boundaries. Is this caused by smaller Gaussians occluding the center? If so, why do small Gaussians seem to be much less affected by such occlusion? I would greatly appreciate any clarification. Thank you for your time and assistance.

zhengzhang01 commented 2 months ago

Thank you for your interest in our work!

Our paper addresses the following issue. According to Formula 4 in the paper, when 3DGS is initialized, the radius of the Gaussian assigned to each point is determined by the distances to its three nearest points. Where the initial Structure-from-Motion (SfM) point cloud is sparse, neighboring points are far apart, so larger Gaussians are generated to fill these areas.
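To make the initialization effect concrete, here is a minimal sketch in plain Python. It assumes the radius is the plain mean of the distances to the three nearest neighbours; the exact aggregation in Formula 4 and in the 3DGS code may differ, and `initial_radii` and the toy point cloud are hypothetical names and data used only for illustration.

```python
import math

def initial_radii(points):
    """Return one initial radius per point.

    Assumption: the radius is the mean distance to the three nearest
    other points (the real implementation may aggregate differently).
    """
    radii = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:3]
        radii.append(sum(nearest) / len(nearest))
    return radii

# Hypothetical toy cloud: a dense cluster plus one isolated point.
cloud = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.0, 0.1, 0.0),
    (0.0, 0.0, 0.1),
    (5.0, 5.0, 5.0),  # lies in a sparse region
]
radii = initial_radii(cloud)
```

In this toy cloud the isolated point receives a radius far larger than the clustered points, which is the situation in which large Gaussians appear.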

The original 3DGS decides whether a Gaussian should "split" or "clone" by checking whether the magnitude of the gradient with respect to its Normalized Device Coordinates (NDC) position, averaged over the views in which it participates, exceeds a threshold ( $\frac{\sum \|\mathbf{g}_i\|}{\sum 1} > \tau_{\mathrm{pos}}$ ), where $\mathbf{g}_i$ is the gradient in view $i$.
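The condition above can be sketched as follows. This is an illustrative reading of the formula rather than the actual 3DGS code: `should_densify` is a hypothetical helper, and each view is assumed to contribute one 2D gradient `(gx, gy)`.

```python
import math

def should_densify(view_grads, tau_pos):
    """Return True when the mean gradient magnitude over views exceeds tau_pos.

    view_grads: list of (gx, gy) gradients, one per participating view
    (an assumed representation for this sketch).
    """
    if not view_grads:
        return False
    mean_norm = sum(math.hypot(gx, gy) for gx, gy in view_grads) / len(view_grads)
    return mean_norm > tau_pos

# Illustrative threshold and gradients (hypothetical values).
tau_pos = 2e-4
one_strong_view = [(3e-4, 4e-4)]                 # magnitude 5e-4
diluted = one_strong_view + [(0.0, 0.0)] * 3     # same view plus three near-zero views
```

Every view counts equally in this average, so adding views with near-zero gradients lowers the mean even when one view has a large gradient.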

We have found that this condition for point cloud growth does not work effectively for larger Gaussians, for two main reasons:

1. From a single viewpoint: Larger Gaussians cover more pixels, and the gradients contributed by different pixels within the same view cancel each other severely. Smaller Gaussians can also undergo gradient cancellation, but because of the Gaussian falloff, the per-pixel gradient magnitudes differ significantly, so the cancellation is less severe.
2. From different viewpoints: Large Gaussians occupy more 3D space, so more viewpoints are involved at the same spatial location than for smaller Gaussians. However, some of these viewpoints only touch the edges of the Gaussian. Because of the Gaussian falloff and gradient cancellation, these viewpoints exhibit small gradient magnitudes despite having higher reconstruction errors. This significantly lowers the average gradient value, making it difficult to "split" or "clone" the Gaussian.
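The within-view cancellation can be illustrated with a small sketch. It assumes the gradient of a view is the vector sum of per-pixel contributions; the function `net_gradient_norm` and all numbers below are hypothetical and chosen only to show the effect.

```python
import math

def net_gradient_norm(pixel_grads):
    """Magnitude of the summed per-pixel gradient vectors for one view."""
    gx = sum(g[0] for g in pixel_grads)
    gy = sum(g[1] for g in pixel_grads)
    return math.hypot(gx, gy)

def total_pixel_magnitude(pixel_grads):
    """Sum of per-pixel magnitudes, i.e. the signal before cancellation."""
    return sum(math.hypot(gx, gy) for gx, gy in pixel_grads)

# Hypothetical large Gaussian: contributions of similar size pull in opposite directions.
large = [(1.0, 0.0), (-0.9, 0.0), (0.0, 1.0), (0.0, -0.95)]
# Hypothetical small Gaussian: magnitudes differ a lot, so little is cancelled.
small = [(1.0, 0.0), (-0.1, 0.0)]

large_ratio = net_gradient_norm(large) / total_pixel_magnitude(large)
small_ratio = net_gradient_norm(small) / total_pixel_magnitude(small)
```

With these numbers only a small fraction of the large Gaussian's per-pixel signal survives the summation, while most of the small Gaussian's signal does.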

We address this issue with a weighted average that prioritizes viewpoints with larger absolute errors and higher gradients. Because of the Gaussian falloff, the gradient of a Gaussian at a given viewpoint comes mainly from a few pixels near its projected center. For a larger Gaussian, a viewpoint in which the center projects onto the screen involves more pixels in the calculation than a viewpoint that only touches its edge. Weighting each viewpoint by its pixel count therefore gives priority to the viewpoints that see the center. For smaller Gaussians, the number of pixels involved does not vary significantly across viewpoints, so the pixel-count weighting has minimal impact on them.
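A minimal sketch of the difference between the two averages follows. It assumes the weighted condition has the form $\frac{\sum m_i \|\mathbf{g}_i\|}{\sum m_i} > \tau_{\mathrm{pos}}$, where $m_i$ is the number of pixels involved in view $i$; the symbol $m_i$, the helper names, and the numbers are assumptions for illustration, not the repository's code.

```python
import math

def plain_average(view_grads):
    """Unweighted mean of per-view gradient magnitudes."""
    if not view_grads:
        return 0.0
    return sum(math.hypot(gx, gy) for gx, gy in view_grads) / len(view_grads)

def pixel_weighted_average(view_grads, pixel_counts):
    """Mean of per-view gradient magnitudes, weighted by pixel count per view."""
    total = sum(pixel_counts)
    if total == 0:
        return 0.0
    weighted = sum(
        m * math.hypot(gx, gy) for (gx, gy), m in zip(view_grads, pixel_counts)
    )
    return weighted / total

# Hypothetical large Gaussian: one view sees its center (many pixels, large
# gradient); three views only graze its edge (few pixels, tiny gradient).
grads = [(3e-4, 4e-4), (1e-5, 0.0), (1e-5, 0.0), (1e-5, 0.0)]
counts = [900, 20, 20, 20]
tau_pos = 2e-4  # illustrative threshold

plain = plain_average(grads)                      # pulled down by the edge views
weighted = pixel_weighted_average(grads, counts)  # dominated by the center view
```

Here the plain average falls below the threshold while the pixel-weighted average exceeds it. When all views involve a similar number of pixels, as assumed for small Gaussians, the two averages coincide.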

JIANG-CX commented 2 months ago

Thanks. Your reply has resolved my confusion.

zhengzhang01 / Pixel-GS

Some questions related to the paper #2