spla-tam / SplaTAM

SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM (CVPR 2024)
https://spla-tam.github.io/
BSD 3-Clause "New" or "Revised" License
1.58k stars 174 forks

Clarification Needed on Code Questions #122

Open AutoSenseTech opened 4 months ago

AutoSenseTech commented 4 months ago
[Attached image: WechatIMG4500]

I have a few questions about the code:

  1. Regarding the Gaussian scale, why is the Gaussian scale set to depth/fx? What is the basis for this approach?
  2. The parameter mean_sq_dist_method is set to 'projective'. What is the purpose of using 'projective'?
  3. What is the function of variables['means2D'] in the code? Aren't we passing 3D coordinates for rendering? Why do we need to calculate the gradient of means2D?

Let me know if you need further assistance or explanations.

Nik-V9 commented 2 months ago

Hi, the original 3DGS initializes the size of the Gaussians by using the 3D chamfer distance between points (which can be slow). Essentially, this initialization aims to make Gaussians near the camera smaller and make them larger as you move farther away from the camera.

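A faster alternative is to use projective geometry, where the scale/size of a Gaussian is obtained from the point's depth and the camera's focal length (depth / focal length). Under a pinhole camera model, this is roughly the 3D extent that a single pixel covers at that depth, so nearby points get small Gaussians and distant points get larger ones.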
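To make the projective initialization concrete, here is a minimal NumPy sketch. It assumes a pinhole camera and averages fx and fy into a single focal length; the function name `projective_scale` and the sample intrinsics are hypothetical and chosen only for illustration, so treat this as a rough outline rather than the exact code in the repository.

```python
import numpy as np

def projective_scale(depth, fx, fy):
    """Approximate Gaussian radius per point (hypothetical helper).

    Under a pinhole model, one pixel at depth z spans about z / f
    in metric units, so the scale grows linearly with depth.
    """
    depth = np.asarray(depth, dtype=np.float64)
    focal = 0.5 * (fx + fy)  # assumption: average the two focal lengths
    return depth / focal

# Illustrative depths (meters) and intrinsics (pixels); not from any dataset.
depth = np.array([0.5, 1.0, 4.0])
scales = projective_scale(depth, fx=600.0, fy=600.0)

# The squared scale plays the role that a squared nearest-neighbor distance
# plays in a 3DGS-style initialization; scales are usually stored in log space.
mean_sq_dist = scales ** 2
log_scales = np.log(np.sqrt(mean_sq_dist))
```

Because this is one division per pixel, it avoids any nearest-neighbor search over the point cloud, which is where the speed advantage comes from.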

The means2D param does not carry input coordinates; the 3D means are still what gets rendered. Its gradient captures each Gaussian's movement in image space (the screen-space positional gradient), which is then used to determine whether the Gaussian should be further split in the densification scheme. This is a parameter from the original 3D Gaussian Splatting. https://github.com/spla-tam/SplaTAM/blob/da6bbcd24c248dc884ac7f49d62e91b841b26ccc/utils/gs_external.py#L191
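The way an image-space gradient can drive densification can be sketched as follows. This is a simplified NumPy stand-in, not the repository's code: in practice the gradient would come from autograd on the rendered loss, whereas here `means2d_grad` is a hand-written array. The function names, the threshold value, and the sample numbers are all assumptions for illustration.

```python
import numpy as np

def accumulate_means2d_grad(grad_accum, denom, means2d_grad, seen):
    """Add the 2D gradient norm of each visible Gaussian to a running sum."""
    grad_accum[seen] += np.linalg.norm(means2d_grad[seen, :2], axis=-1)
    denom[seen] += 1  # count how many times each Gaussian was visible

def select_for_densification(grad_accum, denom, grad_thresh):
    """Flag Gaussians whose average image-space gradient is large."""
    avg = np.zeros_like(grad_accum)
    # Gaussians that were never seen keep an average of zero.
    np.divide(grad_accum, denom, out=avg, where=denom > 0)
    return avg >= grad_thresh

# Three Gaussians; the third is not visible in this view.
grad_accum = np.zeros(3)
denom = np.zeros(3)
means2d_grad = np.array([[3e-4, 4e-4, 0.0],
                         [1e-5, 0.0, 0.0],
                         [1.0, 1.0, 0.0]])
seen = np.array([True, True, False])

accumulate_means2d_grad(grad_accum, denom, means2d_grad, seen)
mask = select_for_densification(grad_accum, denom, grad_thresh=2e-4)
```

Only the first Gaussian is flagged here: its gradient norm exceeds the illustrative threshold, the second falls below it, and the third was never seen, so its large gradient entry is ignored.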