Question 1

The overall aim of this code snippet is to define the Gaussian scales in image space rather than world space. Because the model only sees projected (image-space) information, it would be difficult or impossible for it to predict world-space Gaussian scales directly. Instead, it predicts image-space scales, which are then multiplied by depth to yield world-space scales. Multiplying by depth counteracts the fact that Gaussians farther from the camera appear smaller on the image plane.

Meanwhile, the multiplier's purpose is to let the minimum and maximum Gaussian scales be defined in terms of pixel widths on the image plane. Specifically, the multiplier is the coefficient that makes a Gaussian with scale 1 appear roughly 1 pixel wide on the image plane. Since Gaussians are fuzzy and don't have clear boundaries, this is only an approximate value.
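
To make the depth and multiplier relationship concrete, here's a minimal NumPy sketch. It is not the repository's code: the function name pixel_scale_to_world_scale, the focal_px parameter, and the simple pinhole model (multiplier = 1 / focal length in pixels) are assumptions for illustration, and it ignores the difference between Z depth and ray distance, so it only holds near the image center.

```python
import numpy as np

def pixel_scale_to_world_scale(scale_px, depth, focal_px):
    """Hypothetical helper: convert a scale in pixels to world units (pinhole sketch)."""
    # At unit depth, one pixel spans 1 / focal_px world units.
    # This plays the role of the multiplier in this simplified model.
    multiplier = 1.0 / focal_px
    # Farther Gaussians need a larger world-space scale to cover the same pixels.
    return scale_px * depth * multiplier

focal_px = 500.0
depths = np.array([1.0, 2.0, 4.0])

# A Gaussian with image-space scale 1 at several depths.
world_scales = pixel_scale_to_world_scale(1.0, depths, focal_px)

# Projecting back onto the image plane: size_px = world_size * focal_px / depth.
projected_px = world_scales * focal_px / depths
```

In this sketch the world-space scale grows linearly with depth, while the projected size stays at about 1 pixel at every depth.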

Question 2

Yes, it's distance along the ray. You can confirm this by looking at get_world_rays in src/geometry/projection.py, where you'll find this:

# normalize by ray length
directions = directions / directions.norm(dim=-1, keepdim=True)

If it were Z distance, the above line would be replaced with this:

# normalize by Z
directions = directions / directions[..., -1:]
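
To see the difference between the two conventions, here's a small NumPy sketch (the repository uses PyTorch; the example directions and depth value are made up for illustration). It scales the same camera-space rays by the same depth under each normalization.

```python
import numpy as np

# Unnormalized camera-space ray directions: one through the image center,
# one off-center. These values are illustrative only.
directions = np.array([[0.0, 0.0, 1.0],
                       [0.5, 0.5, 1.0]])
depth = 2.0

# Normalize by ray length: depth is the distance along the ray.
unit_dirs = directions / np.linalg.norm(directions, axis=-1, keepdims=True)
points_ray = depth * unit_dirs

# Normalize by Z: depth is the Z coordinate of the resulting point.
z_dirs = directions / directions[..., -1:]
points_z = depth * z_dirs

ray_lengths = np.linalg.norm(points_ray, axis=-1)  # equals depth for every ray
z_coords = points_z[..., -1]                       # equals depth for every ray
```

With ray-length normalization every point lies exactly `depth` away from the camera origin, so the off-center point has a Z coordinate smaller than `depth`. With Z normalization every point has Z equal to `depth` instead. The two conventions agree only for the ray through the image center.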

Hope this helps!

dcharatan / pixelsplat

Some questions about the code #37
