Closed by Pixie8888 8 months ago
The overall aim of this code snippet is to define the Gaussian scales in image space rather than world space. Since the model only sees projected (image-space) information, it would be difficult or impossible for it to predict world-space Gaussian scales directly. Instead, it predicts image-space Gaussian scales, and these are multiplied by depth to yield world-space Gaussian scales. Multiplying by depth is necessary to counteract the fact that Gaussians that are farther away appear smaller on the image plane.
Meanwhile, the purpose of multiplier is to let the minimum and maximum Gaussian scales be defined in terms of pixel widths on the image plane. Specifically, multiplier is the coefficient that makes a Gaussian with scale 1 appear roughly 1 pixel wide on the image plane. Note that since Gaussians are fuzzy and don't have clear boundaries, this is only an approximate value.
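To make the scale reasoning concrete, here is a minimal sketch in plain Python. It is not the repository's code: it assumes a simple pinhole camera with a focal length given in pixels, looks at a point close to the optical axis (so ray distance and Z distance nearly coincide), and uses hypothetical helper names (pixel_footprint_multiplier, world_scale, projected_width_px).

```python
def pixel_footprint_multiplier(focal_length_px: float) -> float:
    # Assumption: pinhole camera. One pixel spans roughly 1 / f world units
    # at unit depth, so this plays the role of the multiplier.
    return 1.0 / focal_length_px


def world_scale(image_scale_px: float, depth: float, multiplier: float) -> float:
    # Image-space scale (in pixels) -> world-space scale.
    # Multiplying by depth undoes the perspective shrinkage.
    return image_scale_px * depth * multiplier


def projected_width_px(world_width: float, depth: float, focal_length_px: float) -> float:
    # Pinhole projection of a small world-space width back to pixels.
    return focal_length_px * world_width / depth


focal_length_px = 500.0  # hypothetical value, for illustration only
multiplier = pixel_footprint_multiplier(focal_length_px)

# A Gaussian with image-space scale 1 at two different depths.
near = world_scale(1.0, depth=2.0, multiplier=multiplier)
far = world_scale(1.0, depth=10.0, multiplier=multiplier)

near_px = projected_width_px(near, 2.0, focal_length_px)
far_px = projected_width_px(far, 10.0, focal_length_px)
```

Under these assumptions, the far Gaussian is larger in world space than the near one, yet both project to about one pixel, which is the behaviour the explanation above describes.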
Yes, it's distance along the ray. You can confirm this by looking at get_world_rays in src/geometry/projection.py, where you'll find this:
# normalize by ray length
directions = directions / directions.norm(dim=-1, keepdim=True)
If it were Z distance, the above line would be replaced with this:
# normalize by Z
directions = directions / directions[..., -1:]
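The difference between the two normalizations can be checked with a small sketch in plain Python. It is only an illustration: the camera is assumed to sit at the origin, the raw direction is an arbitrary example value, and the helper names are hypothetical rather than taken from the repository.

```python
import math


def normalize_by_ray_length(direction):
    # Unit-length direction: depth then measures distance along the ray.
    norm = math.sqrt(sum(c * c for c in direction))
    return tuple(c / norm for c in direction)


def normalize_by_z(direction):
    # Direction with Z component 1: depth then measures distance along Z.
    return tuple(c / direction[-1] for c in direction)


def point_along(direction, depth):
    # Point reached from a camera at the origin (assumption).
    return tuple(depth * c for c in direction)


raw_direction = (0.5, 0.25, 1.0)  # arbitrary off-axis ray, for illustration
depth = 3.0

p_ray = point_along(normalize_by_ray_length(raw_direction), depth)
p_z = point_along(normalize_by_z(raw_direction), depth)

ray_distance = math.sqrt(sum(c * c for c in p_ray))  # equals depth
z_of_ray_point = p_ray[-1]                           # smaller than depth
z_of_z_point = p_z[-1]                               # equals depth
```

With ray-length normalization the point lies exactly depth units from the camera, so its Z coordinate is smaller than depth for any off-axis ray. With Z normalization the Z coordinate itself equals depth.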
Hope this helps!
Hi,
I have some questions about the class GaussianAdapter in gaussian_adapter.py. Why is scales multiplied by depths and multiplier? How should multiplier be understood? Is depths the ray depth, i.e. the distance from the origin along the ray, rather than the distance along the z-axis?