About scale multiplier

jjlinghu commented 1 month ago

Hi, thanks for this great work. I don't understand why a multiplier computed from the intrinsics and pixel_size is applied to the scales.

        scale_min = self.cfg.gaussian_scale_min
        scale_max = self.cfg.gaussian_scale_max
        scales = scale_min + (scale_max - scale_min) * scales.sigmoid()
        h, w = image_shape
        pixel_size = 1 / torch.tensor((w, h), dtype=torch.float32, device=device)
        multiplier = self.get_scale_multiplier(intrinsics, pixel_size)
        scales = scales * depths[..., None] * multiplier[..., None]

Is this to convert the scale factor from the image space to the camera space?

Thanks in advance!

donydchen commented 1 month ago

Hi @jjlinghu, the objective of scales = scales * depths[..., None] * multiplier[..., None] is to constrain the Gaussian scales in image space, since the model only sees projected information.

Regarding multiplying by depths: farther objects appear smaller when projected. Multiplying by the depths counteracts this, so the scales stay roughly similar for near and far objects when they are unprojected back to 3D space.
Regarding multiplying by multiplier: this constrains the Gaussian scale relative to the pixel width in image space, so that a Gaussian with scale 1 is roughly the size of 1 pixel in the image.

Overall, it aims to obtain smooth gradients for Gaussian scales to make the training more stable.
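
To make the two factors concrete, here is a minimal single-axis sketch in plain Python. It is not the repository's get_scale_multiplier, whose body is not shown in this thread and may differ (for example by handling both image axes or applying an extra constant). The helper names pixel_footprint and projected_pixels, the assumption that the focal length fx_norm is normalized by the image width, and all numeric values are hypothetical and chosen only for illustration.

```python
def pixel_footprint(fx_norm, width, depth):
    """World-space width covered by one pixel at the given depth (pinhole model).

    Assumes fx_norm is the focal length divided by the image width.
    """
    pixel_size = 1.0 / width            # pixel width in normalized image coordinates
    multiplier = pixel_size / fx_norm   # footprint of one pixel at unit depth
    return depth * multiplier           # the footprint grows linearly with depth


def projected_pixels(world_scale, fx_norm, width, depth):
    """Approximate width in pixels of an object of size world_scale at the given depth."""
    fx_pixels = fx_norm * width         # focal length expressed in pixels
    return world_scale * fx_pixels / depth


raw_scale = 2.0                 # hypothetical network output, in "pixels"
fx_norm, width = 0.8, 256       # hypothetical camera

for depth in (1.0, 5.0, 20.0):
    world_scale = raw_scale * pixel_footprint(fx_norm, width, depth)
    pixels = projected_pixels(world_scale, fx_norm, width, depth)
    print(f"depth={depth:5.1f}  world_scale={world_scale:.5f}  projected={pixels:.3f} px")
```

Under these assumptions the 3D scale grows with depth while the projected size stays at about raw_scale pixels at every depth, which is the sense in which a scale of 1 corresponds to roughly 1 pixel.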

jjlinghu commented 1 month ago

Thanks for your reply. It helps me a lot!

donydchen / mvsplat

About scale multiplier #27