donydchen / mvsplat

🌊 [ECCV'24] MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
https://donydchen.github.io/mvsplat

About scale multiplier #27

Closed (jjlinghu closed this issue 1 month ago)

jjlinghu commented 1 month ago

Hi, thanks for this great work. I don't understand why the scales are multiplied by a factor computed from the intrinsics and pixel_size.

        scale_min = self.cfg.gaussian_scale_min
        scale_max = self.cfg.gaussian_scale_max
        scales = scale_min + (scale_max - scale_min) * scales.sigmoid()
        h, w = image_shape
        pixel_size = 1 / torch.tensor((w, h), dtype=torch.float32, device=device)
        multiplier = self.get_scale_multiplier(intrinsics, pixel_size)
        scales = scales * depths[..., None] * multiplier[..., None]

Is this to convert the scale factor from the image space to the camera space?

Thanks in advance!

donydchen commented 1 month ago

Hi @jjlinghu, the line scales = scales * depths[..., None] * multiplier[..., None] constrains the Gaussian scales in image space, since the model only sees projected (2D) information.

Overall, it aims to obtain smooth gradients for Gaussian scales to make the training more stable.
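
To make the "image space" idea concrete, here is a minimal scalar sketch. It is not the repository's get_scale_multiplier, whose body is not shown in this thread. It assumes a pinhole camera with normalized intrinsics (focal length divided by image width), and the names pixel_footprint, gaussian_scale and projected_size_px are hypothetical. Under those assumptions, multiplying by depth and by pixel_size / focal turns a scale expressed in pixels into a camera-space size, so the bounded sigmoid output controls how many pixels the Gaussian covers regardless of depth.

```python
import math


def pixel_footprint(depth, focal_norm, image_width_px):
    """Camera-space width covered by one pixel at the given depth.

    Assumption: focal_norm is the focal length divided by the image width
    (normalized intrinsics), so one pixel spans 1 / image_width_px.
    """
    pixel_size = 1.0 / image_width_px
    return depth * pixel_size / focal_norm


def gaussian_scale(sigmoid_out, depth, focal_norm, image_width_px,
                   scale_min, scale_max):
    """Map a sigmoid output in (0, 1) to a camera-space Gaussian scale."""
    # Bounded scale, interpreted here as a size in pixels (an assumption).
    scale_px = scale_min + (scale_max - scale_min) * sigmoid_out
    return scale_px * pixel_footprint(depth, focal_norm, image_width_px)


def projected_size_px(scale, depth, focal_norm, image_width_px):
    """Project a camera-space size back to pixels with the same pinhole model."""
    return scale * focal_norm / depth * image_width_px


# Illustrative values only: the same network output at two different depths.
near = gaussian_scale(0.5, 2.0, 1.0, 256, scale_min=1.0, scale_max=3.0)
far = gaussian_scale(0.5, 4.0, 1.0, 256, scale_min=1.0, scale_max=3.0)

# The camera-space scale doubles with depth, but the on-screen size is unchanged.
assert math.isclose(far, 2.0 * near)
assert math.isclose(projected_size_px(near, 2.0, 1.0, 256),
                    projected_size_px(far, 4.0, 1.0, 256))
```

In this sketch the network output never has to compensate for depth, which is one plausible reading of why the gradients for the scales are better behaved.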

jjlinghu commented 1 month ago

Thanks for your reply. It helps me a lot!