W-Ted / GScream

Official code for ECCV2024 paper: GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

questions about the method #6

Open chenj02 opened 1 month ago

chenj02 commented 1 month ago

Great work! But I have a few questions:

  1. Judging from the $Loss$ function, the model's results essentially depend on the output of the 2D inpainting model (since the mask region is optimized by $M'_i$, which is only used in $λ_1*M_0$). Is that right?
  2. Since the $Loss$ function only optimizes the mask region of the in-painted image (the whole image is used to optimize the render quality of that view), how do you ensure the quality of the mask region in other views?

W-Ted commented 1 month ago

Hi, @chenj02. Thank you for your interest in our GScream!

  1. Essentially, that's true. For the masked area, bidirectional cross-attention can help achieve smoother boundaries, but the reference view's RGBD is much more important for providing reference information.
  2. Considering 3D constraints, our bidirectional cross-attention module applies regularization to the masked area. From the perspective of 2D supervision, we attempted to use perceptual loss or an additional learned discriminator to constrain the masked area in other views, which resulted in some improvements. However, we opted not to employ these 2D constraints since we found that the existing pipeline is already adequate for producing favorable results.
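
To make the supervision pattern discussed in this thread concrete, here is a minimal sketch. It is not taken from the GScream code and does not reproduce the paper's loss. It only assumes what the exchange above states: pixels outside the object mask have a ground-truth target in every view, while pixels inside the mask have a target (the 2D-inpainted image) only in the reference view. The names `masked_l1`, `view_loss`, and `lambda_mask` are hypothetical, and images are flattened to plain Python lists for readability.

```python
def masked_l1(rendered, target, mask, inside):
    """Mean absolute error over pixels inside (mask == 1) or outside (mask == 0) the mask."""
    want = 1 if inside else 0
    diffs = [abs(r - t) for r, t, m in zip(rendered, target, mask) if m == want]
    return sum(diffs) / len(diffs) if diffs else 0.0


def view_loss(rendered, target, mask, is_reference, lambda_mask=1.0):
    """Per-view photometric loss under the assumptions stated above (illustrative only)."""
    # Background pixels are supervised in every view.
    loss = masked_l1(rendered, target, mask, inside=False)
    # Masked pixels have a target only in the reference view (the inpainted image).
    if is_reference:
        loss += lambda_mask * masked_l1(rendered, target, mask, inside=True)
    return loss


# Toy 4-pixel "image": the last two pixels fall inside the object mask.
rendered = [1.0, 0.0, 0.5, 0.5]
target = [1.0, 0.0, 0.0, 0.5]
mask = [0, 0, 1, 1]

ref_loss = view_loss(rendered, target, mask, is_reference=True)
other_loss = view_loss(rendered, target, mask, is_reference=False)
```

In this toy setup the error inside the mask contributes only to `ref_loss`; `other_loss` ignores it entirely. That is the gap the second question points at: under 2D supervision alone, the masked region in non-reference views receives no direct photometric signal, so its quality has to come from 3D-side regularization such as the cross-attention module mentioned in the reply.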