Jieqianyu / SGN

Implementation of IEEE TIP 2024 paper - "Camera-based Semantic Scene Completion with Sparse Guidance Network"

Questions about the roles of occupancy prediction and geometry guidance modules #9

Open SeaBird-Go opened 4 months ago

SeaBird-Go commented 4 months ago

Hello, thanks for sharing this wonderful work.

When I read the paper, I was somewhat confused about the roles of the geometry guidance branch and the occupancy prediction branch. From my point of view, since you use binary occupancy to supervise the 3D features obtained from the 2D features, the 3D volume features are already occupancy-aware.

In that case, why do you need a separate occupancy prediction branch that relies on another depth estimation network? I think it makes the whole model more complicated. Could we instead apply a threshold to the 3D features to obtain the sparse voxel proposals?

Jieqianyu commented 4 days ago

Yes, applying a threshold on the 3D features can yield sparse voxel proposals; however, these proposals are not as accurate as those generated by depth estimation.
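
For readers following the thread, the thresholding idea discussed here can be sketched roughly as follows. This is only an illustration, not code from the SGN repository: the function name `sparse_voxel_proposals`, the default threshold of 0.5, and the use of nested Python lists as a stand-in for a per-voxel occupancy score volume are all assumptions made for the example.

```python
def sparse_voxel_proposals(occupancy, threshold=0.5):
    """Return (x, y, z) indices of voxels whose score exceeds the threshold.

    `occupancy` is a hypothetical X x Y x Z nested list of per-voxel
    occupancy scores in [0, 1]; in a real model it would be a tensor.
    """
    proposals = []
    for x, plane in enumerate(occupancy):
        for y, row in enumerate(plane):
            for z, score in enumerate(row):
                if score > threshold:
                    proposals.append((x, y, z))
    return proposals


# Toy 2 x 2 x 2 volume of made-up scores.
volume = [
    [[0.1, 0.9], [0.4, 0.2]],
    [[0.7, 0.3], [0.05, 0.6]],
]
proposals = sparse_voxel_proposals(volume, threshold=0.5)
```

The quality of such proposals depends entirely on how reliable the per-voxel scores are, which is the point made in the reply above: scores thresholded from the lifted 3D features are reported to be less accurate than proposals derived from depth estimation.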

Jieqianyu commented 4 days ago

The primary purpose of geometry guidance for 3D features derived from 2D features is to incorporate coarse volumetric information without adding any extra computation during the inference stage.
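
A rough sketch of what "guidance without extra inference cost" can look like in general: an auxiliary occupancy head is evaluated and penalized only when labels are available during training, and skipped entirely at inference. This is not the repository's implementation. The names `occupancy_bce`, `forward`, and `occupancy_head` are hypothetical, the binary cross-entropy loss is an assumption based on the "binary occupancy" supervision mentioned in the question, and plain floats stand in for per-voxel features.

```python
import math


def occupancy_bce(logits, targets):
    """Mean binary cross-entropy between per-voxel logits and 0/1 labels."""
    total = 0.0
    for logit, target in zip(logits, targets):
        # Numerically stable form: max(l, 0) - l * t + log(1 + exp(-|l|)).
        total += max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))
    return total / len(logits)


def forward(voxel_features, occupancy_head, targets=None):
    """Return the features, plus an auxiliary loss only when targets are given.

    `occupancy_head` is a hypothetical callable mapping one voxel feature
    to an occupancy logit. Without targets (inference) it is never called.
    """
    if targets is None:
        return voxel_features, None
    logits = [occupancy_head(f) for f in voxel_features]
    return voxel_features, occupancy_bce(logits, targets)
```

In this sketch the supervised features are returned unchanged in both modes, so the auxiliary head shapes them through the training loss alone and adds nothing to the inference path.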