NVlabs / VoxFormer

Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

Where is voxel query? #17

Closed Sangmin-Bak closed 1 year ago

Sangmin-Bak commented 1 year ago

Thank you for your research. I have a few questions.

I don't know exactly where the voxel query resides. I am curious about the following points.

  1. Is a voxel query a voxelized point cloud? If so, is the voxel query provided in sequences_msnet3d_sweep10?
  2. Is the voxel query different from the binary voxel grid map (M_in) introduced in the paper?
  3. Where is the voxel query implemented in the code?

Abde951 commented 1 year ago

I am currently working on this paper, so I will try to explain what I have understood:

1- First, they estimate a depth map using MobileStereoNet 3D (msnet3D) and back-project its pixels to obtain a 3D point cloud. They then build a binary grid M_in in which a voxel gets the value 1 if it contains at least one point of the estimated point cloud. So it is a kind of 3D occupancy grid.

2- Using this M_in map, we can extract the occupied voxels, meaning the voxels that are not occluded (the voxel queries).

3- The voxel queries are the ones provided in sequences_msnet3d_sweep10, because they are preprocessed from the data (images). You can see them being loaded in the load_scan() method of VoxFormer/projects/mmdet3d_plugin/datasets/semantic_kitti_dataset_stage2.py, and they are eventually used as input to the model.
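To make steps 1 and 2 concrete, here is a minimal NumPy sketch of the idea, not the repository's actual code. The function names, the pinhole intrinsics (fx, fy, cx, cy), the grid origin, and the voxel size are all assumptions chosen for illustration. It back-projects a depth map into a point cloud, marks every voxel that contains at least one point, and returns the indices of the occupied voxels.

```python
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) camera-frame point cloud.

    Assumes a simple pinhole camera; pixels with depth <= 0 are skipped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def voxelize_binary(points, origin, voxel_size, grid_shape):
    """Build a binary grid: a voxel is True if at least one point falls inside it."""
    idx = np.floor((points - np.asarray(origin)) / voxel_size).astype(np.int64)
    # Drop points that fall outside the grid volume.
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx = idx[inside]
    grid = np.zeros(grid_shape, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid


def occupied_voxel_indices(grid):
    """Return the (K, 3) integer indices of occupied voxels."""
    return np.argwhere(grid)


# Toy example with hypothetical intrinsics and grid parameters.
depth = np.zeros((2, 2))
depth[0, 0] = 1.0
depth[1, 1] = 2.0
points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
m_in = voxelize_binary(points, origin=(0.0, 0.0, 0.0), voxel_size=1.0, grid_shape=(3, 3, 3))
queries = occupied_voxel_indices(m_in)
```

In this toy setup only two pixels have valid depth, so two voxels end up occupied and `queries` has two rows. A real pipeline would also need the camera-to-grid transform, which is omitted here.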

RoboticsYimingLi commented 1 year ago

Thanks @Abde951 for the explanation. The M_in map is a voxelized pseudo point cloud used as input to the query proposal network (QPN). sequences_msnet3d_sweep10 is the output of the QPN (which takes the voxelization of the previous 10 pseudo sweeps as input) and is used as the query proposals in stage-2.
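The data flow described here could be sketched roughly as follows. This is only an illustration under stated assumptions: `merge_sweeps`, `stage1_query_proposals`, the `qpn` callable, and the 0.5 threshold are hypothetical names and values, not taken from the VoxFormer code, and the per-sweep grids are assumed to be already aligned to the current frame.

```python
import numpy as np


def merge_sweeps(binary_grids):
    """Union of per-sweep binary grids (assumed already aligned to one frame)."""
    merged = np.zeros_like(binary_grids[0], dtype=bool)
    for grid in binary_grids:
        merged |= grid.astype(bool)
    return merged


def stage1_query_proposals(m_in, qpn, threshold=0.5):
    """Run a hypothetical QPN on M_in and keep voxels predicted as occupied.

    `qpn` stands in for the query proposal network: any callable that maps a
    binary grid to per-voxel occupancy probabilities of the same shape.
    The threshold is an assumption made for this sketch.
    """
    prob = qpn(m_in)
    return np.argwhere(prob > threshold)


# Toy example: two sweeps and a stand-in "network" that echoes its input.
sweep_a = np.zeros((2, 2, 2), dtype=bool)
sweep_a[0, 0, 0] = True
sweep_b = np.zeros((2, 2, 2), dtype=bool)
sweep_b[1, 1, 1] = True

m_in = merge_sweeps([sweep_a, sweep_b])
proposals = stage1_query_proposals(m_in, qpn=lambda g: g.astype(float) * 0.9)
```

The resulting `proposals` array plays the role that the precomputed sequences_msnet3d_sweep10 files play in stage-2: a set of voxel locations used as queries.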