VinAIResearch / Open3DIS

Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)
https://open3dis.github.io/
BSD 3-Clause "New" or "Revised" License

GroundedDINO version #13

Closed: Yebulabula closed this issue 6 months ago

Yebulabula commented 6 months ago

Hi author,

In your paper, you state: "For Grounded-SAM, we utilize the Swin-B Grounding DINO decoder". However, I found that the code actually uses the Swin-T Grounding DINO decoder. May I ask whether we can safely switch the decoder to Swin-B? Thanks.

Best wishes, Ye Mao

PhucNDA commented 6 months ago

Hi @Yebulabula, sorry for the mistake in the paper. I think it is safe to use the Swin-B Grounding DINO decoder with the current source code. You can check the generated boxes or masks by turning this on: https://github.com/VinAIResearch/Open3DIS/blob/main/tools/grounding_2d.py#L300

Best.
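
For anyone trying the swap discussed above, here is a minimal sketch of selecting the Swin-T or Swin-B Grounding DINO config and checkpoint as a matched pair. It is not taken from the Open3DIS source. The helper names (`resolve_grounding_dino`, `load_grounding_dino`), the directory layout, and the variant keys are hypothetical, and the config and checkpoint file names are my recollection of the upstream GroundingDINO release, so please verify them against your own checkout. The only library call used is `load_model` from `groundingdino.util.inference`, which I believe takes a config path, a checkpoint path, and a device.

```python
from pathlib import Path

# Assumed file names from the upstream GroundingDINO release; verify locally.
GROUNDING_DINO_VARIANTS = {
    "swin_t": ("GroundingDINO_SwinT_OGC.py", "groundingdino_swint_ogc.pth"),
    "swin_b": ("GroundingDINO_SwinB_cfg.py", "groundingdino_swinb_cogcoor.pth"),
}


def resolve_grounding_dino(
    variant,
    config_dir="GroundingDINO/groundingdino/config",  # hypothetical layout
    weights_dir="weights",  # hypothetical layout
):
    """Return (config_path, checkpoint_path) for the requested backbone."""
    if variant not in GROUNDING_DINO_VARIANTS:
        known = ", ".join(sorted(GROUNDING_DINO_VARIANTS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}")
    config_name, checkpoint_name = GROUNDING_DINO_VARIANTS[variant]
    return Path(config_dir) / config_name, Path(weights_dir) / checkpoint_name


def load_grounding_dino(variant, device="cuda"):
    """Load the model; requires the groundingdino package to be installed."""
    # Imported lazily so the path helper works without the package.
    from groundingdino.util.inference import load_model

    config_path, checkpoint_path = resolve_grounding_dino(variant)
    return load_model(str(config_path), str(checkpoint_path), device=device)
```

Calling `resolve_grounding_dino("swin_b")` returns the paths to pass in, and `load_grounding_dino("swin_b")` loads the model. Keeping the config and checkpoint in one table avoids pairing a Swin-B config with Swin-T weights, which would likely fail when the state dict is loaded.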