aminebdj / OpenYOLO3D

Our OpenYOLO3D model achieves state-of-the-art performance in Open Vocabulary 3D Instance Segmentation on the ScanNet200 and Replica datasets, with up to ∼16x speedup compared to the best existing method in the literature.

Text-guided promptable segmentation #2

Closed by Yebulabula 1 month ago

Yebulabula commented 1 month ago

Dear authors,

Thanks for your fantastic work. It is very beneficial to the 3D open-vocabulary (OV) segmentation research community. May I ask whether OpenYOLO3D supports segmentation with text prompts, similar to SAM in 2D?

Thanks.

Best wishes, Ye

aminebdj commented 1 month ago

Dear @Yebulabula,

Thanks a lot for your interest in our work.

Yes, it is possible to prompt our model with text, similar to 2D SAM. We will share the checkpoints for single 3D scene inference and the data for evaluation in a few days, so please keep an eye on the repository.

Many thanks, Mohamed

aminebdj commented 1 month ago

Dear @Yebulabula,

I updated the repo for single-scene inference.

You can now run inference on your own 3D scene, which has to be structured like the Replica sample in Data preparation. Change the text prompt in the config.yaml file (line 9), then run the single-scene inference code from the repository.

Best, Mohamed

Yebulabula commented 1 month ago

Dear Mohamed,

Thanks for your response; it is very helpful. May I ask whether your method leverages both 2D and 3D segmentors for promptable segmentation, or only 2D ones?

Thanks.

Best wishes, Ye

aminebdj commented 1 month ago

Hello @Yebulabula,

Our model uses only a 3D segmentor to extract class-agnostic 3D instance masks. The 2D open-vocabulary object detector (YOLO-World) is used only to prompt those class-agnostic masks, not to segment.
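
To make this division of labor concrete, here is a minimal, illustrative sketch (not the code in this repository) of how 2D detections could assign a text label to one class-agnostic 3D mask: project the mask's points into a frame and take a majority vote over the detector boxes they land in. The pinhole projection, the `project` and `label_mask` helpers, the box format, and the first-match rule are all assumptions made for illustration.

```python
from collections import Counter


def project(point, focal, cx, cy):
    """Pinhole projection of a camera-frame 3D point (x, y, z) to a pixel (u, v).

    Hypothetical helper: returns None for points at or behind the camera plane.
    """
    x, y, z = point
    if z <= 0:
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def label_mask(mask_points, boxes, focal, cx, cy):
    """Pick a text label for one class-agnostic 3D mask by majority vote.

    mask_points: iterable of (x, y, z) points belonging to the mask.
    boxes: list of (u_min, v_min, u_max, v_max, label) tuples, standing in for
           the output of a 2D open-vocabulary detector on one frame.
    Returns the most frequent label, or None if no point falls inside a box.
    """
    votes = Counter()
    for p in mask_points:
        uv = project(p, focal, cx, cy)
        if uv is None:
            continue
        u, v = uv
        for u0, v0, u1, v1, label in boxes:
            if u0 <= u <= u1 and v0 <= v <= v1:
                votes[label] += 1
                break  # assumption: the first box containing the pixel wins
    return votes.most_common(1)[0][0] if votes else None


# Toy data: one 3D mask (four points) and two 2D detections in a single frame.
mask = [(0.0, 0.0, 2.0), (0.2, 0.0, 2.0), (-0.2, 0.1, 2.0), (0.8, -0.8, 2.0)]
detections = [(30, 30, 70, 70, "chair"), (80, 0, 100, 20, "lamp")]

print(label_mask(mask, detections, focal=100.0, cx=50.0, cy=50.0))  # prints: chair
```

In this toy example three of the four projected points fall inside the "chair" box and one inside the "lamp" box, so the mask is labeled "chair". The 2D detector never produces a mask itself; it only supplies labels for masks that the 3D segmentor has already proposed.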

Best, Mohamed

Yebulabula commented 1 month ago

Hi Mohamed,

Thanks. I get it. I will close this issue. Your explanation is very clear.

Best, Ye