IDEA-Research / Grounded-SAM-2

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
https://arxiv.org/abs/2401.14159
Apache License 2.0
1.01k stars 94 forks

Segmenting Another Open-Vocabulary Model's (Object Detection) Results #36

Closed (sailfish009 closed this 2 months ago)

sailfish009 commented 2 months ago

Hello, I would like to detect objects using another open-vocabulary object detection model (DE-ViT: detect everything) and then segment them, using its detection results as the input. Is this possible? It would be great to have an example.

rentainhe commented 2 months ago

Yes. Grounded SAM and Grounded SAM 2 share the same idea: combine a promptable segmentation model with a detection model to get both detection and segmentation. The same approach will also work with DE-ViT + SAM 2. However, we want to keep our code clean for the community, so I think it would be more convenient for you to implement this idea in your local environment.
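
For anyone trying this locally, here is a minimal sketch of the glue code, not an official example from this repository. It assumes the detector (DE-ViT or any other) returns pixel-space boxes as `(x_min, y_min, x_max, y_max)` together with confidence scores. `select_boxes` and `segment_detections` are hypothetical helper names, and the `0.3` threshold is an arbitrary default. The predictor is only expected to expose `set_image` and `predict(..., box=..., multimask_output=...)`, which is how `SAM2ImagePredictor` from the `sam2` package is called.

```python
from typing import Any, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def select_boxes(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    image_size: Tuple[int, int],
    score_threshold: float = 0.3,  # arbitrary default, tune for your detector
) -> List[Box]:
    """Keep confident detections and clip them to the image bounds.

    image_size is (width, height). Boxes that collapse to zero area
    after clipping are dropped.
    """
    width, height = image_size
    kept: List[Box] = []
    for (x0, y0, x1, y1), score in zip(boxes, scores):
        if score < score_threshold:
            continue
        x0 = float(max(0, min(x0, width)))
        x1 = float(max(0, min(x1, width)))
        y0 = float(max(0, min(y0, height)))
        y1 = float(max(0, min(y1, height)))
        if x1 > x0 and y1 > y0:
            kept.append((x0, y0, x1, y1))
    return kept


def segment_detections(predictor: Any, image: Any, boxes: Any) -> Any:
    """Prompt a SAM 2 style predictor with detector boxes, one mask per box.

    With the real SAM2ImagePredictor, pass `image` as an RGB array and
    `boxes` as a NumPy array of shape (N, 4).
    """
    if len(boxes) == 0:
        return []
    predictor.set_image(image)
    masks, _mask_scores, _logits = predictor.predict(
        point_coords=None,
        point_labels=None,
        box=boxes,
        multimask_output=False,  # one mask per box prompt
    )
    return masks


# Usage sketch (not run here). `run_detector` is a hypothetical wrapper around
# DE-ViT; the config and checkpoint paths are placeholders.
#
#   import numpy as np
#   from sam2.build_sam import build_sam2
#   from sam2.sam2_image_predictor import SAM2ImagePredictor
#
#   predictor = SAM2ImagePredictor(build_sam2(CONFIG_PATH, CHECKPOINT_PATH))
#   boxes, scores = run_detector(image)
#   kept = select_boxes(boxes, scores, image_size=(image.shape[1], image.shape[0]))
#   masks = segment_detections(predictor, image, np.array(kept, dtype=np.float32))
```

Because the segmentation step only consumes boxes, swapping detectors should only require changing the code that produces `boxes` and `scores`.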

sailfish009 commented 2 months ago

Okay, thanks for the answer.