IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
https://arxiv.org/abs/2401.14159
Apache License 2.0

The text label on the visualization image is not complete #38

Open Harry-zzh opened 1 year ago

Harry-zzh commented 1 year ago

Hi, thank you for your excellent work.

When I run the Grounded-Segment-Anything demo, the text prompt that I use is "pottedplant"; however, the text label that appears on the resulting visualization image is "potted".


I wonder why it happens. Thanks!

SlongLiu commented 1 year ago

Thanks for your question. Grounding DINO is a grounding model, which means it detects objects in the image and matches them to the corresponding phrases in the text prompt. The truncated label is probably because the confidence for "plant" is not high enough, so that part of the prompt is left out of the label. We suggest decreasing the text_threshold in the scripts.
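
To make the effect of text_threshold concrete, here is a minimal toy sketch in plain Python. It is not the repository's code: the helper `phrase_from_scores`, the split of "pottedplant" into two pieces, and the scores are all hypothetical, chosen only to illustrate how a per-token cutoff could drop "plant" from the label.

```python
def phrase_from_scores(tokens, scores, text_threshold):
    """Hypothetical helper: keep only tokens scoring above text_threshold."""
    kept = [tok for tok, score in zip(tokens, scores) if score > text_threshold]
    return " ".join(kept)


# Assumed split of the prompt and made-up confidences for one detected box.
tokens = ["potted", "plant"]
scores = [0.41, 0.22]

# With a cutoff of 0.25, "plant" (0.22) falls below it and is dropped.
print(phrase_from_scores(tokens, scores, text_threshold=0.25))  # potted

# Lowering the cutoff to 0.20 keeps both pieces.
print(phrase_from_scores(tokens, scores, text_threshold=0.20))  # potted plant
```

Under these assumptions, lowering the threshold restores the missing part of the label, at the cost of possibly admitting other low-confidence words from longer prompts.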