IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
https://arxiv.org/abs/2303.05499
Apache License 2.0
6.54k stars 672 forks

REC functionality #249

Open MLRadfys opened 11 months ago

MLRadfys commented 11 months ago

Hi, and thanks for your awesome work on Grounding DINO. I mostly use Grounding DINO to label different object classes, which works really nicely.

Now I have tried to experiment with the model's REC (referring expression comprehension) functionality using the Swin-T checkpoint, but I am not able to get it working properly. I tried both the image with the differently colored cats and the image with the three lions.

For example, for the lion image I use the example prompt from the paper, "The left lion". This does not work at all, and this has been my experience with almost all images and prompts.

Now I am wondering whether I am doing something wrong.

Thanks in advance,

kind regards,

M

SunShineAwayD commented 8 months ago

I ran into the same problem.

caseygauss commented 7 months ago

Try lowering your `text_threshold` to ~0.01.
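To illustrate why this suggestion might help, here is a minimal sketch of the two thresholds. It assumes they behave as in the repository's `predict` helper in `groundingdino.util.inference`, as far as I understand it: `box_threshold` decides which candidate boxes are kept, and `text_threshold` decides which prompt tokens end up in each box's phrase label. The function names and all scores below are made up for illustration and do not come from a real model run.

```python
# Hypothetical stand-in for the thresholding step; not the library's own code.

def select_tokens(token_scores, tokens, text_threshold):
    """Join the prompt tokens whose score exceeds text_threshold."""
    return " ".join(t for t, s in zip(tokens, token_scores) if s > text_threshold)


def filter_queries(query_scores, tokens, box_threshold, text_threshold):
    """Keep boxes whose best token score exceeds box_threshold, then label them."""
    results = []
    for scores in query_scores:
        best = max(scores)
        if best <= box_threshold:
            continue  # box is dropped regardless of text_threshold
        results.append((round(best, 2), select_tokens(scores, tokens, text_threshold)))
    return results


tokens = ["the", "left", "lion"]

# Made-up per-token scores for three candidate boxes (illustrative only).
query_scores = [
    [0.02, 0.04, 0.45],
    [0.01, 0.02, 0.40],
    [0.01, 0.01, 0.10],
]

default = filter_queries(query_scores, tokens, box_threshold=0.35, text_threshold=0.25)
lowered = filter_queries(query_scores, tokens, box_threshold=0.35, text_threshold=0.01)

print(default)  # phrase labels with a higher text_threshold
print(lowered)  # phrase labels with text_threshold lowered to 0.01
```

Under these assumptions, lowering `text_threshold` changes only which words appear in the returned phrases. The set of boxes is still governed by `box_threshold`, so if no box matches "the left lion" at all, that value may need adjusting as well.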