When the image contains no object matching the prompt, the largest object is treated as the prompted object

IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything

https://arxiv.org/abs/2401.14159

Apache License 2.0

14.88k stars 1.38k forks

When the image contains no object matching the prompt, the largest object is treated as the prompted object #409

Open tarepanda1024 opened 10 months ago

tarepanda1024 commented 10 months ago

The input prompts were sun / cat / dog, etc. Everything outlined in the images is wrong:

Below is the original image: 91ac6a31f5c5354578133315dab708ed

tarepanda1024 commented 10 months ago

@rentainhe can you help me or give me some advice?

rentainhe commented 10 months ago

There does appear to be an issue with the control over counterexamples in the Grounding-DINO model. This may be due to the model's weights. It might be worth trying better weights to see if it alleviates such a problem.

tarepanda1024 commented 10 months ago

Thanks, I will try a different set of model weights.

tarepanda1024 commented 10 months ago

Sorry, could you please confirm whether you mean replacing the model weights or adjusting the parameters in GroundingDINO_SwinB.cfg.py?

Do I need to change the models below, or adjust the config?

tarepanda1024 commented 10 months ago

My text prompt is "1cat ." and recognition failed on all of the photos below.

NormanBeta commented 10 months ago

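From my own experience using Grounding-DINO for open-set object detection, here are some observations:

Test as many text prompts as you can, and use idiomatic English (even I can't quite make sense of "1cat"). The boxes are generally accurate, but they may not match the text.
The box threshold can be raised somewhat, but setting the text threshold too high causes words to be cut off.
Filter out cases where the box covers too much of the whole image.
Add some heuristic joint filtering, for example: clothing must be accompanied by a human face.
Open-set zero-shot detection inevitably produces false positives. On large-scale data the accuracy is only a bit over 60%, and the rest still needs to be double-checked.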
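
The post-filtering suggestions above (a higher box threshold, dropping boxes that cover nearly the whole image, and a "clothing needs a face" style heuristic) could be sketched roughly as follows. This is plain Python and does not call Grounding-DINO itself. The `Detection` class, the function names, and the threshold values are hypothetical and chosen only for illustration. It also assumes boxes arrive as normalized (cx, cy, w, h), so adapt the area computation if your pipeline uses another format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detection (not a Grounding-DINO type)."""
    label: str                              # phrase matched to the box
    score: float                            # box confidence in [0, 1]
    box: Tuple[float, float, float, float]  # assumed normalized (cx, cy, w, h)


def area_fraction(det: Detection) -> float:
    """Fraction of the image covered by the box (valid for normalized w, h)."""
    _, _, w, h = det.box
    return w * h


def filter_detections(dets: List[Detection],
                      box_threshold: float = 0.4,
                      max_area_fraction: float = 0.8) -> List[Detection]:
    """Keep detections that are confident enough and not near full-image size.

    Both default values are illustrative assumptions, not recommended settings.
    """
    return [d for d in dets
            if d.score >= box_threshold
            and area_fraction(d) <= max_area_fraction]


def require_companion(dets: List[Detection],
                      label: str,
                      companion: str) -> List[Detection]:
    """Drop `label` detections unless a `companion` detection is also present."""
    has_companion = any(d.label == companion for d in dets)
    return [d for d in dets if d.label != label or has_companion]


example = [
    Detection("cat", 0.45, (0.5, 0.5, 0.98, 0.97)),    # covers ~95% of the image
    Detection("dog", 0.62, (0.3, 0.4, 0.2, 0.3)),
    Detection("clothes", 0.55, (0.7, 0.6, 0.2, 0.4)),  # no "face" detected
]
kept = filter_detections(example)
kept = require_companion(kept, label="clothes", companion="face")
print([d.label for d in kept])  # prints ['dog']
```

The near-full-image "cat" box is the failure pattern reported in this issue, so the area check targets it directly. Even so, a filter like this only reduces false positives and does not replace a manual double check.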

tarepanda1024 commented 10 months ago

OK, thanks~