Open CYBruce opened 5 days ago
@CYBruce Sorry for the late reply. For each box, Grounding DINO combines all the prompt text whose confidence score is larger than the text threshold, so the returned label can be a combination of words taken from different phrases in the prompt. To avoid this, you can modify the local code to specify the phrases explicitly, as described here: https://github.com/IDEA-Research/GroundingDINO?tab=readme-ov-file#arrow_forward-demo
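To illustrate the behaviour described above, here is a minimal toy sketch in plain Python. It is not the actual Grounding DINO code: the per-token scores are made up, and `combine_tokens` and `best_prompt_class` are hypothetical helpers that only mimic the idea of thresholding per-token scores versus scoring each prompt phrase as a whole. For simplicity the sketch also ignores the "." separators when building a label.

```python
# Toy illustration only: the scores are invented and these helpers are
# hypothetical, not part of Grounding DINO or Grounded-SAM-2.

def combine_tokens(tokens, scores, text_threshold):
    """Join every non-separator token whose score exceeds the threshold."""
    kept = [t for t, s in zip(tokens, scores) if t != "." and s > text_threshold]
    return " ".join(kept)


def best_prompt_class(tokens, scores):
    """Score each '.'-separated phrase by its mean token score; return the best phrase."""
    phrases = []  # list of (phrase_text, mean_score)
    cur_tokens, cur_scores = [], []
    for t, s in zip(tokens, scores):
        if t == ".":
            if cur_tokens:
                phrases.append((" ".join(cur_tokens), sum(cur_scores) / len(cur_scores)))
            cur_tokens, cur_scores = [], []
        else:
            cur_tokens.append(t)
            cur_scores.append(s)
    if cur_tokens:  # prompt did not end with a separator
        phrases.append((" ".join(cur_tokens), sum(cur_scores) / len(cur_scores)))
    return max(phrases, key=lambda p: p[1])[0]


prompt = "car . bike . people . parking sign . parking entrance sign ."
tokens = prompt.split()
# One made-up score per token, for a single predicted box.
scores = [0.02, 0.0, 0.01, 0.0, 0.03, 0.0, 0.20, 0.30, 0.0, 0.22, 0.40, 0.35, 0.0]

combined = combine_tokens(tokens, scores, text_threshold=0.25)
snapped = best_prompt_class(tokens, scores)
empty = combine_tokens(tokens, [0.1] * len(tokens), text_threshold=0.25)

print(repr(combined))  # tokens from two different phrases get merged
print(repr(snapped))   # label restricted to one of the prompt phrases
print(repr(empty))     # no token passes the threshold
```

In this toy setting, thresholding individual tokens yields a label that was never in the prompt, and a box whose token scores all fall below the threshold gets an empty label. Scoring whole phrases instead always returns one of the given classes.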
Following the clearly written README, I got the model running successfully. However, I ran into some problems in my use case. I used the script `grounded_sam2_local_demo.py` with the prompt "car . bike . people . parking sign . parking entrance sign ."
But the returned JSON file contains classes that I did not specify, such as "sign entrance sign" and "entrance sign" (they look like combinations of prompt words). Sometimes an empty class name "" is output as well. Does the model label objects only with the classes given in the prompt? If so, where do results like "sign entrance sign" come from? Is this problem related to the `BOX_THRESHOLD` and `TEXT_THRESHOLD` parameters in the code?