Text Prompts w/ Multiple Object Categories: Concepts incorrectly combined

egeozguroglu commented 1 year ago

Hi, I've been using GroundingDINO + SAM for our research and would like to query multiple object categories for my use case, e.g. "jug . onion . chair . toaster . wire . counter . glass . oil . potato . package ." (as suggested in this repo).
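For reference, a prompt in the format quoted above can be assembled from a plain list of category names. This is only a minimal sketch: `build_prompt` is a hypothetical helper name, not something provided by the repo, and lowercasing the names is an assumption.

```python
# Hypothetical helper (not part of the repo): joins category names into
# the " . "-separated prompt format quoted above.
def build_prompt(categories):
    # Assumption: names are stripped and lowercased; empty entries are dropped.
    cleaned = [c.strip().lower() for c in categories if c.strip()]
    return " . ".join(cleaned) + " ."


categories = ["jug", "onion", "chair", "toaster", "wire",
              "counter", "glass", "oil", "potato", "package"]
prompt = build_prompt(categories)
print(prompt)
```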

Unfortunately, when multiple object classes are added to the prompt as suggested, GroundingDINO's predictions combine some of the categories. I was able to reproduce the same error with your Hugging Face Spaces demo. See below.

Detection Prompt: "jug . onion . chair . toaster . wire . counter . glass . oil . potato . package ."

Input image:

Prediction Output:

In this case, glass and oil were combined into "glass oil," which is not desired behavior.

Would you have any insights on a quick solution? I will ultimately want to detect 300 object classes with one prompt, so resolving this is essential.

rentainhe commented 1 year ago

We will look into this problem soon.

egeozguroglu commented 1 year ago

Thanks, please let me know! @rentainhe @SlongLiu This is especially problematic when prompting GroundingDINO + SAM with a fixed vocabulary (305 nouns in our case). Many groups of nouns get tacked onto each other, e.g. "dough tofu potato peeler biscuit" becomes a single group when we prompt the model for separate predictions with 305 nouns.
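As a stopgap while this is open, one possible post-processing sketch is to split a merged phrase back into the vocabulary entries it contains. This is not a fix from the maintainers and does not change the model's behavior: it only recovers which known names appear in a predicted phrase, and it cannot tell which of those names the box actually belongs to. `split_merged_phrase` and `max_words` are hypothetical names, and whether "potato peeler" is a single vocabulary entry is an assumption.

```python
# Hypothetical post-processing helper (not part of the repo).
# Greedy longest-match split of a predicted phrase into known vocabulary entries.
def split_merged_phrase(phrase, vocabulary, max_words=3):
    vocab = {v.strip().lower() for v in vocabulary}
    tokens = phrase.lower().split()
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so multi-word entries win
        # over their single-word prefixes.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in vocab:
                found.append(candidate)
                i += n
                break
        else:
            i += 1  # token is not in the vocabulary; skip it
    return found


# Assumption: "potato peeler" is one entry in the vocabulary.
vocabulary = ["dough", "tofu", "potato", "potato peeler", "biscuit", "glass", "oil"]
print(split_merged_phrase("glass oil", vocabulary))
print(split_merged_phrase("dough tofu potato peeler biscuit", vocabulary))
```

A box labeled with more than one recovered name is still ambiguous, so this is at best a way to flag merged predictions for review or re-querying.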

mhyeonsoo commented 1 year ago

I have the same issue with multi-class objects.

Please let us know if there are any updates on this in the future :)

mcosano commented 1 year ago

Hello! :) Nice work! Is there any update on this issue? I am currently having the same problem.

IDEA-Research / Grounded-Segment-Anything

Text Prompts w/ Multiple Object Categories: Concepts incorrectly combined #217