IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
https://deepdataspace.com/home

bad result on unexpected prompt #8

Closed. spacewalk01 closed this issue 7 months ago

spacewalk01 commented 7 months ago

Thank you for your wonderful work. I tried it on some images and got excellent results on common classes. However, when I prompted it with an uncommon object such as 'rock', it detected the wrong objects (the common objects in the image): [image]

Mountchicken commented 7 months ago

Hi @spacewalk01, we have also noticed this potential over-fitting issue in T-Rex. When given only one prompt, T-Rex tends to detect the salient objects in the image and ignore the user's prompt. In such cases, you can provide additional prompts, and T-Rex will understand the user's intention most of the time. We are currently working on improving this. [image]

spacewalk01 commented 7 months ago

After I provided a few more prompts as feedback, it corrected its output. Nice work! [image]

spacewalk01 commented 7 months ago

I see. So we can provide a few prompts on the first image, but what happens with subsequent images? Should we do the same for each of them? I am glad to hear you are working on improving it. I also tried it on my own training images for detecting people, and it did an excellent job!

Mountchicken commented 7 months ago

It's hard to tell. In my experience, if the image is complex, i.e. it contains a lot of objects, then providing more prompts will give you higher accuracy.