IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
https://deepdataspace.com/blog/T-Rex

the result is not good #36

Closed zzk88862 closed 8 months ago

zzk88862 commented 8 months ago

image

The result is not good. Did I do something wrong in my steps?

Mountchicken commented 8 months ago

Hi @zzk88862, thanks for raising this. This is actually a common case. In generic visual prompt mode (cross-image), you may need more than one prompt image: objects of the same class can look very different from one image to the next (large intra-class variation), so the generic visual embedding should be built from multiple visual examples taken from different images. For example, with a different prompt image it works better:

image
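
To make the idea of a generic visual embedding built from several prompt images more concrete, here is a minimal sketch that takes the element-wise mean of per-image prompt embeddings. It is only an illustration under stated assumptions, not the T-Rex2 API: the helper `average_prompt_embeddings` and the toy three-dimensional vectors are hypothetical, and the way the actual model combines its embeddings may differ.

```python
from typing import List, Sequence


def average_prompt_embeddings(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of per-image prompt embeddings (hypothetical helper)."""
    if not embeddings:
        raise ValueError("need at least one prompt embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    n = len(embeddings)
    # Average each coordinate across all prompt images.
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


# Toy stand-ins for embeddings taken from three different prompt images
# of the same class (made-up values, not real model outputs).
per_image = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
]
generic = average_prompt_embeddings(per_image)
print(generic)
```

With a single prompt image the "generic" embedding is just that image's embedding, so any quirk of that one example carries straight through; adding examples from other images pulls the result toward what the class has in common.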