IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
https://deepdataspace.com/blog/T-Rex

The interactive visual prompt yielded unsatisfactory results. #71

Closed: wangjl1993 closed this issue 4 months ago

wangjl1993 commented 5 months ago

image

Is this a bug? It always returns the wrong boxes.

wangjl1993 commented 5 months ago

I found that the "generic visual prompt" works better than the "interactive visual prompt".
image image

Mountchicken commented 5 months ago

Hi @wangjl1993, the interactive visual prompt is meant for the case where the prompt image and the target image are the same image. When they are different images, use the generic visual prompt instead.
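
A minimal sketch of that rule, for illustration only. The helper name `choose_visual_prompt_mode` and the returned strings are hypothetical and are not part of the T-Rex API; the sketch also assumes that "same image" can be decided by comparing file paths.

```python
import os


def choose_visual_prompt_mode(prompt_image: str, target_image: str) -> str:
    """Pick a prompt mode from two image paths (hypothetical helper, not T-Rex API).

    Returns "interactive" when both paths resolve to the same file path,
    and "generic" otherwise.
    """
    # Normalize both paths so "./a.jpg" and "a.jpg" compare as equal.
    prompt_path = os.path.normcase(os.path.abspath(prompt_image))
    target_path = os.path.normcase(os.path.abspath(target_image))
    return "interactive" if prompt_path == target_path else "generic"


# Boxes drawn on the image being searched: interactive visual prompt.
print(choose_visual_prompt_mode("shelf.jpg", "shelf.jpg"))

# Boxes drawn on a separate reference image: generic visual prompt.
print(choose_visual_prompt_mode("reference.jpg", "shelf.jpg"))
```

The path comparison is only a stand-in; in practice you know from your own pipeline whether the boxes were drawn on the target image or on a separate reference image.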

wangjl1993 commented 5 months ago

OK! thx~

fuweifu-vtoo commented 4 months ago

When there is only one reference image, and the reference image and the target image are different images, is there any difference between the interactive visual prompt (IVP) and the generic visual prompt (GVP)? I think that in this case IVP == GVP?