IDEA-Research / T-Rex

API for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
https://deepdataspace.com/home

About visual prompt. #73

Closed hengseuer closed 4 days ago

hengseuer commented 5 days ago

Hi @Mountchicken,

Suppose the batch size is set to 4.

The four images contain the following categories and instance counts:

- Image 1: categories A and B, with N1_A and N1_B instances, respectively.
- Image 2: categories C and D, with N2_C and N2_D instances, respectively.
- Image 3: categories A and B, with N3_A and N3_B instances, respectively.
- Image 4: categories C and D, with N4_C and N4_D instances, respectively.

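For category A in the first image, is the number of instances used for its visual prompt selected randomly from 1 to N1_A (the first image only), or from 1 to (N1_A + N3_A) (every image in the batch that contains A)?

If category C is a negative category for the first image, is the number of instances used for its visual prompt selected randomly from 1 to N2_C (or N4_C), i.e. from a single image, or from 1 to (N2_C + N4_C)?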
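To make the two options concrete, here is a minimal Python sketch of the sampling I have in mind. Everything in it is hypothetical: the `batch` layout, the instance counts, and the helpers `make_instances`, `candidate_pool`, and `sample_prompt_instances` are made up for illustration and are not taken from the T-Rex2 code. It only shows how the candidate pool differs between the per-image and batch-wide interpretations; which one is actually used is exactly what I am asking.

```python
import random

def make_instances(category, image_idx, count):
    # Placeholder instances; in practice these would be GT boxes.
    return [(category, image_idx, i) for i in range(count)]

# Hypothetical batch of 4 images: category -> list of instances.
# Assumed counts: N1_A=2, N1_B=1, N2_C=3, N2_D=1, N3_A=4, N3_B=2, N4_C=2, N4_D=1.
batch = [
    {"A": make_instances("A", 0, 2), "B": make_instances("B", 0, 1)},
    {"C": make_instances("C", 1, 3), "D": make_instances("D", 1, 1)},
    {"A": make_instances("A", 2, 4), "B": make_instances("B", 2, 2)},
    {"C": make_instances("C", 3, 2), "D": make_instances("D", 3, 1)},
]

def candidate_pool(batch, category, image_indices):
    # Gather all instances of `category` from the chosen images.
    return [
        inst
        for idx in image_indices
        for inst in batch[idx].get(category, [])
    ]

def sample_prompt_instances(pool, rng):
    # Pick k uniformly in [1, len(pool)], then draw k distinct instances.
    if not pool:
        raise ValueError("no instances available for this category")
    k = rng.randint(1, len(pool))
    return rng.sample(pool, k)

rng = random.Random(0)

# Positive category A for image 1 (index 0).
per_image_A = candidate_pool(batch, "A", [0])                  # pool size N1_A = 2
batch_wide_A = candidate_pool(batch, "A", range(len(batch)))   # pool size N1_A + N3_A = 6

# Negative category C for image 1: C only appears in images 2 and 4.
single_image_C = candidate_pool(batch, "C", [1])               # pool size N2_C = 3
batch_wide_C = candidate_pool(batch, "C", [1, 3])              # pool size N2_C + N4_C = 5

prompt_A = sample_prompt_instances(batch_wide_A, rng)
print(len(per_image_A), len(batch_wide_A), len(single_image_C), len(batch_wide_C))
```

In this sketch the only difference between the two interpretations is the list of image indices passed to `candidate_pool`; the random choice of how many instances to use is the same in both cases.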

Mountchicken commented 4 days ago

Hi @hengseuer

hengseuer commented 4 days ago

Thanks. Understood.