fuweifu-vtoo closed this issue 3 weeks ago
Hi @fuweifu-vtoo, each visual prompt embedding can only come from one category.
For instance, if we consider a batch size of 2:
Suppose the first image contains categories A, B, and C. For each category, we randomly select between 1 and N instances of that category (N being the number of instances available for it) to form its visual prompt embedding. The first image therefore yields three visual prompt embeddings, one each for categories A, B, and C.
Similarly, for the second image, we will have three visual prompt embeddings corresponding to categories D, E, and F.
In symbolic form, writing V_X for the visual prompt embedding of category X:
For Image 1: V_A, V_B, V_C
For Image 2: V_D, V_E, V_F
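The per-category sampling described above can be sketched roughly as follows. This is only an illustration, not the T-Rex2 training code: the function name `sample_visual_prompts`, the `(category, box)` input format, and the example boxes are all hypothetical. The step that encodes the selected boxes into a single embedding is left out.

```python
import random
from collections import defaultdict


def sample_visual_prompts(gt_boxes, rng=None):
    """Group one image's GT boxes by category and sample 1..N boxes per category.

    gt_boxes: list of (category, box) pairs (hypothetical input format).
    Returns {category: [selected boxes]}; each list would feed one
    visual prompt embedding, so prompts never mix categories.
    """
    if rng is None:
        rng = random.Random()

    by_category = defaultdict(list)
    for category, box in gt_boxes:
        by_category[category].append(box)

    prompts = {}
    for category, boxes in by_category.items():
        # randint is inclusive on both ends: between 1 and all available boxes.
        k = rng.randint(1, len(boxes))
        prompts[category] = rng.sample(boxes, k)
    return prompts


# Hypothetical batch of 2 images with made-up boxes (x1, y1, x2, y2).
image_1 = [
    ("A", (0, 0, 10, 10)), ("A", (5, 5, 20, 20)),
    ("B", (30, 30, 40, 40)),
    ("C", (1, 2, 3, 4)), ("C", (7, 7, 9, 9)), ("C", (8, 1, 12, 6)),
]
image_2 = [
    ("D", (0, 0, 5, 5)),
    ("E", (10, 10, 15, 15)), ("E", (11, 12, 18, 19)),
    ("F", (20, 20, 25, 25)),
]

rng = random.Random(0)
batch_prompts = [sample_visual_prompts(img, rng) for img in (image_1, image_2)]
for idx, prompts in enumerate(batch_prompts, start=1):
    print(f"Image {idx}:", {cat: len(boxes) for cat, boxes in prompts.items()})
```

Each image ends up with exactly one group of boxes per category it contains, so a batch of 2 images with three categories each produces six separate visual prompts.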
Got it. Thanks.
In your paper, you mentioned: "we randomly choose between one to all available GT boxes to use as visual prompts."
Could the visual prompts selected here be from different categories?
Or do the visual prompts of T-Rex2 have to come from the same category?