IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
https://deepdataspace.com/blog/T-Rex

About the Visual Prompt #87

Closed fuweifu-vtoo closed 1 month ago

fuweifu-vtoo commented 2 months ago

Hi, @Mountchicken

Suppose the batch size is set to 2.

In the first image, categories A and B have (N1_A) and (N1_B) instances, respectively. In the second image, categories A and C have (N2_A) and (N2_C) instances, respectively.

Based on your previous answer:

My question:

  1. In issue #85, is the sentence below an imprecise description?:

    during training, we generate prompts only within the same image, meaning that the embeddings for objects like dogs and cats are used only within the current image.

  2. In batch training, suppose the first image contains instances of category C that are not labeled, and only categories A and B are labeled (non-exhaustive annotation). According to the logic above, the query embeddings of category C will serve as negative prompts for the first image. Will this be a problem?

Mountchicken commented 2 months ago

Hi @fuweifu-vtoo

  1. Indeed, that description is not precise. During batch training, we also use visual prompts from the other images in the batch.
  2. This is an unavoidable problem in object detection, caused by the annotation quality of the datasets. But the model still works.
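To make the cross-image prompt sharing concrete, here is a minimal sketch of how per-image positive and negative prompt sets could be assembled in a batch. This is illustrative only; the function and variable names are hypothetical and do not come from the T-Rex2 codebase.

```python
# Hypothetical sketch: for each image in a batch, categories labeled in
# that image act as positive prompts, while categories labeled only in
# other batch images act as negative prompts.

def build_prompt_sets(batch_labels):
    """batch_labels: list of sets, one per image, of labeled category names.
    Returns, per image, the positive and negative prompt categories."""
    all_cats = set().union(*batch_labels)
    prompt_sets = []
    for cats in batch_labels:
        prompt_sets.append({
            "pos": sorted(cats),             # categories labeled in this image
            "neg": sorted(all_cats - cats),  # categories from other images
        })
    return prompt_sets

# Batch of two images, matching the example in the question:
# image 1 is labeled {A, B}, image 2 is labeled {A, C}.
batch = [{"A", "B"}, {"A", "C"}]
sets = build_prompt_sets(batch)
# Note: image 1 receives C as a negative prompt even if it contains
# unlabeled C instances -- the non-exhaustive annotation case above.
```

This also shows why non-exhaustive annotation matters: any category labeled elsewhere in the batch but unlabeled in the current image becomes a (possibly incorrect) negative for it.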