jshtok / RepMet

Few-shot detection for visual categories
Apache License 2.0

Question about the learned representative #34

Open andrearosasco opened 2 years ago

andrearosasco commented 2 years ago

Hi @jshtok, thanks for the great paper. I just read it and there is something I did not quite understand. As I understand it, when training starts, the representatives of each class are just the randomly initialized output of a fully connected layer. Over time, they are learned so that the representative closest to an input's embedding belongs to the correct class.

This method, paired with the two losses, ensures that the DML module learns to map examples of the same class closer together.

When testing on unseen classes, the representatives are discarded and replaced with the embeddings (outputs of the DML module) of a few examples of each class.

Did I understand it correctly? My question is: why didn't you use actual examples during training? Couldn't learning the representatives and the distance metric together hurt the performance on unseen examples?
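
To make the setup described above concrete, here is a minimal pure-Python sketch of the idea as stated in this question. It is not the RepMet code: `nearest_class`, `learned_reps`, and `support_embeddings` are hypothetical names, the embeddings are hand-written toy vectors rather than DML outputs, and the plain nearest-representative rule is a simplifying assumption (the paper's actual scoring may differ).

```python
import math
import random


def nearest_class(embedding, reps_per_class):
    """Return the index of the class owning the representative nearest to `embedding`.

    `reps_per_class` is a list with one entry per class; each entry is a list
    of representative vectors for that class.
    """
    best_class, best_dist = None, math.inf
    for class_idx, reps in enumerate(reps_per_class):
        for rep in reps:
            d = math.dist(embedding, rep)  # Euclidean distance
            if d < best_dist:
                best_class, best_dist = class_idx, d
    return best_class


# Training time (assumed): representatives are learnable parameters that
# start out random. Here: 3 classes, 2 representatives each, 4 dimensions.
random.seed(0)
learned_reps = [
    [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(2)]
    for _ in range(3)
]

# Test time on unseen classes (assumed): the learned representatives are
# dropped and replaced by embeddings of a few support examples per new class.
support_embeddings = [
    [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]],  # novel class 0
    [[0.0, 1.0, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0]],  # novel class 1
]

query = [0.1, 0.8, 0.0, 0.0]  # toy embedding of a test example
pred = nearest_class(query, support_embeddings)
```

The same `nearest_class` function is used in both regimes; only the set of vectors it compares against changes, which is the substitution the question is asking about.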