Closed by Jackman-Xu 1 year ago
The author is probably using the support class prototypes, which are computed by RoIAlign and GAP over the GT bboxes, to perform t-SNE. I seem to have reproduced the t-SNE experiment under the 10-shot setup, but not under the 2-shot setup.
Thanks for your great and inspiring work! The attention mechanism you designed is enlightening.
However, I still have some questions about the t-SNE visualization after reading the paper and code. Was the t-SNE visualization made using (1) the category codes' features obtained by RoIAlign and GAP, or (2) the decoder's final output features (the correctly predicted ones)?