XFeiF opened this issue 2 years ago (status: Open)
The authors look at the self-attention of the [CLS] token across the heads of the last layer. This token is not attached to any label or supervision. The resulting attention maps show that the model automatically learns class-specific features, which leads to unsupervised object segmentations.
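To make the idea concrete, here is a minimal NumPy sketch of how per-head [CLS] attention maps could be pulled out of a last-layer attention tensor. It assumes the attention has shape `(num_heads, 1 + N, 1 + N)` with the [CLS] token at index 0 and the N patch tokens laid out row by row on an `h x w` grid. The function names `cls_attention_maps` and `mass_threshold_mask`, and the `keep_mass` parameter, are hypothetical and only for illustration; this is not the authors' code.

```python
import numpy as np


def cls_attention_maps(attn, grid_hw):
    """Return the [CLS]-to-patch attention of each head as a 2D map.

    attn: array of shape (num_heads, 1 + N, 1 + N), assumed to be the
          softmax attention of the last layer with [CLS] at index 0.
    grid_hw: (h, w) patch grid, with h * w == N.
    """
    h, w = grid_hw
    num_heads, n_tokens, _ = attn.shape
    if n_tokens != 1 + h * w:
        raise ValueError("token count does not match the patch grid")
    # Row 0 is the [CLS] query; columns 1: are the patch keys.
    cls_to_patches = attn[:, 0, 1:]
    return cls_to_patches.reshape(num_heads, h, w)


def mass_threshold_mask(maps, keep_mass=0.6):
    """Per head, keep the strongest patches until keep_mass of the
    (renormalized) attention is covered. Returns a boolean mask."""
    num_heads, h, w = maps.shape
    flat = maps.reshape(num_heads, -1)
    flat = flat / flat.sum(axis=1, keepdims=True)
    order = np.argsort(-flat, axis=1)  # strongest patch first
    sorted_vals = np.take_along_axis(flat, order, axis=1)
    mass_before = np.cumsum(sorted_vals, axis=1) - sorted_vals
    keep_sorted = mass_before < keep_mass
    mask = np.zeros_like(keep_sorted)
    # Scatter the keep decisions back to the original patch positions.
    np.put_along_axis(mask, order, keep_sorted, axis=1)
    return mask.reshape(num_heads, h, w)


# Toy input: one head, a 2x2 patch grid, hand-written [CLS] row.
toy_attn = np.full((1, 5, 5), 0.2)
toy_attn[0, 0] = [0.1, 0.6, 0.2, 0.05, 0.05]
toy_maps = cls_attention_maps(toy_attn, (2, 2))
toy_mask = mass_threshold_mask(toy_maps, keep_mass=0.6)
```

In practice the maps would be upsampled to the image resolution before display. The mass threshold is one simple way to turn a soft map into a rough segmentation mask, and other binarization rules would work as well.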
SSL segmentation magic -> SSL + Transformer's [CLS] token.
Paper
Code
Authors:
Mathilde Caron, Hugo Touvron, et al.
FBAI.
Highlights: