Hi, although I have not experimented without a nonlinear projection head, I am fairly confident that the performance of MoCo/Exemplar would drop considerably without one, based on the ablations in the original SimCLR and MoCo papers (see the sketch after the reference below for what such a head looks like). The performance of pure MoCo/Exemplar on miniImageNet is given in the table in the README.md; please refer to it. Note that Exemplar is a weakly supervised method, and the performance gap is 7% for 1-shot and 4% for 5-shot. Recent experiments show that some non-contrastive unsupervised methods such as DINO [1] perform considerably better, especially for cross-domain few-shot learning from ImageNet to datasets like ISIC and QuickDraw.
[1] Emerging Properties in Self-Supervised Vision Transformers. ICCV 2021.
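In case it is useful for other readers, here is a minimal sketch of the kind of nonlinear (MLP) projection head being discussed, in the MoCo v2 / SimCLR style. The feature dimensions and module layout are placeholders chosen for illustration, not the exact configuration of this repo:

```python
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Nonlinear (MLP) projection head in the MoCo v2 / SimCLR style.

    It is only used during contrastive pre-training; for downstream few-shot
    evaluation the backbone features are usually taken directly.
    """
    def __init__(self, in_dim=640, hidden_dim=640, out_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        # L2-normalize so the contrastive loss operates on the unit hypersphere.
        return F.normalize(self.mlp(x), dim=-1)

# "Without a nonlinear projection head" corresponds to replacing the MLP with
# a single linear layer (as in MoCo v1) or removing the head entirely.
linear_head = nn.Linear(640, 128)
```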
Thanks for your quick reply!
I missed the MoCo results in the README.md. Thanks for pointing that out.
PS: why would you define Exemplar as a weakly supervised method? Is it because it does not explicitly predict the class label, but uses the label information to ensure that only true negatives are used for contrasting?
Yes, that's exactly what I mean in an informal way, different from the traditional meaning of "weakly supervised". :)
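To make this informal definition concrete, here is a rough sketch of how class labels can be used so that only true negatives end up in the contrastive denominator. The function name, shapes, and memory-bank setup are hypothetical and purely illustrative; this is not the actual loss implementation in this repo:

```python
import torch

def label_filtered_infonce(query, keys, q_labels, k_labels, tau=0.07):
    """InfoNCE-style loss in which class labels mask out same-class keys from
    the negative set, so each query is contrasted only against true negatives.

    query:    (B, D) L2-normalized query embeddings
    keys:     (K, D) L2-normalized key embeddings (e.g., a memory bank / queue)
    q_labels: (B,)   class labels of the queries
    k_labels: (K,)   class labels of the keys
    """
    logits = query @ keys.t() / tau                       # (B, K) similarities
    same_class = q_labels[:, None] == k_labels[None, :]   # (B, K) positive mask

    exp_logits = torch.exp(logits)
    neg_sum = (exp_logits * (~same_class)).sum(dim=1, keepdim=True)  # true negatives only
    pos_prob = exp_logits / (exp_logits + neg_sum)        # per-positive probability

    # Average the log-likelihood over each query's positives.
    n_pos = same_class.sum(dim=1).clamp(min=1)
    loss = -(torch.log(pos_prob) * same_class).sum(dim=1) / n_pos
    return loss.mean()
```

Building `same_class` from instance identities instead of class labels recovers the fully unsupervised MoCo-style loss, which is why using the labels only in this mask feels "weakly" rather than fully supervised.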
Thank you for your fast responses and this discussion! xD
Hi authors, thanks for your wonderful work!
I have some questions regarding the projection head in MoCo/Exemplar. Have you experimented with the pre-trained MoCo/Exemplar models with and without a nonlinear projection head, and how did they perform? Also, in the code you commented:
"Surprisingly, pure contrastive-pretrained model performs very well on Few-Shot Learning."
Do you by any chance remember the approximate performance gap, i.e., pure MoCo pre-training vs. Exemplar pre-training?
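For context, this is roughly how I would expect the with/without-head comparison to be run on the few-shot side: discard the pre-training projection head and evaluate the backbone features with a nearest-prototype classifier. The `backbone` callable and the episode tensors below are placeholders, not this repo's actual evaluation code; please correct me if your protocol differs:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def prototype_accuracy(backbone, support_x, support_y, query_x, query_y, n_way):
    """Nearest-prototype evaluation on a single few-shot episode.

    The contrastive projection head is deliberately not applied here; features
    come straight from the backbone, which is the usual protocol when comparing
    pre-training with vs. without a nonlinear head.
    """
    z_support = F.normalize(backbone(support_x), dim=-1)    # (n_way * k_shot, D)
    z_query = F.normalize(backbone(query_x), dim=-1)        # (n_query, D)

    # Class prototypes: mean of the support embeddings of each class.
    prototypes = torch.stack(
        [z_support[support_y == c].mean(dim=0) for c in range(n_way)]
    )                                                        # (n_way, D)

    preds = (z_query @ prototypes.t()).argmax(dim=1)         # cosine-similarity match
    return (preds == query_y).float().mean().item()
```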