DonkeyShot21 / cassle

Official repository for the paper "Self-Supervised Models are Continual Learners" (CVPR 2022)

Question about contrastive distillation loss #17

Open SkrighYZ opened 9 months ago

SkrighYZ commented 9 months ago

Hi,

I have a few questions about the SimCLR distillation code.

  1. https://github.com/DonkeyShot21/cassle/blob/b5b0929c3b468cd41740a529d58e92ee4e6ace61/cassle/losses/simclr.py#L21 It seems that the predicted features (p) are not included among the negatives, which differs from what is suggested in the paper (Appendix B). I understand that p and z are swapped here (to make the loss symmetric?) https://github.com/DonkeyShot21/cassle/blob/b5b0929c3b468cd41740a529d58e92ee4e6ace61/cassle/distillers/contrastive.py#L65-L68 but there are still no comparisons between different samples within p. (See the first sketch below the list for what I expected.)

  2. In the paper, the distillation loss is applied to the two views independently, but based on the code above it looks like the views are used jointly. Does that mean we should use them jointly to reproduce the results? (See the second sketch below.)

  3. https://github.com/DonkeyShot21/cassle/blob/b5b0929c3b468cd41740a529d58e92ee4e6ace61/cassle/losses/simclr.py#L30-L33 The four lines of code here seem to leave logit_mask as an all-ones matrix. In my understanding, the main diagonal should be set to False. Am I missing something? (See the third sketch below.)
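
To make question 1 concrete, here is a minimal sketch of the loss I expected from my reading of Appendix B. The function name and the single-view simplification are mine, purely for illustration, and may not match your intended formulation: each p_i is pulled towards its frozen target z_i, while both the other frozen features z_j and the other predicted features p_j act as negatives.

```python
import torch
import torch.nn.functional as F

def distill_contrastive_expected(p, z, temperature=0.2):
    """My reading of Appendix B (single view, illustration only):
    positives are p_i . z_i; negatives include both z_j and p_j for j != i."""
    p = F.normalize(p, dim=-1)
    z = F.normalize(z, dim=-1)
    n = p.size(0)

    pos = torch.einsum("if,if->i", p, z) / temperature       # p_i . z_i
    neg_pz = torch.einsum("if,jf->ij", p, z) / temperature   # p_i . z_j
    neg_pp = torch.einsum("if,jf->ij", p, p) / temperature   # p_i . p_j (the term I don't see in the code)

    off_diag = ~torch.eye(n, dtype=torch.bool, device=p.device)
    negatives = torch.cat(
        [neg_pz[off_diag].view(n, n - 1), neg_pp[off_diag].view(n, n - 1)], dim=1
    )

    logits = torch.cat([pos.unsqueeze(1), negatives], dim=1)
    labels = torch.zeros(n, dtype=torch.long, device=p.device)  # column 0 holds the positive
    return F.cross_entropy(logits, labels)
```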
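
For question 2, this is the distinction I am asking about, written out with a stand-in loss (`nce` below is just a placeholder InfoNCE between predicted and frozen features, not your implementation):

```python
import torch
import torch.nn.functional as F

def nce(p, z, temperature=0.2):
    # stand-in distillation loss between predicted and frozen features;
    # the exact form is not the point here
    p, z = F.normalize(p, dim=-1), F.normalize(z, dim=-1)
    logits = p @ z.T / temperature
    labels = torch.arange(p.size(0), device=p.device)
    return F.cross_entropy(logits, labels)

p1, p2, z1, z2 = (torch.randn(8, 128) for _ in range(4))

# (a) the two views distilled independently, which is how I read the paper
loss_independent = nce(p1, z1) + nce(p2, z2)

# (b) the two views concatenated and distilled jointly, which is how I read the code
loss_joint = nce(torch.cat([p1, p2]), torch.cat([z1, z2]))
```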
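
And for question 3, a toy check of what I mean (this is my paraphrase of the linked lines, so apologies if I misread them):

```python
import torch

b = 4  # toy batch size; features come in two views, so the mask is 2b x 2b

# As far as I can tell, the linked lines start from an all-ones mask and then
# only ever set diagonals back to True, so the mask never changes.
logit_mask = torch.ones(2 * b, 2 * b, dtype=torch.bool)
logit_mask.fill_diagonal_(True)   # no-op: the diagonal is already True
assert bool(logit_mask.all())     # logit_mask is still an all-ones matrix

# What I expected instead (standard SimCLR-style masking): remove each sample's
# own entry from the denominator by setting the main diagonal to False.
expected_mask = torch.ones(2 * b, 2 * b, dtype=torch.bool)
expected_mask.fill_diagonal_(False)
```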

TIA