Open suzhenghang opened 3 years ago
The loss would be around -9.3~-9.4. There is a loss curve in the original paper (Fig. 2); you can check against that.
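For reference, the per-pair loss under discussion is the negative cosine similarity from the SimSiam paper, D(p, z) = -(p/||p||)·(z/||z||). Below is a minimal plain-Python sketch of that quantity (the repo's actual implementation is presumably a batched PyTorch version, and whatever scaling or summation yields the -9.3~-9.4 figure is not reproduced here):

```python
import math

def negative_cosine_similarity(p, z):
    """SimSiam-style loss for one pair: -(p/||p||) . (z/||z||).

    p is the predictor output, z is the (stop-gradient) target
    projection. Plain-Python sketch; real code would use tensors.
    """
    norm_p = math.sqrt(sum(x * x for x in p))
    norm_z = math.sqrt(sum(x * x for x in z))
    return -sum(a * b for a, b in zip(p, z)) / (norm_p * norm_z)

# Collinear vectors give the minimum value of -1.0:
print(negative_cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # -1.0
```

The minimum per pair is -1.0 (perfect alignment), so a large negative total loss implies some summing or scaling on top of this.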
Thanks. I ran both unsupervised pretraining (batch size 512, SGD, 100 epochs, cosine LR decay from 0.1, negative cosine similarity loss) and linear evaluation (batch size 4096, LARS, 100 epochs, cosine LR decay from 1.6, cross-entropy loss), and finally got 68.0%. I found that Sync-BatchNorm is critical: without it, I could only get 65.1%. The training curve is as follows.
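The cosine LR schedule quoted above (decaying from a base LR over a fixed epoch budget) can be sketched as below; the default `base_lr=0.1` and `total_epochs=100` simply mirror the pretraining numbers in this comment, and real training code would typically use a built-in scheduler instead:

```python
import math

def cosine_lr(epoch, base_lr=0.1, total_epochs=100):
    """Cosine decay: base_lr at epoch 0, down to 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

print(cosine_lr(0))    # 0.1
print(cosine_lr(100))  # 0.0
```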
Yes, SyncBN seems to be critical. Thanks for the sharing.
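For anyone hitting the same accuracy gap: converting an existing model's BatchNorm layers to SyncBN is a one-liner in PyTorch. A minimal sketch with a hypothetical toy backbone (the repo's actual script presumably does the equivalent before wrapping the model in DistributedDataParallel):

```python
import torch.nn as nn

# Hypothetical tiny backbone; any model containing BatchNorm layers works.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Replace every BatchNorm layer with SyncBatchNorm so batch statistics
# are computed across all GPUs rather than per-GPU. (Forward passes then
# require an initialized distributed process group.)
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

print(any(isinstance(m, nn.SyncBatchNorm) for m in model.modules()))  # True
```

With small per-GPU batches, per-GPU BN statistics are noisy, which is the usual explanation for SyncBN mattering here.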
Hi @taoyang1122, thanks for open-sourcing such good code. What is the loss value at the end of training, and could you share the loss curve?