LTH14 / rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
MIT License

Moco encoder #37

Open · A-Thorley opened 1 month ago

A-Thorley commented 1 month ago

Thanks for the great paper and code! I have a query about the MoCo v3 encoder: the paper mentions that the latent representations are regularized on a hypersphere. I am fairly new to MoCo v3, so can you confirm whether this type of regularization was done in the original MoCo v3 pretraining paper, or is it something you added? I am assuming that such regularized latents are quite important, so if I were to replace the encoder with, say, an MAE encoder, which to my knowledge does not regularize latents in any way, this might not work as well?

LTH14 commented 1 month ago

Thanks for your interest! Please check this paper: https://arxiv.org/pdf/2005.10242. Its "uniformity" analysis shows that the contrastive loss naturally regularizes the representations onto a hypersphere. An MAE encoder should also work, but it might need a stronger representation generator.
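
For context, here is a minimal PyTorch sketch of the two ingredients behind that answer: contrastive methods such as MoCo v3 L2-normalize features before computing the InfoNCE loss, which places them on the unit hypersphere, and the cited paper (Wang & Isola, arXiv:2005.10242) measures how uniformly they spread there via a pairwise Gaussian potential. The `uniformity` function name and the random `features` batch below are illustrative, not taken from the RCG codebase.

```python
import torch
import torch.nn.functional as F

def uniformity(z: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Uniformity metric from Wang & Isola (arXiv:2005.10242):
    log of the mean pairwise Gaussian potential between L2-normalized
    features. Lower (more negative) values mean the features are spread
    more uniformly over the unit hypersphere."""
    # Pairwise squared Euclidean distances between all rows of z.
    sq_dists = torch.pdist(z, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()

# Hypothetical encoder outputs: a batch of 256 feature vectors of dim 256.
features = torch.randn(256, 256)

# Contrastive pipelines like MoCo v3 L2-normalize features before the
# InfoNCE loss, which constrains every vector to the unit hypersphere.
z = F.normalize(features, dim=1)

print(uniformity(z))  # more negative = more uniform on the sphere
```

An MAE encoder applies no such normalization or uniformity pressure during pretraining, which is why its latents may be harder for the representation generator to model.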