thematrixduo closed 3 years ago
Hi! I noticed that in this paper you multiply the embedding vectors directly without normalizing them, whereas many recent self-supervised learning papers normalize first. Is there a specific reason for skipping the normalization? Thanks!
Hi,
We have not seen any improvements when performing cosine similarity with temperature instead of a simple dot product.
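For readers comparing the two options, here is a minimal sketch of both similarity scores in plain Python. It is not the code from the paper: the function names, the example vectors, and the temperature of 0.1 are assumptions chosen only for illustration, and the vectors are assumed to be non-zero.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors (no normalization)."""
    return sum(a * b for a, b in zip(u, v))


def cosine_with_temperature(u, v, temperature=0.1):
    """Cosine similarity divided by a temperature.

    The temperature value is an illustrative assumption, not a setting
    taken from the paper. Both vectors are assumed to be non-zero.
    """
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    return dot(u, v) / (norm_u * norm_v) / temperature


# Hypothetical embeddings: `positive` points the same way as `anchor`
# but has twice the length; `orthogonal` is perpendicular to `anchor`.
anchor = [3.0, 4.0]
positive = [6.0, 8.0]
orthogonal = [-4.0, 3.0]

raw_score = dot(anchor, positive)
scaled_score = cosine_with_temperature(anchor, positive)
print(raw_score, scaled_score)
```

The dot product grows with the length of the embeddings, while the cosine score depends only on their direction, so the temperature is what sets the scale of the logits in the normalized variant.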
Thank you!