Open brianw0924 opened 3 years ago
The idea is to optimize the embedding for cosine similarity, as described in this paper:
https://arxiv.org/pdf/1811.12649.pdf
"... we describe below techniques we used to achieve SOTA on the retrieval tasks including L2 normalization of embedding to optimize for cosine similarity"
I know the weight has to be normalized,
but why do we also need this line:
x = F.normalize(x, p=2, dim=1)
Why does the feature need to be normalized as well?
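
To make the question concrete, here is a minimal sketch in plain Python of what L2 normalization changes. It is not the repository's code: the vectors `x` and `w` and the helper names are made up for illustration, and `l2_normalize` only mimics the idea behind `F.normalize(x, p=2, dim=1)` for a single vector.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale v to unit L2 length (the same idea as F.normalize with p=2)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / max(norm, eps) for c in v]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def cosine(a, b):
    """Cosine similarity computed directly from its definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

x = [3.0, 4.0]  # hypothetical feature vector, ||x|| = 5
w = [1.0, 0.0]  # hypothetical weight row for one class

# Normalizing only the weight: the logit is ||x|| * cos(x, w),
# so it still grows or shrinks with the length of the feature.
weight_only = dot(x, l2_normalize(w))

# Normalizing both: the logit is exactly cos(x, w).
both = dot(l2_normalize(x), l2_normalize(w))

print(weight_only, both, cosine(x, w))
```

If this sketch is right, normalizing only the weight leaves the feature's length in the logit, while normalizing both makes the logit depend only on the angle between feature and weight, which seems to be what "optimize for cosine similarity" refers to.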