SeanLee97 / AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
https://arxiv.org/abs/2309.12871
MIT License
398 stars, 30 forks

How to choose distance in a retrieval system? #25

Closed. Jinsns closed this issue 6 months ago

Jinsns commented 6 months ago

When we use AnglE to build a (faiss) vector store for retrieval, do we need to customize a distance function so that it is consistent with the final objective function? The default distance of the faiss vector store is L2 distance, and it has an option for COSINE. Will the retrieval system perform well with just L2 or COSINE?

SeanLee97 commented 6 months ago

You can choose either L2 or cosine in your applications. Here is a faiss example using UAE: https://colab.research.google.com/drive/1WOYD6f8gb_wpkUm_57K8pEDgjlGJd6oB?usp=drive_link
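
For readers wondering why either metric can work: if the embeddings are L2-normalized before indexing, squared L2 distance and cosine similarity are related by ||a - b||^2 = 2 - 2·cos(a, b), so both should give the same neighbor ranking. The sketch below is not taken from the linked Colab. It uses random vectors as stand-ins for real AnglE/UAE embeddings and plain Python instead of faiss, and the helper names (`l2_normalize`, `rank_by_cosine`, `rank_by_l2`) are hypothetical, chosen only for illustration.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_l2(a, b):
    """Squared Euclidean distance between a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_cosine(query, docs):
    """Document indices ordered from most to least similar."""
    return sorted(range(len(docs)), key=lambda i: -cosine_similarity(query, docs[i]))


def rank_by_l2(query, docs):
    """Document indices ordered from nearest to farthest."""
    return sorted(range(len(docs)), key=lambda i: squared_l2(query, docs[i]))


# Random vectors stand in for real sentence embeddings (an assumption).
rng = random.Random(0)
dim, n_docs = 16, 50
docs = [l2_normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n_docs)]
query = l2_normalize([rng.gauss(0, 1) for _ in range(dim)])

cosine_ranking = rank_by_cosine(query, docs)
l2_ranking = rank_by_l2(query, docs)

print("top-5 by cosine:", cosine_ranking[:5])
print("top-5 by L2:    ", l2_ranking[:5])
```

If the embeddings are not normalized, the two metrics can rank neighbors differently, so it is worth checking which one your index uses. In faiss itself, one common way to get cosine search is to normalize the vectors (for example with `faiss.normalize_L2`) and use an inner-product index such as `faiss.IndexFlatIP`, while `faiss.IndexFlatL2` gives L2 search.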