Closed TheSeriousProgrammer closed 10 months ago
The passage_embed function adds the prefix "passage:". This is the recommended usage when you're indexing a corpus into a vector store, for instance. Several embedding models trained with contrastive losses expect this prefix; for example, it is recommended by the BGE family of embedding models. BGE is also the default embedding model we use in fastembed.
Similarly, the query_embed function adds the "query:" prefix for use at query time, when you want to retrieve or rank documents using vector similarity.
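For illustration, the prefixing convention described above can be sketched roughly like this. This is a simplified stand-in, not fastembed's actual implementation: the real passage_embed and query_embed also tokenize the text and run it through the embedding model, whereas this only shows the prefix step.

```python
# Simplified sketch of the "passage:" / "query:" prefixing convention.
# NOT fastembed's actual code -- the real methods also run the embedding model.

def add_passage_prefix(texts):
    """Prefix used when indexing a corpus into a vector store."""
    return [f"passage: {t}" for t in texts]

def add_query_prefix(texts):
    """Prefix used at query time, when retrieving/ranking by vector similarity."""
    return [f"query: {t}" for t in texts]

docs = add_passage_prefix(["fastembed is a lightweight embedding library"])
queries = add_query_prefix(["what is fastembed?"])
print(docs[0])     # passage: fastembed is a lightweight embedding library
print(queries[0])  # query: what is fastembed?
```

The point of the asymmetric prefixes is that models like BGE were trained seeing queries and passages marked differently, so using the matching prefix at index time and query time tends to improve retrieval quality.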
I hope this answers your question 🤞🏼
If it does not, please do re-open the issue!
I see that the code base has two methods, passage_embed and embed, but on inspecting the code I think they are essentially the same. Is there any difference between them, or is one intended for future features?