For symmetric semantic search, your query and the entries in your corpus are of roughly the same length and contain a similar amount of content. An example would be searching for similar questions: your query might be “How to learn Python online?” and you want to find an entry like “How to learn Python on the web?”. For symmetric tasks, you could potentially flip the query and the entries in your corpus.
For asymmetric semantic search, you usually have a short query (a question or a few keywords) and you want to find a longer paragraph that answers it. An example would be the query “What is Python?” matching the paragraph “Python is an interpreted, high-level and general-purpose programming language. Python’s design philosophy …”. For asymmetric tasks, flipping the query and the entries in your corpus usually does not make sense.
It is critical that you choose the right model for your type of task.
Suitable models for symmetric semantic search: Pre-Trained Sentence Embedding Models
Suitable models for asymmetric semantic search: Pre-Trained MS MARCO Models
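The distinction matters because the two model families are trained on different pair types: symmetric models on pairs of similar-length sentences, MS MARCO models on (short query, long passage) pairs, so their embedding spaces are shaped for different query/document relationships. Mechanically, though, search works the same way in both cases: everything is encoded into one shared vector space and ranked by cosine similarity. A minimal sketch of that ranking step (mirroring what sentence-transformers' `util.semantic_search` does under the hood; the toy 3-d vectors below stand in for real model embeddings):

```python
import numpy as np

def top_k_cosine(query_emb, corpus_embs, k=3):
    """Rank corpus embeddings by cosine similarity to the query.

    Both the query and the corpus vectors must come from the SAME
    model, so that they live in one shared embedding space.
    """
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    scores = c @ q                     # cosine similarity per document
    order = np.argsort(-scores)[:k]    # indices of the top-k documents
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-d "embeddings" standing in for real model output.
corpus = np.array([
    [0.9, 0.1, 0.0],   # doc 0: points in nearly the same direction as the query
    [0.0, 1.0, 0.0],   # doc 1: orthogonal to the query
    [0.1, 0.2, 0.95],  # doc 2: weakly related
])
query = np.array([1.0, 0.0, 0.1])

print(top_k_cosine(query, corpus, k=2))
```

The ranking itself is model-agnostic; choosing a symmetric vs. asymmetric model changes *where* the encoder places queries and documents in that space, which is why the model choice, not the search code, is what is critical.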
Why does symmetric vs. asymmetric search make any difference as far as the model is concerned? Why does the model care about this? I was thinking of using all-MiniLM-L6-v2 for an application, but I will be doing asymmetric search on a knowledge base, and this model falls under symmetric search, so should I not use it? What is a good model to generate embeddings for asymmetric search? I assume the same model should be used to generate the embeddings for both the query and the documents; is that correct?
Adding to the above: on one hand the docs say that using the correct model (symmetric vs. asymmetric) is critical for the task. On the other hand, this example is provided as a reference for doing semantic search on a KB, and what model does it use? It uses multi-qa-MiniLM-L6-cos-v1, which is a symmetric search model as listed on the symmetric search models page. What gives?
w.r.t. https://www.sbert.net/examples/applications/semantic-search/README.html
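On the last question: yes, the same model must encode both the query and the documents, because similarity scores are only meaningful between vectors from one embedding space. A hedged sketch of the usual sentence-transformers workflow for asymmetric search over a small knowledge base (assumes the sentence-transformers package is installed and can download multi-qa-MiniLM-L6-cos-v1, one of the models the docs list as trained on question/answer-passage pairs; wrapped in a function so nothing heavy runs on import):

```python
def search_kb(query, documents, top_k=3):
    """Asymmetric semantic search over a small knowledge base.

    Sketch only: assumes the sentence-transformers package is
    available and the model can be downloaded on first use.
    """
    from sentence_transformers import SentenceTransformer, util

    # One model encodes BOTH the query and the documents, so all
    # embeddings live in the same vector space.
    model = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
    doc_embs = model.encode(documents, convert_to_tensor=True)
    query_emb = model.encode(query, convert_to_tensor=True)

    # util.semantic_search returns, per query, a list of dicts
    # with 'corpus_id' and 'score' keys.
    hits = util.semantic_search(query_emb, doc_embs, top_k=top_k)[0]
    return [(documents[h["corpus_id"]], h["score"]) for h in hits]

if __name__ == "__main__":
    docs = [
        "Python is an interpreted, high-level, general-purpose programming language.",
        "The capital of France is Paris.",
    ]
    for text, score in search_kb("What is Python?", docs, top_k=1):
        print(f"{score:.3f}  {text}")
```

Note that the multi-qa models were trained on question/passage pairs, which is exactly the asymmetric setting, regardless of which docs page they happen to be listed on; that may explain the apparent contradiction in the example.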