[Question]: responses from chromadb as vector store and qdrant are different

I found a similar closed issue that might be relevant to your question. According to this issue, the default similarity metric is cosine similarity, and it is hardcoded in the code. You can find where it is defined in the llama_index/embeddings/base.py file at line 33.

Regarding your specific questions:

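Different Responses from ChromaDB and Qdrant: Yes, the responses can differ if the two stores rank nodes with different similarity metrics. The metric determines each node's score, so it can change which nodes land in the top k and in what order, and therefore the context passed to the LLM. Note that Chroma collections use squared L2 distance by default unless hnsw:space is set to cosine in the collection metadata, so it is worth checking that both stores are configured with the same metric.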
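To make this concrete, here is a minimal sketch in plain Python (no vector store involved) showing the same two documents ranked differently under cosine similarity and squared L2 distance. The vectors and the names doc_a and doc_b are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    # Higher is more similar; depends only on the angle between the vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def squared_l2(a, b):
    # Lower is more similar; depends on vector magnitude as well as direction.
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = [1.0, 0.0]
docs = {
    "doc_a": [10.0, 0.0],  # same direction as the query, but far away
    "doc_b": [1.0, 1.0],   # 45 degrees off, but close to the query
}

by_cosine = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)
by_l2 = sorted(docs, key=lambda name: squared_l2(query, docs[name]))

print(by_cosine)  # ['doc_a', 'doc_b']
print(by_l2)      # ['doc_b', 'doc_a']
```

If the embeddings are normalized to unit length, the two metrics produce the same ordering, so a ranking difference like this only appears with unnormalized vectors.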

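Changing the Similarity Metric in Qdrant: The search_batch and search calls in the QdrantVectorStore class do not select the similarity metric. The limit parameter, which is set to query.similarity_top_k, only controls how many results are returned. In Qdrant the distance metric is a property of the collection and is fixed when the collection is created (for example, Distance.COSINE in its vector configuration), so changing it means creating the collection with a different distance. For reference, here is the search code in question: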

response = self._client.search_batch(
   collection_name=self.collection_name,
   requests=[
       rest.SearchRequest(
           vector=rest.NamedVector(
               name=DENSE_VECTOR_NAME,
               vector=query_embedding,
           ),
           limit=query.similarity_top_k,
           filter=query_filter,
           with_payload=True,
       ),
   ],
)

And for the asynchronous method:

response = await self._aclient.search_batch(
   collection_name=self.collection_name,
   requests=[
       rest.SearchRequest(
           vector=rest.NamedVector(
               name=DENSE_VECTOR_NAME,
               vector=query_embedding,
           ),
           limit=query.similarity_top_k,
           filter=query_filter,
           with_payload=True,
       ),
   ],
)

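Parameter to Check the Similarity Metric: The query methods neither take nor report a similarity-metric parameter, but you can read the metric from the collection configuration. In Qdrant, get_collection returns the collection's vector parameters, including the distance. In Chroma versions that configure the metric through collection metadata, it is stored under the hnsw:space key, and squared L2 is used when that key is unset.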

For more details, you can refer to the relevant sections in the Qdrant codebase ^[1].


run-llama / llama_index

[Question]: responses from chromadb as vector store and qdrant are different #15997

Question Validation

Question