Closed gich2009 closed 2 months ago
You are not missing any parameters when trying to run a pipeline with Qdrant as the vector store and enable_hybrid=True. The setup for enabling hybrid search with Qdrant is correct as shown in the example:
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient, AsyncQdrantClient

# connect to a Qdrant server running on localhost
client = QdrantClient(host="localhost", port=6333)
aclient = AsyncQdrantClient(host="localhost", port=6333)

# create our vector store with hybrid indexing enabled
vector_store = QdrantVectorStore(
    "llama2_paper",
    client=client,
    aclient=aclient,
    enable_hybrid=True,
    batch_size=20,
)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
Settings.chunk_size = 512

# documents is assumed to be a list of already loaded Document objects
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)
If you are encountering issues, it might be due to other factors not covered in the provided context. However, based on the given information, your setup appears to be correct [1].
Additionally, there is a documented limitation when using Qdrant as a vector store with hybrid indexing in LlamaIndex. Specifically, if hybrid search is not enabled, attempting to perform a hybrid search will raise a ValueError:
if query.mode == VectorStoreQueryMode.HYBRID and not self.enable_hybrid:
    raise ValueError(
        "Hybrid search is not enabled. Please build the query with "
        "`enable_hybrid=True` in the constructor."
    )
This indicates that hybrid search must be explicitly enabled when the vector store is constructed (enable_hybrid=True); otherwise, hybrid queries will fail [2][3].
The error you are encountering seems to be related to the convert_modifier method in the qdrant_client library, which raises a ValueError for an invalid Modifier model. This might be an issue with the Qdrant client library itself, and you may need to check the specific configurations or versions you are using.
This is exactly what I am doing @dosubot
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient, AsyncQdrantClient

client = QdrantClient(host="localhost", port=6333)
aclient = AsyncQdrantClient(host="localhost", port=6333)

vector_store = QdrantVectorStore(
    collection_name=collection_name,
    client=client,
    aclient=aclient,
    enable_hybrid=True,
    batch_size=20,
    prefer_grpc=True,
)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
Settings.chunk_size = 512

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)
Has anyone else encountered this issue?
I think it's a bug with the handling added for bm42.
For whatever reason, Qdrant's enum for the modifier has a NONE option, but apparently doesn't handle it? Very odd
sparse_config = self._sparse_config or rest.SparseVectorParams(
    index=rest.SparseIndexParams(),
    modifier=(
        rest.Modifier.IDF
        if self.fastembed_sparse_model and "bm42" in self.fastembed_sparse_model
        else rest.Modifier.NONE
    ),
)
Actually I take it back, I wasn't able to replicate this in a fresh venv/google colab
Could be an issue with your qdrant client version? https://colab.research.google.com/drive/1TtsNPPLxLplvlvItrjBp7wE8oBrLLfgp?usp=sharing
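To rule out a version mismatch, it can help to print what is actually installed in the failing environment. This is a minimal sketch using only the standard library; the distribution names in PACKAGES are assumed pip names and may need adjusting for your setup.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumed pip distribution names; adjust to match your environment.
PACKAGES = ("qdrant-client", "llama-index-core", "llama-index-vector-stores-qdrant")

for name in PACKAGES:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this in both the working and the failing environment gives a quick way to compare versions side by side.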
Let me try again @logan-markewich
Hey @logan-markewich, it works for me too when I use an in memory client. Could you try using a managed Qdrant client?
The error occurs consistently for me when I use a managed client.
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 260, in wrapper
    result = func(*args, **kwargs)
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 555, in run
    self.vector_store.add(nodes_with_embeddings)
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 407, in add
    self._create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 643, in _create_collection
    raise exc  # noqa: TRY201
  File "/home/gich2009/Work/BAYESNET/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 616, in _create_collection
    self._client.create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/qdrant_client.py", line 2081, in create_collection
    return self._client.create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/qdrant_remote.py", line 2582, in create_collection
    sparse_vectors_config = RestToGrpc.convert_sparse_vector_config(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3474, in convert_sparse_vector_config
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3474, in <genexpr>
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3465, in convert_sparse_vector_params
    cls.convert_modifier(model.modifier) if model.modifier is not None else None
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3454, in convert_modifier
    raise ValueError(f"invalid Modifier model: {model}")
ValueError: invalid Modifier model: none
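One plausible reading of the traceback is that the REST-to-gRPC conversion skips the modifier only when it is a real None, and does not recognise an enum member whose value is "none". The sketch below models that behaviour with stand-in names; Modifier, convert_modifier and convert_sparse_vector_params here are hypothetical simplifications written for illustration, not the qdrant-client source.

```python
from enum import Enum
from typing import Optional


class Modifier(str, Enum):
    """Stand-in for the REST-side enum, which has both NONE and IDF members."""
    NONE = "none"
    IDF = "idf"


def convert_modifier(model: Modifier) -> str:
    """Hypothetical converter that only knows IDF, like the failing code path."""
    if model == Modifier.IDF:
        return "grpc.Idf"
    raise ValueError(f"invalid Modifier model: {model.value}")


def convert_sparse_vector_params(modifier: Optional[Modifier]) -> Optional[str]:
    # Mirrors the guard shown in the traceback: only a real None skips conversion.
    return convert_modifier(modifier) if modifier is not None else None


print(convert_sparse_vector_params(None))          # conversion is skipped
print(convert_sparse_vector_params(Modifier.IDF))  # conversion succeeds
try:
    convert_sparse_vector_params(Modifier.NONE)     # enum member, not None
except ValueError as exc:
    print(exc)
```

Under this model, the REST path never hits the converter, which would explain why only the gRPC path fails.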
Do you need to bump the version of your deployed instance? Let me spin one up and try again
I'm already on the latest version according to the console.
v1.11.0
Hey @logan-markewich , I figured out what the issue is. The error occurs when prefer_grpc=True and enable_hybrid=True. enable_hybrid=True and prefer_grpc=False does not trigger the exception.
This is also true for a self hosted Qdrant container.
Not sure if it is the case that grpc cannot be used with hybrid indices.
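Until the root cause is fixed, one way to avoid the failing combination is to fall back to REST whenever hybrid indexing is requested. The helper below is a hypothetical guard written for illustration (resolve_prefer_grpc is not part of LlamaIndex or qdrant-client); it only decides which value to pass as prefer_grpc.

```python
def resolve_prefer_grpc(enable_hybrid: bool, prefer_grpc: bool) -> bool:
    """Hypothetical guard: use REST when hybrid indexing is on, else keep the request."""
    if enable_hybrid and prefer_grpc:
        # The combination reported in this thread raises ValueError at collection creation.
        return False
    return prefer_grpc


# Example: the settings that triggered the error in this thread.
requested = {"enable_hybrid": True, "prefer_grpc": True}
effective_prefer_grpc = resolve_prefer_grpc(**requested)
print(effective_prefer_grpc)
```

The resulting value can then be passed as prefer_grpc wherever it is currently set, so that gRPC stays enabled for collections that do not use hybrid indexing.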
Hi @logan-markewich @gich2009 @joein, I am also facing this issue with prefer_grpc=True. Everything works fine with the REST client, but when I set prefer_grpc=True it raises the error below. I am using llama-index==0.10.67.post1 and qdrant client 1.11.0.
    sparse_vectors_config = RestToGrpc.convert_sparse_vector_config(
  File "/home/ubuntu/prashant/codex_genai/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3302, in convert_sparse_vector_config
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/ubuntu/prashant/codex_genai/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3302, in <genexpr>
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/ubuntu/prashant/codex_genai/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3293, in convert_sparse_vector_params
    cls.convert_modifier(model.modifier) if model.modifier is not None else None
  File "/home/ubuntu/prashant/codex_genai/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3282, in convert_modifier
    raise ValueError(f"invalid Modifier model: {model}")
ValueError: invalid Modifier model: none
It indeed looks like a bug in the conversion of the modifier struct in qdrant-client; we'll fix it.
However, the code you provided for the modifier also contains a bug, since bm42 is not the only model that requires an IDF modifier.
If fastembed is assumed to be installed, you can use qdrant_client.qdrant_fastembed.IDF_EMBEDDING_MODELS for the check; otherwise, you might need to check the models manually, and then you also need to add bm25.
It should also be possible to simply replace rest.Modifier.NONE with a regular None, since modifier is an optional field.
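The two suggestions above (check against a set of IDF models rather than the substring "bm42", and return a plain None otherwise) could be combined roughly as follows. This is a dependency-free sketch: pick_modifier and ASSUMED_IDF_MODELS are hypothetical names, the model list is an illustrative assumption, and the string "idf" stands in for rest.Modifier.IDF.

```python
from typing import Iterable, Optional

# Illustrative fallback set (an assumption). As suggested above, the real list
# could come from qdrant_client.qdrant_fastembed.IDF_EMBEDDING_MODELS when
# fastembed is installed.
ASSUMED_IDF_MODELS = frozenset({
    "Qdrant/bm42-all-minilm-l6-v2-attentions",
    "Qdrant/bm25",
})


def pick_modifier(
    sparse_model: Optional[str],
    idf_models: Iterable[str] = ASSUMED_IDF_MODELS,
) -> Optional[str]:
    """Return "idf" for models that need IDF weighting, otherwise a plain None."""
    if sparse_model is not None and sparse_model in set(idf_models):
        return "idf"  # stand-in for rest.Modifier.IDF
    return None  # a real None, not an enum NONE member


print(pick_modifier("Qdrant/bm25"))
print(pick_modifier("some/other-sparse-model"))
```

Returning None leaves the optional modifier field unset, so no enum value has to be converted at all.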
cc: @Anush008
Question
I get this error when I try to run a pipeline with qdrant as the vector store and the enable_hybrid=True. Am I missing a parameter or is this a bug?
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 260, in wrapper
    result = func(*args, **kwargs)
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/core/ingestion/pipeline.py", line 555, in run
    self.vector_store.add(nodes_with_embeddings)
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 407, in add
    self._create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 643, in _create_collection
    raise exc  # noqa: TRY201
  File "/home/gich2009/Work/BAYESNET/.venv/lib/python3.10/site-packages/llama_index/vector_stores/qdrant/base.py", line 616, in _create_collection
    self._client.create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/qdrant_client.py", line 2081, in create_collection
    return self._client.create_collection(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/qdrant_remote.py", line 2582, in create_collection
    sparse_vectors_config = RestToGrpc.convert_sparse_vector_config(
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3474, in convert_sparse_vector_config
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3474, in <genexpr>
    map=dict((key, cls.convert_sparse_vector_params(val)) for key, val in model.items())
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3465, in convert_sparse_vector_params
    cls.convert_modifier(model.modifier) if model.modifier is not None else None
  File "/home/gich2009/Work/.venv/lib/python3.10/site-packages/qdrant_client/conversions/conversion.py", line 3454, in convert_modifier
    raise ValueError(f"invalid Modifier model: {model}")
ValueError: invalid Modifier model: none