Open kubni opened 18 hours ago
When I am searching the database the usual way (without DSPy) I do something like this:
with torch.no_grad():
    batch_query = processor.process_queries([query_text]).to(model.device)
    query_embedding = model(**batch_query)
multivector_query = (
    query_embedding[0].cpu().float().tolist()
)
search_result = qdrant_client.query_points(
    collection_name=collection_name,
    query=multivector_query,
    limit=top_k,
    timeout=100,
    search_params=models.SearchParams(
        quantization=models.QuantizationSearchParams(
            ignore=False,
            rescore=True,
            oversampling=2.0,
        ),
        hnsw_ef=128,
        exact=True,
    ),
)
Here, multivector_query, which is used for searching, is a list of 27 vectors of size 128 each.
Qdrant knows how to handle it since it's configured to use the MAX_SIM comparator.
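For readers unfamiliar with MaxSim, here is a minimal pure-Python sketch of the late-interaction score as it is usually defined for ColBERT-style models: for each query vector, take the best similarity against any document vector, then sum those maxima. The names `dot` and `max_sim` and the toy vectors are made up for illustration, and dot product is assumed as the similarity. Qdrant computes this internally, so this is not its implementation.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query: Sequence[Vector], document: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best dot product against any document vector."""
    return sum(max(dot(q, d) for d in document) for q in query)


# Toy data (hypothetical): 2 query vectors and 3 document vectors of size 2,
# standing in for the 27 x 128 query described above.
query_vectors = [[1.0, 0.0], [0.0, 1.0]]
document_vectors = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
score = max_sim(query_vectors, document_vectors)
```

Because every query vector contributes its own best match, the score depends on the whole set of vectors, which is why a single-vector retriever cannot reproduce it directly.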
I couldn't find a way to do something similar with DSPy so far.
I believe DSPy currently assumes single-vector embeddings, which limits native multi-vector search. Future versions could potentially add multi-vector support by extending the DSPy retrievers, and you could submit a feature request to the DSPy maintainers.
In the meantime, you can integrate your custom logic using the options below:
Option A: Modify DSPy to handle multi-vector embeddings. To make DSPy compatible with your setup, you can subclass the QdrantRM retriever so that it handles multi-vector embeddings and passes the appropriate embedding structure to Qdrant when querying.
Option B: Preprocess your embeddings into single vectors. If your use case does not strictly require multi-vector search, consider reducing your embeddings to a single vector.
Option C: Use your custom search logic with DSPy. If neither of the above options works for you, you can bypass DSPy's retriever and use your own Qdrant querying logic directly alongside DSPy's LLM functionality.
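As a rough illustration of Option B, one common way to reduce a multi-vector embedding to a single vector is element-wise mean pooling. The sketch below is pure Python with a hypothetical helper name `mean_pool`. It assumes the pooled vectors would be stored in a separate single-vector collection or named vector, and pooling discards the per-token matching that MAX_SIM relies on, so retrieval quality may drop.

```python
from typing import List, Sequence


def mean_pool(multivector: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equal-length vectors element-wise into one vector."""
    if not multivector:
        raise ValueError("multivector must contain at least one vector")
    count = len(multivector)
    # zip(*multivector) iterates over columns, i.e. one dimension at a time.
    return [sum(column) / count for column in zip(*multivector)]


# Toy input (hypothetical): 3 vectors of size 2 instead of 27 vectors of size 128.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The same pooling would have to be applied both to the stored document embeddings and to each query embedding for the comparison to be meaningful.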
Thanks for the reply. I will try some of the ideas you proposed.
Hey @kubni, does this all need to happen in DSPy? Just do it normally. Use DSPy for the LM modules, not for the multi-vector search.
Okay, thanks for the heads up, @okhat. I am still new to DSPy, so I am trying to see how various things work and fit together. I will definitely try using the LM modules.
Hey @okhat. My RAG pipeline retrieves a Qdrant point containing a base64-encoded image from the database and then passes it to a VLM such as 4o-mini. Is this possible with DSPy LM modules? I see some closed PRs, but they don't say whether they were merged.
@kubni Basically, you don't need DSPy features for the retrieval parts, since you want to use Qdrant or multi-vector serving or whatnot. Rely on DSPy only for the LM steps.
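The split suggested here could look roughly like the sketch below: retrieval stays in your own code, and only the final generation step goes through an LM module. Everything in it is hypothetical scaffolding (`answer_with_custom_retrieval` and the `fake_*` stand-ins). The three callables are placeholders for the ColQwen2 embedding code, the `qdrant_client.query_points` call from the original post, and an LM step such as a DSPy module.

```python
from typing import Callable, List, Sequence

MultiVector = List[List[float]]


def answer_with_custom_retrieval(
    question: str,
    embed_query: Callable[[str], MultiVector],
    search: Callable[[MultiVector, int], Sequence[str]],
    generate: Callable[[str, str], str],
    top_k: int = 3,
) -> str:
    """Retrieve outside DSPy, then hand the retrieved context to an LM step."""
    multivector_query = embed_query(question)    # e.g. the ColQwen2 query code
    passages = search(multivector_query, top_k)  # e.g. a Qdrant multi-vector query
    context = "\n".join(passages)
    return generate(context, question)           # e.g. a call into an LM module


# Stand-ins so the sketch runs without a model, a Qdrant instance, or an LM.
def fake_embed(text: str) -> MultiVector:
    return [[float(len(text))]]


def fake_search(query: MultiVector, limit: int) -> Sequence[str]:
    return ["doc one", "doc two", "doc three"][:limit]


def fake_generate(context: str, question: str) -> str:
    return f"{question} | {context}"


result = answer_with_custom_retrieval(
    "What is shown?", fake_embed, fake_search, fake_generate, top_k=2
)
```

In a real pipeline, `generate` might wrap something like `dspy.Predict("context, question -> answer")`; that wiring is an assumption here and is left out so the example stays runnable on its own.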
Hello. I used the ColQwen2 model from the ColPali team to generate embeddings and store them in my Qdrant database. I wanted to try out DSPy, but I can't get it to retrieve the points from Qdrant. I get the following error:
I have tried changing the vectorizer, but to no avail. ColQwen2 produces multi-vector embeddings, and I am not sure whether DSPy supports them. Any ideas?