What happened?
Possible duplicate of #1861
When running the following on a Chroma database (where `query_embeddings` is a list of embeddings around 20 elements long, embeddings have length 1024, and `n_results` is about 8000):
I get `sqlite3.OperationalError: too many SQL variables`.
This error goes away if the number of embeddings in the list is reduced to 1. An alternative would be to iterate over the list and query for each embedding separately, but this is extremely slow.
We can see that the error occurs here in segmentation.py, and is the result of not chunking this query to the database. I have tested the same scenario after changing the implementation in segment.py to use chunking and everything runs fine and is quite fast. I will create a PR with this resolution.
Versions
Chroma v0.5.5, Python 3.9.2, Debian 11
Relevant log output
No response