Closed bhomass closed 1 month ago
The error you're encountering with FilterOperator.IN in ChromaDB occurs because the _transform_chroma_filter_operator function in the LlamaIndex ChromaDB integration does not support the IN operator. This function translates standard metadata filter operators into their ChromaDB-specific equivalents, and it currently has no case for IN, which leads to the ValueError you're seeing.
To address this issue, you could modify the _transform_chroma_filter_operator function to add support for the IN operator, assuming you're familiar with ChromaDB's query language. Here's an example modification:
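def _transform_chroma_filter_operator(operator: str) -> str:
    # Translate a standard filter operator into its ChromaDB equivalent.
    if operator == "IN":
        return "$in"
    # Include other operators and their translations here.
    raise ValueError(f"Filter operator {operator} not supported")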
This modification assumes ChromaDB uses a MongoDB-like syntax for the IN operator. You'll also need to make sure your code correctly handles cases where filter.value is a list of values, since the IN operator matches against multiple values.
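To make the list-handling point concrete, here is a minimal sketch of an operator translation plus a value check. It is not the LlamaIndex implementation: OPERATOR_MAP, transform_operator, and build_where_clause are hypothetical names, and the generic operator strings used as keys ("==", "in", and so on) are assumptions chosen for illustration.

```python
from typing import Any, Dict

# Hypothetical mapping from generic operator strings to MongoDB-style
# operators. The key strings are assumptions, not taken from LlamaIndex.
OPERATOR_MAP: Dict[str, str] = {
    "==": "$eq",
    "!=": "$ne",
    ">": "$gt",
    "<": "$lt",
    ">=": "$gte",
    "<=": "$lte",
    "in": "$in",
    "nin": "$nin",
}


def transform_operator(operator: str) -> str:
    """Return the MongoDB-style operator, or raise for unknown operators."""
    try:
        return OPERATOR_MAP[operator]
    except KeyError:
        raise ValueError(f"Filter operator {operator} not supported") from None


def build_where_clause(key: str, operator: str, value: Any) -> Dict[str, Any]:
    """Build a single-key where clause of the form {key: {op: value}}."""
    chroma_op = transform_operator(operator)
    if chroma_op in ("$in", "$nin"):
        # Inclusion operators need a non-empty list, not a scalar.
        if not isinstance(value, (list, tuple)) or len(value) == 0:
            raise ValueError(f"Operator {operator} requires a non-empty list")
        value = list(value)
    return {key: {chroma_op: value}}


where = build_where_clause("subject_id", "in", ["10000032", "10000764"])
print(where)
```

Rejecting a scalar value for an inclusion operator early keeps a malformed filter from reaching the vector store, where the resulting error may be harder to interpret.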
Keep in mind that this is a workaround. For a permanent solution, consider asking the LlamaIndex team to add official support for the IN operator to their ChromaDB integration.
As the error states, it is indeed not supported.
If Chroma does support this, it should be added here: https://github.com/run-llama/llama_index/blob/f1124cffd4ba59ebc58839024fa8aaefef679214/llama-index-integrations/vector_stores/llama-index-vector-stores-chroma/llama_index/vector_stores/chroma/base.py#L37
@logan-markewich It seems the 'IN' operator is supported in chromadb. https://docs.trychroma.com/guides#using-inclusion-operators-(-and-) Could you please check this?
@logan-markewich This issue can be closed now https://github.com/run-llama/llama_index/pull/14010
Bug Description
I created a vector store using ChromaDB and created a retriever from it. I want to retrieve using both semantic search and a metadata filter. The filter works for EQ but blows up for IN. Is IN supposed to work? I tried calling _to_chroma_filter(filters), but it seems that call has been deprecated.
There is nothing wrong with my documents or nodes because, as I said, the EQ filter does work.
Version
0.10.28
Steps to Reproduce
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
sentence_index = VectorStoreIndex(nodes, storage_context=storage_context)

desired_ids = ['10000032', '10000764']

filters = MetadataFilters(
    filters=[
        MetadataFilter(key="subject_id", operator=FilterOperator.IN, value=desired_ids),
    ],
)
retriever = sentence_index.as_retriever(filters=filters)
retriever.retrieve("Find all patients")
The error is: Unexpected exception formatting exception. Falling back to standard exception
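Since EQ is reported to work on this version, one possible stopgap is to express the IN filter as an OR over EQ clauses. The sketch below only builds a MongoDB-style where dictionary in plain Python; expand_in_to_or is a hypothetical helper, and whether a given version of the integration forwards an OR condition to Chroma is an assumption that would need to be checked.

```python
from typing import Any, Dict, List


def expand_in_to_or(key: str, values: List[Any]) -> Dict[str, Any]:
    """Rewrite 'key IN values' as an OR of equality clauses."""
    if not values:
        raise ValueError("values must contain at least one item")
    clauses = [{key: {"$eq": value}} for value in values]
    if len(clauses) == 1:
        # A lone clause needs no $or wrapper (assumption: $or is meant
        # to combine two or more clauses).
        return clauses[0]
    return {"$or": clauses}


desired_ids = ["10000032", "10000764"]
where = expand_in_to_or("subject_id", desired_ids)
print(where)
```

The same idea could be applied at the filter level by creating one EQ filter per ID and combining them with an OR condition, if the installed version supports that.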
Relevant Logs/Tracebacks