langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

PGVector filtering operator $nin causes Error #21694

Open n1k8-dev opened 1 month ago

n1k8-dev commented 1 month ago

Example Code

    result = vectorstore.similarity_search_with_score(
        query,
        k=25,
        filter={
            "$and": [
                {"type": "News"},
                {"city": {"$in": ["New York", "Chicago"]}},
                {"topic": {"$nin": ["Sports", "Politics"]}},
            ]
        },
    )

Error Message and Stack Trace (if applicable)

    result = vectorstore.similarity_search_with_score(query, k=25, filter={"$and": [{"type": "News"}, {"city": {"$in": ["New York", "Chicago"]}}, {"topic": {"$nin": ["Sports", "Politics"]}}]})
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 572, in similarity_search_with_score
    docs = self.similarity_search_with_score_by_vector(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 597, in similarity_search_with_score_by_vector
    results = self.query_collection(embedding=embedding, k=k, filter=filter)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 911, in query_collection
    filter_clauses = self._create_filter_clause(filter)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 845, in _create_filter_clause
    and_ = [self._create_filter_clause(el) for el in value]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/langchain_community/vectorstores/pgvector.py", line 837, in _create_filter_clause
    return self._handle_field_filter(key, filters[key])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/app/engine/assistant.py", line 272, in _handle_field_filter
    return queried_field.nin_([str(val) for val in filter_value])
           ^^^^^^^^^^^^^^^^^^
  File "/Users/ng/workspace/dev/chatbot/venv/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 1498, in __getattr__
    raise AttributeError(
AttributeError: Neither 'BinaryExpression' object nor 'Comparator' object has an attribute 'nin_'. Did you mean: 'in_'?

Description

I am trying to run a similarity search against a PGVector vector store using a "not in" ($nin) filter on the metadata. This raises an AttributeError.
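For reference, the result I expect from the filter in the example can be sketched in plain Python. This is only an illustration: the `matches` helper and the sample documents are hypothetical, it is not how PGVector evaluates filters, and it assumes that a document missing a filtered key never matches.

```python
from typing import Any


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Hypothetical in-memory check for equality, $and, $in and $nin only."""
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter in the list must hold.
            if not all(matches(metadata, sub) for sub in cond):
                return False
            continue
        if key not in metadata:
            # Assumption: a missing key never matches.
            return False
        value = metadata[key]
        if isinstance(cond, dict):
            for op, operand in cond.items():
                if op == "$in":
                    ok = value in operand
                elif op == "$nin":
                    ok = value not in operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != cond:
            # A bare value means plain equality.
            return False
    return True


flt = {
    "$and": [
        {"type": "News"},
        {"city": {"$in": ["New York", "Chicago"]}},
        {"topic": {"$nin": ["Sports", "Politics"]}},
    ]
}

# Made-up sample metadata, one document per failure mode.
docs = [
    {"type": "News", "city": "New York", "topic": "Tech"},   # should match
    {"type": "News", "city": "Chicago", "topic": "Sports"},  # excluded by $nin
    {"type": "Blog", "city": "Chicago", "topic": "Tech"},    # wrong type
    {"type": "News", "city": "Boston", "topic": "Tech"},     # city not in $in
]

selected = [d for d in docs if matches(d, flt)]
```

With these sample documents only the first one should be selected; the other three each fail exactly one clause of the `$and`.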

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.4.0: Fri Mar 15 00:11:05 PDT 2024; root:xnu-10063.101.17~1/RELEASE_X86_64
Python Version: 3.12.3 (v3.12.3:f6650f9ad7, Apr 9 2024, 08:18:48) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.1.40
langchain: 0.1.14
langchain_community: 0.0.31
langsmith: 0.1.40
langchain_anthropic: 0.1.5
langchain_experimental: 0.0.56
langchain_openai: 0.1.1
langchain_text_splitters: 0.0.1

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

n1k8-dev commented 1 month ago

On further analysis, SQLAlchemy has no nin_ operator. It does, however, have not_in and notin_.

Changing nin_ to not_in on line 707 of pgvector.py fixed the problem.
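A minimal sketch of the operators involved, assuming SQLAlchemy 1.4 or newer is installed. The `topic` column here is a hypothetical stand-in for the metadata field expression that pgvector.py builds; this is not the library's code, only a way to see which spellings exist.

```python
from sqlalchemy import column

# Hypothetical stand-in for the queried metadata field.
topic = column("topic")
excluded = ["Sports", "Politics"]

# SQLAlchemy expressions expose in_ but no nin_, hence the AttributeError.
has_nin = hasattr(topic, "nin_")

# Three spellings that all produce a SQL "NOT IN" expression:
expr_not_in = topic.not_in(excluded)   # name introduced in SQLAlchemy 1.4
expr_notin = topic.notin_(excluded)    # older name, kept for compatibility
expr_negated = ~topic.in_(excluded)    # negating in_ gives the same operator

print(has_nin)
print(expr_not_in)
```

Any of the three should work as a replacement for the failing call; negating `in_` has the small advantage of not depending on which of the two method names a given SQLAlchemy version prefers.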

I noticed the same issue in the latest versions of langchain and langchain_postgres.