langchain-ai / langchain-postgres

LangChain abstractions backed by Postgres Backend
MIT License
134 stars, 48 forks

$contain comparator not working with SelfQueryRetriever #87

Open imvaibhav28 opened 4 months ago

imvaibhav28 commented 4 months ago

Hi There,

I am getting the following error:

ValueError: Invalid operator: $contain. Expected one of {'$gte', '$ilike', '$and', '$ne', '$lt', '$eq', '$nin', '$exists', '$gt', '$in', '$between', '$lte', '$not', '$or', '$like'}

Query -> What are people talking about topicA in London with greater than 100 likes?

PARAMS -> operator=<Operator.AND: 'and'> arguments=[Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='post_tags_internal', value='topicA'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='post_city', value='London'), Comparison(comparator=<Comparator.GT: 'gt'>, attribute='post_twitter_likes', value=100)]

I am using SelfQueryRetriever to fill in the metadata fields. However, the contain comparator is not supported by PGVector.

I checked the VectorStore from langchain-postgres, and it looks like the contain comparator has been deprecated.

Is there a workaround for this issue?

imvaibhav28 commented 4 months ago

I tried restricting the comparators to those listed in the error message when creating the retriever chain, but that doesn't work either.

eyurtsev commented 4 months ago

cc @pprados are you able to take a look at this?

eyurtsev commented 4 months ago

This operator is not currently implemented in the store. We should either drop it from the list of supported operators, or someone will need to add support for contains.

To add support for the contains operator, we need a definition of the semantics of containment. Is it any different from $in or $like?
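To make the question concrete, here is a minimal pure-Python sketch of how the three operators could differ. The helper names (`op_in`, `op_like`, `op_contain`) are hypothetical and are not part of langchain-postgres. The `contain` reading shown here (list membership or substring) is only one possible definition, which is exactly the point that still needs to be decided.

```python
import re


def op_in(field_value, candidates):
    """$in: the stored scalar equals one of the candidate values."""
    return field_value in candidates


def op_like(field_value, pattern):
    """$like: SQL LIKE match, where % is any run of characters and _ is one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, field_value, flags=re.DOTALL) is not None


def op_contain(field_value, item):
    """One possible reading of contain: membership for lists, substring for strings."""
    return item in field_value


# The same tags stored two ways (both layouts are assumptions for illustration).
tags_list = ["topicA", "topicB"]
tags_text = "topicA,topicB"

in_scalar = op_in("London", ["London", "Paris"])      # field is one of the candidates
in_tags = op_in(tags_text, ["topicA"])                # whole field must equal a candidate
like_wild = op_like(tags_text, "%topicA%")            # caller supplies the wildcards
like_plain = op_like(tags_text, "topicA")             # no wildcards means exact match
contain_list = op_contain(tags_list, "topicA")        # element of the list
contain_partial_list = op_contain(tags_list, "topic") # not an element
contain_partial_text = op_contain(tags_text, "topic") # but a substring of the text
```

Under this reading, `$in` inverts the roles (the field is tested against a list of values), `$like` needs the caller to write the pattern, and `contain` behaves differently depending on whether the metadata field is an array or a string.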

imvaibhav28 commented 4 months ago

Hi @eyurtsev ,

Thank you for your reply. If this support is dropped entirely from langchain-postgres, would allowed_comparators have to be set manually for SelfQueryRetriever going forward?

eyurtsev commented 4 months ago

How are you using contains? Is it different from like or in?

imvaibhav28 commented 3 months ago

Hi @eyurtsev ,

Here is my use case:

I use SelfQueryRetriever as follows:

    SelfQueryRetriever.from_llm(
        llm,
        vector_store,
        document_content_description,
        metadata_field_info,
        verbose=True,
        use_original_query=True,
        fix_invalid=True,
    )

My guess is that LangChain core rewrites the query using a set of comparators in which 'contain' is valid. But when the rewritten query is sent to Postgres (PGVector) for similarity search, it fails because the store does not accept that comparator.
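One possible stopgap, sketched below, is to rewrite the filter before it reaches the store so that every `$contain` becomes a `$like` substring match. This is not an API of langchain-postgres: `rewrite_contain` and `escape_like` are hypothetical helpers, the filter shape is assumed from the PARAMS shown earlier in this issue, and the approach only makes sense if the metadata field is stored as text and substring matching is acceptable. The escaping assumes backslash is the LIKE escape character. Where the rewrite gets hooked in (for example, in a custom query translator) is left open.

```python
def escape_like(value: str) -> str:
    """Escape LIKE wildcards so the value is matched literally."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def rewrite_contain(filter_):
    """Return a copy of the filter with each $contain replaced by a $like substring match."""
    if isinstance(filter_, list):
        return [rewrite_contain(item) for item in filter_]
    if not isinstance(filter_, dict):
        return filter_  # scalar value: nothing to rewrite
    rewritten = {}
    for key, value in filter_.items():
        if key == "$contain":
            rewritten["$like"] = f"%{escape_like(str(value))}%"
        else:
            rewritten[key] = rewrite_contain(value)
    return rewritten


# Assumed dict form of the structured query from this issue.
structured_filter = {
    "$and": [
        {"post_tags_internal": {"$contain": "topicA"}},
        {"post_city": {"$eq": "London"}},
        {"post_twitter_likes": {"$gt": 100}},
    ]
}

pgvector_filter = rewrite_contain(structured_filter)
```

The rewritten filter uses only operators that appear in the "Expected one of" set from the error message, and the original dict is left unmodified.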

Here is the error traceback in case it helps.

`File "/Users/vu/Developer/agents/proj/src/proj/base_agents/langgraph_agent_artifacts/langgraph_v1.py", line 133, in retriever_node
    response = response = retriever.invoke(reframed_query)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_core/retrievers.py", line 251, in invoke
    raise e
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_core/retrievers.py", line 244, in invoke
    result = _get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain/retrievers/query/base.py", line 269, in _get_relevant_documents
    docs = _get_docs_with_query(new_query, search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain/retrievers/query/base.py", line 243, in _get_docs_with_query
    docs = vectorstore.search(query, search_type, search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_core/vectorstores/base.py", line 337, in search
    return similarity_search(query, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 896, in similarity_search
    return similarity_search_by_vector(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 1450, in similarity_search_by_vector
    docs_and_scores = similarity_search_with_score_by_vector(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 994, in similarity_search_with_score_by_vector
    results = query_collection(embedding=embedding, k=k, filter=filter)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 1361, in query_collection
    filter_clauses = _create_filter_clause(filter)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 1264, in _create_filter_clause
    return _handle_field_filter(key, filters[key])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/vu/Developer/agents/proj/.venv/lib/python3.12/site-packages/langchain_postgres/vectorstores.py", line 1072, in _handle_field_filter
    raise ValueError(
ValueError: Invalid operator: $contain. Expected one of {'$not', '$eq', '$lte', '$gt', '$gte', '$lt', '$between', '$exists', '$in', '$and', '$ilike', '$ne', '$or', '$nin', '$like'} `