langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

ValueError: Self query retriever with Vector Store type <class 'langchain_pinecone.vectorstores.PineconeVectorStore'> not supported. #19418

Closed EdIzaguirre closed 2 days ago

EdIzaguirre commented 5 months ago


Example Code

import os
from pinecone import Pinecone
from dotenv import load_dotenv
load_dotenv()

# Create empty index
PINECONE_KEY, PINECONE_INDEX_NAME = os.getenv(
    'PINECONE_API_KEY'), os.getenv('PINECONE_INDEX_NAME')

pc = Pinecone(api_key=PINECONE_KEY)

from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

embeddings = OpenAIEmbeddings()
# create new index
# pc.create_index(
#     name="film-bot-index",
#     dimension=1536,
#     metric="cosine",
#     spec=PodSpec(
#         environment="gcp-starter"
#     )
# )

# Target index and check status
index_name = "film-bot-index"
pc_index = pc.Index(index_name)
docs = [
    Document(
        page_content="A bunch of scientists bring back dinosaurs and mayhem breaks loose",
        metadata={"year": 1993, "rating": 7.7,
                  "genre": ["action", "science fiction"]},
    ),
    Document(
        page_content="Leo DiCaprio gets lost in a dream within a dream within a dream within a ...",
        metadata={"year": 2010, "director": "Christopher Nolan", "rating": 8.2},
    ),
    Document(
        page_content="A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea",
        metadata={"year": 2006, "director": "Satoshi Kon", "rating": 8.6},
    ),
    Document(
        page_content="A bunch of normal-sized women are supremely wholesome and some men pine after them",
        metadata={"year": 2019, "director": "Greta Gerwig", "rating": 8.3},
    ),
    Document(
        page_content="Toys come alive and have a blast doing so",
        metadata={"year": 1995, "genre": "animated"},
    ),
    Document(
        page_content="Three men walk into the Zone, three men walk out of the Zone",
        metadata={
            "year": 1979,
            "director": "Andrei Tarkovsky",
            "genre": ["science fiction", "thriller"],
            "rating": 9.9,
        },
    ),
]
vectorstore = PineconeVectorStore.from_documents(
    docs, embeddings, index_name=index_name
)

from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_openai import OpenAI

metadata_field_info = [
    AttributeInfo(
        name="genre",
        description="The genre of the movie",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="year",
        description="The year the movie was released",
        type="integer",
    ),
    AttributeInfo(
        name="director",
        description="The name of the movie director",
        type="string",
    ),
    AttributeInfo(
        name="rating", description="A 1-10 rating for the movie", type="float"
    ),
]
document_content_description = "Brief summary of a movie"
llm = OpenAI(temperature=0)
retriever = SelfQueryRetriever.from_llm(
    llm, vectorstore, document_content_description, metadata_field_info, verbose=True
)

Error Message and Stack Trace (if applicable)


ValueError                                Traceback (most recent call last)
Cell In[3], line 27
     25 document_content_description = "Brief summary of a movie"
     26 llm = OpenAI(temperature=0)
---> 27 retriever = SelfQueryRetriever.from_llm(
     28     llm, vectorstore, document_content_description, metadata_field_info, verbose=True
     29 )

File ~/miniconda3/envs/FilmBot/lib/python3.12/site-packages/langchain/retrievers/self_query/base.py:227, in SelfQueryRetriever.from_llm(cls, llm, vectorstore, document_contents, metadata_field_info, structured_query_translator, chain_kwargs, enable_limit, use_original_query, **kwargs)
    213 @classmethod
    214 def from_llm(
    215     cls,
   (...)
    224     **kwargs: Any,
    225 ) -> "SelfQueryRetriever":
    226     if structured_query_translator is None:
--> 227         structured_query_translator = _get_builtin_translator(vectorstore)
    228     chain_kwargs = chain_kwargs or {}
    230     if (
    231         "allowed_comparators" not in chain_kwargs
    232         and structured_query_translator.allowed_comparators is not None
    233     ):

File ~/miniconda3/envs/FilmBot/lib/python3.12/site-packages/langchain/retrievers/self_query/base.py:101, in _get_builtin_translator(vectorstore)
     98 except ImportError:
     99     pass
--> 101 raise ValueError(
    102     f"Self query retriever with Vector Store type {vectorstore.__class__}"
    103     f" not supported."
    104 )

ValueError: Self query retriever with Vector Store type <class 'langchain_pinecone.vectorstores.PineconeVectorStore'> not supported.

Description

I am trying to create a self-querying retriever using the Pinecone database. The documentation suggests that Pinecone is supported, but with PineconeVectorStore from langchain_pinecone it is not. I hope support hasn't been dropped for Chroma DB as well. The code provided above is lightly modified from the documentation (see here).
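The traceback suggests that _get_builtin_translator picks a translator based on the vector store's class and raises when nothing matches. Below is a minimal, self-contained sketch of that failure mode, assuming a lookup keyed on the exact class. Every class, the table, and the function here are hypothetical stand-ins written for illustration, not langchain's actual code.

```python
class LegacyPinecone:
    """Hypothetical stand-in for the older vector store class."""


class PineconeVectorStore:
    """Hypothetical stand-in for the class from the newer partner package."""


class PineconeTranslator:
    """Hypothetical stand-in for a structured-query translator."""


# Assumption: the lookup is keyed on the exact class, and only the
# older class has been registered.
BUILTIN_TRANSLATORS = {LegacyPinecone: PineconeTranslator}


def get_builtin_translator(vectorstore):
    """Return a translator for the store, or raise if its class is unknown."""
    translator_cls = BUILTIN_TRANSLATORS.get(vectorstore.__class__)
    if translator_cls is None:
        raise ValueError(
            f"Self query retriever with Vector Store type {vectorstore.__class__}"
            f" not supported."
        )
    return translator_cls()


# The registered class resolves; the similarly named new class does not.
translator = get_builtin_translator(LegacyPinecone())
try:
    get_builtin_translator(PineconeVectorStore())
    lookup_failed = False
except ValueError:
    lookup_failed = True
```

Under this assumption, two classes that talk to the same database are still distinct keys, which would explain why a store from a different package is rejected even though the database itself is supported.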

System Info

langchain==0.1.13
langchain-community==0.0.29
langchain-core==0.1.33
langchain-experimental==0.0.54
langchain-openai==0.0.8
langchain-pinecone==0.0.3
langchain-text-splitters==0.0.1

Mac

Python Version 3.12.2

igalmarino commented 5 months ago

I had the same problem, but after changing:

from langchain_pinecone import PineconeVectorStore
vectorstore = PineconeVectorStore(index_name=index_name, embedding=embeddings)

to:

from langchain.vectorstores import Pinecone
vectorstore = Pinecone.from_existing_index(index_name=index_name, embedding=embeddings)

it works fine.
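The traceback in the original report shows that from_llm only calls _get_builtin_translator when structured_query_translator is None, so passing a translator explicitly may be another way to avoid the failing lookup. The sketch below imitates that control flow with hypothetical stand-ins (StubTranslator, UnregisteredStore, and this simplified from_llm are not langchain code), purely to show the branch being skipped.

```python
class StubTranslator:
    """Hypothetical stand-in for a structured-query translator."""


class UnregisteredStore:
    """Hypothetical stand-in for a vector store the lookup does not know."""


def _get_builtin_translator(vectorstore):
    # Stand-in for the lookup: in this sketch it rejects every store.
    raise ValueError(
        f"Self query retriever with Vector Store type {vectorstore.__class__}"
        f" not supported."
    )


def from_llm(vectorstore, structured_query_translator=None):
    """Simplified imitation of the branch shown in the traceback."""
    if structured_query_translator is None:
        # Only reached when the caller did not supply a translator.
        structured_query_translator = _get_builtin_translator(vectorstore)
    return {"vectorstore": vectorstore, "translator": structured_query_translator}


store = UnregisteredStore()

# Without a translator, the built-in lookup runs and raises.
try:
    from_llm(store)
    default_path_failed = False
except ValueError:
    default_path_failed = True

# With an explicit translator, the lookup is never called.
retriever = from_llm(store, structured_query_translator=StubTranslator())
```

With langchain itself, the equivalent would be passing structured_query_translator=... to SelfQueryRetriever.from_llm. Which translator class is appropriate for PineconeVectorStore is not confirmed anywhere in this thread, so check that against the installed version.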

alexminza commented 5 months ago

Support for PineconeVectorStore should also be added, as this is the officially documented way to instantiate the vector store according to the example code: https://python.langchain.com/docs/integrations/vectorstores/pinecone/

ValueError: Self query retriever with Vector Store type <class 'langchain_pinecone.vectorstores.PineconeVectorStore'> not supported.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 12
      ##REDACTED##

File /opt/homebrew/lib/python3.11/site-packages/langchain/retrievers/self_query/base.py:227, in SelfQueryRetriever.from_llm(cls, llm, vectorstore, document_contents, metadata_field_info, structured_query_translator, chain_kwargs, enable_limit, use_original_query, **kwargs)
    213 @classmethod
    214 def from_llm(
    215     cls,
   (...)
...
    102     f"Self query retriever with Vector Store type {vectorstore.__class__}"
    103     f" not supported."
    104 )


csmizzle commented 4 months ago

Also hitting this issue. Confusing, since it's in the official docs.

gardner commented 3 months ago

This is also happening with the Chroma examples in the docs, but this seems to work:

from langchain_community.vectorstores import Chroma

vectorstore = Chroma(
    client=client,
    collection_name=collection_name,
    embedding_function=embedding_function,
)
zhaobu commented 3 months ago

This is happening with the Chroma examples in the docs.

But this seems to work:

from langchain_community.vectorstores import Chroma

vectorstore = Chroma(
    client=client,
    collection_name=collection_name,
    embedding_function=embedding_function,
)

Great, just change "from langchain_chroma import Chroma" to "from langchain_community.vectorstores import Chroma" and it works.

aviral49 commented 3 months ago

I am facing the same issue with PGVector as well.

ValueError: Self query retriever with Vector Store type <class 'langchain_postgres.vectorstores.PGVector'> not supported.