langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Potential Bug in Retriever.py: Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents #13624

Open abgulati opened 8 months ago

abgulati commented 8 months ago

System Info

LangChain version: 0.0.339
Python version: 3.10.8
OS: Windows 10 Enterprise 21H2

When creating a ConversationalRetrievalChain as follows:

CONVERSATION_RAG_CHAIN_WITH_SUMMARY_BUFFER = ConversationalRetrievalChain(
    combine_docs_chain=combine_docs_chain,
    memory=summary_memory,
    retriever=rag_retriever,
    question_generator=question_generator_chain,
)

With rag_retriever = VectorStoreRetrieverMemory(retriever=VECTOR_STORE.as_retriever())

The following error occurs:

" Traceback (most recent call last): File "C:\Users\om403f\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner self.run() File "C:\Users\om403f\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run self._target(*self._args, self._kwargs) File "C:\Users\om403f\Documents\Applied_Research\Deep_Learning\web_app\app.py", line 1438, in llm_task CONVERSATION_RAG_CHAIN_WITH_SUMMARY_BUFFER = ConversationalRetrievalChain( File "C:\Users\om403f\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\load\serializable.py", line 97, in init super().init(kwargs) File "pydantic\main.py", line 341, in pydantic.main.BaseModel.init pydantic.error_wrappers.ValidationError: 1 validation error for ConversationalRetrievalChain retriever Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error) "

Name mangling may be occurring as described here: https://stackoverflow.com/questions/31457855/cant-instantiate-abstract-class-with-abstract-methods

retriever.py implements the abstract method _get_relevant_documents: https://github.com/langchain-ai/langchain/blob/4eec47b19128fa168e58b9a218a9da049275f6ce/libs/langchain/langchain/schema/retriever.py#L136

Hope this is an accurate bug report and it helps! Apologies if this is in fact a dumb report and actually an error at my end.

Who can help?

No response

Reproduction

from langchain.memory import ConversationSummaryBufferMemory
from langchain.memory import VectorStoreRetrieverMemory
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma
from langchain.chains import LLMChain
from langchain.chains import RetrievalQA
from langchain.chains import ConversationChain
from langchain.chains import StuffDocumentsChain
from langchain.chains import ConversationalRetrievalChain
# Missing from the original snippet; assumed import paths for LangChain 0.0.339.
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import HuggingFaceEmbeddings

# VECTORDB_SBERT_FOLDER is not defined in this snippet.
VECTOR_STORE = Chroma(
    persist_directory=VECTORDB_SBERT_FOLDER,
    embedding_function=HuggingFaceEmbeddings(),
)
LLM = AzureChatOpenAI()

document_prompt = PromptTemplate(
    input_variables=["page_content"],
    template="{page_content}",
)

document_variable_name = "context"

temp_StuffDocumentsChain_prompt = PromptTemplate.from_template(
    "Summarize this content: {context}"
)

llm_chain_for_StuffDocumentsChain = LLMChain(llm=LLM, prompt=temp_StuffDocumentsChain_prompt)

combine_docs_chain = StuffDocumentsChain(
    llm_chain=llm_chain_for_StuffDocumentsChain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name,
)

summary_memory = ConversationSummaryBufferMemory(llm=LLM, max_token_limit=100)

retriever = VECTOR_STORE.as_retriever()

# This wraps the retriever in a memory object, which is then passed as the chain's retriever below.
rag_retriever = VectorStoreRetrieverMemory(retriever=retriever)

temp_template = """
Combine the chat history and question into a standalone question:
Chat history: {chat_history}
question: {user_query}
"""

temp_prompt = PromptTemplate.from_template(temp_template)

question_generator_chain = LLMChain(llm=LLM, prompt=temp_prompt)

CONVERSATION_RAG_CHAIN_WITH_SUMMARY_BUFFER = ConversationalRetrievalChain(
    combine_docs_chain=combine_docs_chain,
    memory=summary_memory,
    retriever=rag_retriever,
    question_generator=question_generator_chain,
)

Expected behavior

Example code here works: https://api.python.langchain.com/en/latest/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html#langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain

dosubot[bot] commented 8 months ago

🤖

Hello @abgulati,

Nice to meet you! I'm here to help you navigate the LangChain repository. I can assist in troubleshooting bugs, answering questions, and guiding you on your journey to becoming a contributor. While you're waiting for a human maintainer, I'm here to help you as best I can.

Thanks for submitting this detailed issue. I'm currently reviewing it and will provide a full answer shortly.

dosubot[bot] commented 8 months ago

🤖

Hello @abgulati! I'm Dosu, a bot designed to help with issues, questions, and contributions related to the LangChain repository. I'll be assisting you while we wait for a human maintainer. Looking forward to finding a solution to your issue together!

Thank you for the detailed bug report, abgulati. The error you're encountering, "Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents", is typically raised when a class that inherits from an abstract base class doesn't provide an implementation for one of the base class's abstract methods.

In your case, it seems like the VectorStoreRetrieverMemory class, which you're using to create rag_retriever, might not be providing an implementation for the _get_relevant_documents method of the BaseRetriever class.
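The general mechanism behind this error message can be reproduced with the standard library alone. The sketch below uses stand-in classes (this BaseRetriever and KeywordRetriever are hypothetical local classes, not LangChain's) to show when Python refuses to instantiate an abstract class and when it allows it.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseRetriever(ABC):
    """Stand-in for an abstract retriever; not LangChain's class."""

    @abstractmethod
    def _get_relevant_documents(self, query: str) -> List[str]:
        """Return documents relevant to the query."""


class KeywordRetriever(BaseRetriever):
    """Concrete subclass: implements the abstract method."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def _get_relevant_documents(self, query: str) -> List[str]:
        # Naive case-insensitive substring match, for illustration only.
        return [d for d in self.documents if query.lower() in d.lower()]


def try_instantiate_base() -> str:
    """Attempt to instantiate the abstract class and return the error text."""
    try:
        BaseRetriever()
    except TypeError as exc:
        return str(exc)
    return ""


# Instantiating the abstract class directly fails with a TypeError.
error_text = try_instantiate_base()
print(error_text)

# Instantiating a subclass that implements the method works.
retriever = KeywordRetriever(["LangChain retrievers", "Chroma vector store"])
print(retriever._get_relevant_documents("chroma"))
```

Note that the message names the class being instantiated. An error that names BaseRetriever itself, as in the traceback above, suggests that something tried to construct BaseRetriever directly rather than a subclass that forgot to implement the method.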

Here's a brief explanation of the role of the _get_relevant_documents method in the BaseRetriever class:

@abstractmethod
def _get_relevant_documents(
    self, query: str, *, run_manager: CallbackManagerForRetrieverRun
) -> List[Document]:
    """Get documents relevant to a query.
    Args:
        query: String to find relevant documents for
        run_manager: The callbacks handler to use
    Returns:
        List of relevant documents
    """

This method is designed to retrieve documents that are relevant to a given query. The query is a string for which relevant documents are to be found. The run_manager is a callback handler that is used during the execution of the method. The method is expected to return a list of Document objects that are relevant to the query.

Based on similar issues in the LangChain repository, it seems like the error might be resolved by ensuring that the retriever objects you're using are instances of a class that is a subclass of BaseRetriever and that this subclass provides an implementation for the _get_relevant_documents method.

Here's an example of how your retriever might look:

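from typing import List

# Assumed import paths for LangChain 0.0.339.
from langchain.callbacks.manager import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
)
from langchain.schema import BaseRetriever, Document


class MyRetriever(BaseRetriever):
    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        # Your implementation here
        pass

    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        # Your implementation here
        pass

retriever = MyRetriever()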

Please replace the pass statement with your own implementation of these methods.

If you're still encountering issues after making these changes, please provide more details about the retriever objects you're using.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

abgulati commented 8 months ago

If the VectorStoreRetrieverMemory class used to create rag_retriever does not implement the _get_relevant_documents method of BaseRetriever, that sounds like a LangChain issue. When I create a VectorStoreRetrieverMemory and pass it a vector store retriever, I doubt I'm expected to define an abstract method of its base class myself!

Per the API docs, only the 'retriever' parameter is required when creating a VectorStoreRetrieverMemory object: https://api.python.langchain.com/en/latest/memory/langchain.memory.vectorstore.VectorStoreRetrieverMemory.html#langchain.memory.vectorstore.VectorStoreRetrieverMemory
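One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: VectorStoreRetrieverMemory is a memory class that wraps a retriever, so it may not itself count as a BaseRetriever when the chain validates its retriever argument. The sketch below uses hypothetical stand-in classes and a hypothetical validate_retriever helper (none of them are LangChain's) to show the difference between passing the wrapper and passing the retriever it wraps.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseRetriever(ABC):
    """Stand-in abstract retriever; not LangChain's class."""

    @abstractmethod
    def _get_relevant_documents(self, query: str) -> List[str]:
        """Return documents relevant to the query."""


class VectorStoreRetriever(BaseRetriever):
    """Stand-in concrete retriever."""

    def __init__(self, documents: List[str]) -> None:
        self.documents = documents

    def _get_relevant_documents(self, query: str) -> List[str]:
        return [d for d in self.documents if query in d]


class RetrieverMemory:
    """Stand-in memory: holds a retriever but does not subclass BaseRetriever."""

    def __init__(self, retriever: BaseRetriever) -> None:
        self.retriever = retriever


def validate_retriever(value: object) -> BaseRetriever:
    """Hypothetical check mimicking a field typed as BaseRetriever."""
    if not isinstance(value, BaseRetriever):
        raise TypeError(f"expected a BaseRetriever, got {type(value).__name__}")
    return value


store_retriever = VectorStoreRetriever(["doc a", "doc b"])
memory = RetrieverMemory(retriever=store_retriever)

# Passing the memory wrapper where a retriever is expected is rejected.
try:
    validate_retriever(memory)
    wrapper_accepted = True
except TypeError:
    wrapper_accepted = False

# Passing the wrapped retriever itself is accepted.
accepted = validate_retriever(memory.retriever)
print(wrapper_accepted, accepted is store_retriever)
```

Under that assumption, the reproduction would pass VECTOR_STORE.as_retriever() to ConversationalRetrievalChain directly as its retriever, rather than the VectorStoreRetrieverMemory object built around it.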

MartinKolbAtWork commented 6 months ago

I have the same issue when creating an instance of ConversationalRetrievalChain and passing it a ChromaDB retriever ("chromadb.as_retriever(...)"). The ChromaDB retriever does implement "_get_relevant_documents".

JoshanSai commented 5 months ago

Try upgrading your langchain to 0.1.2 or higher

dineshpiyasamara commented 3 months ago

Try upgrading your langchain to 0.1.2 or higher

This works for me. Thanks.

CM0804 commented 2 months ago

Try upgrading your langchain to 0.1.2 or higher

Thanks, but I am getting the same error even after upgrading. Any other suggestions? Could you please post your environment's package list (the output of "pip list") if possible?
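For comparing environments, a full "pip list" can be narrowed down to the packages most likely to matter here. A minimal sketch using only the standard library; the installed_versions helper is hypothetical, and the chosen prefixes (langchain, pydantic, chromadb) are an assumption about what is relevant to this error.

```python
from importlib import metadata
from typing import Dict, Tuple


def installed_versions(
    prefixes: Tuple[str, ...] = ("langchain", "pydantic", "chromadb"),
) -> Dict[str, str]:
    """Return {package name: version} for installed packages matching the prefixes."""
    versions: Dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # Skip distributions with missing metadata; match names case-insensitively.
        if name and name.lower().startswith(prefixes):
            versions[name] = dist.version
    return versions


# Print in the same "name==version" form that "pip freeze" uses.
for name, version in sorted(installed_versions().items()):
    print(f"{name}=={version}")
```

Posting this shorter list alongside the traceback makes it easier to compare a working environment with a failing one.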

Gpapidas commented 2 months ago

Getting the same error with langchain 0.1.20

@dosu-bot, any input?

rajharsh33 commented 1 month ago

Try upgrading your langchain to 0.1.2 or higher

Thanks, it worked!