langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

metaclass conflict error when trying to set up a custom retriever #17796

Closed dokato closed 3 months ago

dokato commented 6 months ago

Example Code

The code comes from the blog post at https://redis.com/blog/build-ecommerce-chatbot-with-redis/.

import json
from langchain.schema import BaseRetriever
from langchain.vectorstores import VectorStore
from langchain.schema import Document
from pydantic import BaseModel

class RedisProductRetriever(BaseRetriever, BaseModel):
    vectorstore: VectorStore

    class Config:
        arbitrary_types_allowed = True

    def combine_metadata(self, doc) -> str:
        metadata = doc.metadata
        return (
           "Item Name: " + metadata["item_name"] + ". " +
           "Item Description: " + metadata["bullet_point"] + ". " +
           "Item Keywords: " + metadata["item_keywords"] + "."
        )

    def get_relevant_documents(self, query):
        docs = []
        for doc in self.vectorstore.similarity_search(query):
            content = self.combine_metadata(doc)
            docs.append(Document(
                page_content=content,
                metadata=doc.metadata
            ))

        return docs

Error Message and Stack Trace (if applicable)

The class definition itself fails with:

TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Description

I'm looking for a way to create a custom retriever, but the instructions in the https://redis.com/blog/build-ecommerce-chatbot-with-redis/ blog post don't work.

System Info

langchain==0.1.4
langchain-community==0.0.16
langchain-core==0.1.16
langchain-openai==0.0.5

Python 3.11.7 macOS

dosubot[bot] commented 6 months ago

🤖

Hey @dokato, good to see you back in the LangChain realm! Hope you're ready to tame another digital challenge. 🚀

It seems like you're encountering a metaclass conflict error caused by multiple inheritance from BaseRetriever and BaseModel. Python raises this error when a class inherits from two or more classes whose metaclasses are unrelated, meaning neither metaclass is a subclass of the other.

In your case, BaseRetriever and BaseModel have different metaclasses, hence the conflict. BaseRetriever is a part of LangChain and is designed to be the base class for all retrievers in the framework. On the other hand, BaseModel is a part of Pydantic, a data validation library, and is used to define data models.
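The mechanism can be reproduced without LangChain or Pydantic. The sketch below is only an illustration: MetaA, MetaB, BaseA, and BaseB are hypothetical stand-ins for two unrelated class hierarchies, not the actual classes involved in this issue.

```python
# Two unrelated metaclasses (hypothetical names), standing in for the
# metaclasses of two independent base-class hierarchies.
class MetaA(type):
    pass


class MetaB(type):
    pass


class BaseA(metaclass=MetaA):
    pass


class BaseB(metaclass=MetaB):
    pass


def try_combine():
    """Attempt to inherit from both bases; return the error text, if any."""
    try:
        class Combined(BaseA, BaseB):
            pass
    except TypeError as exc:
        return str(exc)
    return None


conflict_message = try_combine()
print(conflict_message)
```

Python cannot pick a single metaclass for Combined because neither MetaA nor MetaB derives from the other, so the class statement itself raises the TypeError.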

A similar issue was reported in the LangChain repository: "Retrievers inheriting from BaseRetriever are incompatible with typing.Protocol". The suggested solution was to use a Protocol to define the methods and attributes that a retriever should have, instead of using multiple inheritance.

Here's an example of how you can define a protocol for retrievers:

from typing import Protocol, List
from langchain.schema.document import Document
from langchain.callbacks.manager import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
)

class RetrieverProtocol(Protocol):
    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        ...

    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        ...

In this protocol, we've defined the _get_relevant_documents and _aget_relevant_documents methods that a retriever should have. Any class that is used as a retriever should implement these methods.

Then, instead of inheriting from BaseRetriever, your retriever classes would just need to implement the RetrieverProtocol. This way, you avoid the issues with multiple inheritance and reduce coupling in your code base.
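As a rough sketch of what "implementing the protocol" means in practice, the example below uses only the standard library. Doc, RetrieverLike, KeywordRetriever, and first_hit are hypothetical names made up for illustration, and the protocol is simplified to a single synchronous method with no run_manager argument.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol, runtime_checkable


@dataclass
class Doc:
    """Hypothetical stand-in for a document type; not LangChain's Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class RetrieverLike(Protocol):
    """Simplified protocol: anything with this method counts as a retriever."""

    def get_relevant_documents(self, query: str) -> List[Doc]:
        ...


class KeywordRetriever:
    """Satisfies RetrieverLike structurally; it does not inherit from it."""

    def __init__(self, docs: List[Doc]) -> None:
        self.docs = docs

    def get_relevant_documents(self, query: str) -> List[Doc]:
        # Naive case-insensitive substring match, used only for illustration.
        return [d for d in self.docs if query.lower() in d.page_content.lower()]


def first_hit(retriever: RetrieverLike, query: str) -> str:
    """Accepts any object that matches the protocol."""
    hits = retriever.get_relevant_documents(query)
    return hits[0].page_content if hits else ""


retriever = KeywordRetriever([Doc("Red running shoes"), Doc("Blue denim jacket")])
is_retriever = isinstance(retriever, RetrieverLike)
top_result = first_hit(retriever, "shoes")
print(is_retriever, top_result)
```

One caveat: an object that only satisfies a Protocol is not a BaseRetriever instance, so LangChain components that expect a BaseRetriever may not accept it.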

Please note that this response is not meant to be a precise solution, but rather a starting point for your own research. If you have further questions or need more clarification, feel free to ask.

dokato commented 6 months ago

Thanks. How do I use this protocol to call get_relevant_documents? When I try to instantiate it, I get:

TypeError: Protocols cannot be instantiated
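
For context, this error is expected: a Protocol class only describes an interface and cannot itself be instantiated. The sketch below reproduces the error with the standard library alone and then instantiates a concrete class instead. The protocol here is a simplified, hypothetical one that returns strings, and EchoRetriever is a made-up name.

```python
from typing import List, Protocol


class RetrieverProtocol(Protocol):
    """Simplified, hypothetical protocol with one synchronous method."""

    def get_relevant_documents(self, query: str) -> List[str]:
        ...


def instantiate_protocol() -> str:
    """Try to instantiate the protocol directly and return the error text."""
    try:
        RetrieverProtocol()
    except TypeError as exc:
        return str(exc)
    return "no error"


class EchoRetriever(RetrieverProtocol):
    """Concrete class: subclasses the protocol and provides the method body."""

    def get_relevant_documents(self, query: str) -> List[str]:
        return [f"result for {query}"]


error_text = instantiate_protocol()
results = EchoRetriever().get_relevant_documents("shoes")
print(error_text)
print(results)
```

The concrete class is the thing to instantiate and call; the protocol is used only in type annotations.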