langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

LLMRouterChain uses deprecated predict_and_parse method #6819

Closed amosjyng closed 6 months ago

amosjyng commented 1 year ago

System Info

langchain v0.0.216, Python 3.11.3 on WSL2

Who can help?

@hwchase17

Reproduction

Follow the first example at https://python.langchain.com/docs/modules/chains/foundational/router

Expected behavior

Following it triggers this deprecation warning:

The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.

As suggested by the warning, we can pass the output parser directly to LLMChain by changing this line to:

llm_chain = LLMChain(llm=llm, prompt=prompt, output_parser=prompt.output_parser)

And we can call LLMChain.__call__ instead of LLMChain.predict_and_parse by changing these lines to:

# __call__ runs the chain and, unlike predict_and_parse, validates the outputs
cast(
    Dict[str, Any],
    self.llm_chain(inputs, callbacks=callbacks),
)

Unfortunately, while this avoids the warning, it creates a new error:

ValueError: Missing some output keys: {'destination', 'next_inputs'}

because LLMChain currently assumes the existence of a single self.output_key and produces this as output:

{'text': {'destination': 'physics', 'next_inputs': {'input': 'What is black body radiation?'}}}

Even modifying that function to return the parsed keys directly when the parsed output is a dict triggers the same error, just for the missing key "text" instead. predict_and_parse avoids this fate by skipping output validation entirely.
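
For context, the validation that fires here is roughly the following (an illustrative paraphrase of the 0.0.x output-key check, not the exact source):

# LLMChain wraps the parsed dict under its single output_key ("text"),
# so the router's declared keys are never found at the top level:
outputs = {"text": {"destination": "physics", "next_inputs": {"input": "What is black body radiation?"}}}
missing_keys = {"destination", "next_inputs"} - set(outputs)
if missing_keys:
    raise ValueError(f"Missing some output keys: {missing_keys}")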

It appears changes may have to be a bit more involved here if LLMRouterChain is to keep using LLMChain.
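
In the meantime, one workaround is to subclass LLMRouterChain and do the parsing by hand, so the routing keys are returned at the top level and validation passes. A minimal sketch against v0.0.216 (the PatchedLLMRouterChain name is made up):

from typing import Any, Dict, Optional, cast

from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.router.llm_router import LLMRouterChain


class PatchedLLMRouterChain(LLMRouterChain):
    def _call(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Dict[str, Any]:
        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        # predict() returns the raw completion; parsing it ourselves with the
        # router prompt's own output parser avoids predict_and_parse and its
        # deprecation warning, while still producing the routing dict.
        raw = self.llm_chain.predict(callbacks=_run_manager.get_child(), **inputs)
        return cast(Dict[str, Any], self.llm_chain.prompt.output_parser.parse(raw))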

tvmaly commented 1 year ago

I am seeing the same issue on v0.0.221, Python 3.10.6, Windows 10

AI-Chef commented 1 year ago

I have the same issue on v0.0.229, Python v3.10.12

liaokaime commented 1 year ago

I have the same issue on v0.0.230, Python v3.10.6 Windows 11

xzhang8g commented 1 year ago

same issue here

alexminza commented 1 year ago

langchain v0.0.232

/opt/homebrew/lib/python3.11/site-packages/langchain/chains/llm.py:275: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.

bbirdxr commented 1 year ago

langchain v0.0.235, Python 3.9.17 Windows 10

ellisxu commented 1 year ago

langchain v0.0.240, Python 3.10.10 macOS Ventura

mpearce-bain commented 1 year ago

langchain v0.0.244, Python 3.10.11 Windows 10

ZHANGJUN-OK commented 1 year ago

I am seeing the same issue on v0.0.257 Python 3.9.12 RedHat

prdy20 commented 1 year ago

same issue v0.0.270 python 3.11.3 windows

gtmray commented 1 year ago

Same warning on v0.0.275 python 3.11.3 WSL on Windows 11

ja4h3ad commented 1 year ago

I also get the same error when using this option:

print(compressed_docs[0].page_content)

If I remove the page_content access, I do not get this error. I also do not see it if I use CharacterTextSplitter instead:

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=500)

Here is a sample function where this happens:

from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma


def demo(question):
    # PART ONE: load the source document
    loader = TextLoader('/map/to/document/data.txt', encoding='utf8')
    documents = loader.load()

    # PART TWO: split the document into chunks (you choose how and what size)
    # text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=1000)
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=1000, chunk_overlap=100, separators=[" ", ",", "\n"]
    )
    docs = text_splitter.split_documents(documents)

    # PART THREE: embed the chunks into a persisted ChromaDB
    embedding_function = OpenAIEmbeddings()
    db = Chroma.from_documents(docs, embedding_function, persist_directory='./App')
    db.persist()  # note: persist() is a method call; db.persist alone does nothing

    # PART FOUR: use ChatOpenAI and ContextualCompressionRetriever to return
    # the most relevant part of the documents
    llm = ChatOpenAI(temperature=0)
    compressor = LLMChainExtractor.from_llm(llm)
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=compressor, base_retriever=db.as_retriever()
    )
    compressed_docs = compression_retriever.get_relevant_documents(question)

    print(compressed_docs[0].page_content)
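
For reference, calling the function with any question (the text below is arbitrary) is what surfaces the warning for me:

# Example invocation; the warning appears when LLMChainExtractor
# compresses the retrieved chunks.
demo("What topics does the document cover?")
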
mikeymice commented 1 year ago

y'all got any fix for this? What are we supposed to do?

ton77v commented 1 year ago

Got the same warning while using load_qa_chain with chain_type='map_rerank'.
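
A minimal call that reproduces it, assuming llm and docs are already defined elsewhere, looks like this:

from langchain.chains.question_answering import load_qa_chain

# map_rerank parses and scores each answer with an output parser, which
# reportedly goes through the deprecated parse path in these 0.0.x releases.
chain = load_qa_chain(llm, chain_type="map_rerank")
result = chain({"input_documents": docs, "question": "What is black body radiation?"})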

RoderickVM commented 1 year ago

I posted a very similar issue #10462, using SelfQueryRetriever. I received some sort of solution from the chatbot. However, I don't have a clue how to implement it.

ellisxu commented 1 year ago

> I posted a very similar issue #10462, using SelfQueryRetriever. I received some sort of solution from the chatbot. However, I don't have a clue how to implement it.

I solved it by extending SelfQueryRetriever and overriding the _get_relevant_documents method. Below is an example (in this example I override _aget_relevant_documents because I need async features in my case; you can do the same to _get_relevant_documents):

from typing import List, cast

from langchain.callbacks.manager import AsyncCallbackManagerForRetrieverRun
from langchain.chains.query_constructor.ir import StructuredQuery
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.schema import Document


class AsyncSelfQueryRetriever(SelfQueryRetriever):
    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Asynchronously get documents relevant to a query.
        Args:
            query: String to find relevant documents for
            run_manager: The callbacks handler to use
        Returns:
            List of relevant documents
        """
        inputs = self.llm_chain.prep_inputs({"query": query})

        structured_query = cast(
            StructuredQuery,
            # Instead of calling 'self.llm_chain.predict_and_parse' here, 
            # I changed it to leveraging 'self.llm_chain.prompt.output_parser.parse' 
            # and 'self.llm_chain.apredict'
            # ↓↓↓↓↓↓↓
            self.llm_chain.prompt.output_parser.parse(
                await self.llm_chain.apredict(
                    callbacks=run_manager.get_child(), **inputs
                )
            ),
        )
        if self.verbose:
            print(structured_query)
        new_query, new_kwargs = self.structured_query_translator.visit_structured_query(
            structured_query
        )
        if structured_query.limit is not None:
            new_kwargs["k"] = structured_query.limit

        if self.use_original_query:
            new_query = query

        search_kwargs = {**self.search_kwargs, **new_kwargs}
        docs = await self.vectorstore.asearch(
            new_query, self.search_type, **search_kwargs
        )
        return docs
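
For the sync path, the same substitution without the awaits would look like this (a sketch reusing the imports above; the class name is arbitrary):

from langchain.callbacks.manager import CallbackManagerForRetrieverRun


class SyncSelfQueryRetriever(SelfQueryRetriever):
    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        inputs = self.llm_chain.prep_inputs({"query": query})
        structured_query = cast(
            StructuredQuery,
            # Same two-call substitution: blocking predict(), then parse.
            self.llm_chain.prompt.output_parser.parse(
                self.llm_chain.predict(callbacks=run_manager.get_child(), **inputs)
            ),
        )
        new_query, new_kwargs = self.structured_query_translator.visit_structured_query(
            structured_query
        )
        if structured_query.limit is not None:
            new_kwargs["k"] = structured_query.limit
        if self.use_original_query:
            new_query = query
        search_kwargs = {**self.search_kwargs, **new_kwargs}
        return self.vectorstore.search(new_query, self.search_type, **search_kwargs)
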
a92340a commented 11 months ago

> I posted a very similar issue #10462, using SelfQueryRetriever. I received some sort of solution from the chatbot. However, I don't have a clue how to implement it.

> I solved it by extending SelfQueryRetriever and overriding the _get_relevant_documents method. Below is an example (in this example I override _aget_relevant_documents because I need async features in my case; you can do the same to _get_relevant_documents):

Hi @ellisxu, thanks a lot for sharing! Can you show which packages these are imported from? Thanks!

ellisxu commented 11 months ago

> > I posted a very similar issue #10462, using SelfQueryRetriever. I received some sort of solution from the chatbot. However, I don't have a clue how to implement it.
>
> > I solved it by extending SelfQueryRetriever and overriding the _get_relevant_documents method. Below is an example (in this example I override _aget_relevant_documents because I need async features in my case; you can do the same to _get_relevant_documents):
>
> Hi @ellisxu, thanks a lot for sharing! Can you show which packages these are imported from? Thanks!

The import is from langchain.retrievers.self_query.base import SelfQueryRetriever, and the langchain version I use is 0.0.279.

nickeleres commented 10 months ago

I am still running into this with SelfQueryRetriever using langchain==0.0.302. Is there any resolution here?

ellisxu commented 10 months ago

> I am still running into this with SelfQueryRetriever using langchain==0.0.302. Is there any resolution here?

https://github.com/langchain-ai/langchain/issues/6819#issuecomment-1720942610 Try this. :)

tvmaly commented 6 months ago

This is unfortunate, as this part of LangChain is used in the DeepLearning.AI course.