langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.04k stars 14.65k forks

Issue with chain output_schema when a RunnableParallel is invoked with assign #17460

Closed sayan closed 1 month ago

sayan commented 6 months ago

Checked other resources

Example Code

The following code:

from langchain_core.runnables import RunnableLambda, RunnableParallel
from langchain.prompts import ChatPromptTemplate

chain1 = ChatPromptTemplate.from_template("tell me a joke about {topic}")
chain2 = ChatPromptTemplate.from_template("write a short (2 line) poem about {topic}")

def test(input) -> int:
    print(input)
    return 3

combined = RunnableParallel(joke=chain1, poem=chain2).assign(x=RunnableLambda(test))

Invocation works correctly: if you run combined.invoke({'topic': "love"}) you get

{'joke': ChatPromptValue(messages=[HumanMessage(content='tell me a joke about love')]),
 'poem': ChatPromptValue(messages=[HumanMessage(content='write a short (2 line) poem about love')]),
 'x': 3}

However, if you check the output schema as follows:

combined.output_schema.schema()

the output is:

{'title': 'RunnableSequenceOutput',
 'type': 'object',
 'properties': {'topic': {'title': 'Topic', 'type': 'string'},
  'x': {'title': 'X', 'type': 'integer'}}}

The joke and poem fields are missing from the output schema; only the input key topic and the assigned x survive. This impacts the LangServe API output for the chain as well.

Error Message and Stack Trace (if applicable)

No response

Description

The output schema of the runnable is missing fields when RunnableParallel is used in conjunction with assign.

System Info

langchain = "0.1.6" python = "^3.11"

dosubot[bot] commented 6 months ago

🤖

Hey @sayan, great to see you diving into LangChain again! Hope you've been well.

Thank you for bringing this to our attention. Based on the code you've provided and the behavior you're experiencing, it seems like there might be an issue with how the output schema is generated when using RunnableParallel in conjunction with assign in LangChain version 0.1.6.

In LangChain, the output schema of a RunnableSequence is determined by the _seq_output_schema function. If the last step in the sequence is an instance of RunnableAssign, the output schema is a combination of the output schema of the previous steps and the output schema of the mapper in the RunnableAssign step. If the 'joke' field is not present in either the output schema of the previous steps or the output schema of the mapper, it will not be included in the final output schema.

Here's the relevant code from the _seq_output_schema function:

elif isinstance(last, RunnableAssign):
    mapper_output_schema = last.mapper.get_output_schema(config)
    prev_output_schema = _seq_output_schema(steps[:-1], config)
    if not prev_output_schema.__custom_root_type__:
        # it's a dict as expected
        return create_model(  # type: ignore[call-overload]
            "RunnableSequenceOutput",
            **{
                **{
                    k: (v.annotation, v.default)
                    for k, v in prev_output_schema.__fields__.items()
                },
                **{
                    k: (v.annotation, v.default)
                    for k, v in mapper_output_schema.__fields__.items()
                },
            },
            __config__=_SchemaConfig,
        )

To resolve this issue, you need to ensure that the 'joke' field is included in the output schema of either the previous steps or the mapper in the RunnableAssign step.
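The failure mode can be illustrated with plain pydantic (the model names below are hypothetical stand-ins, not LangChain internals): if the schema inferred for the previous steps only carries the input key topic rather than joke/poem, the merged model inherits that gap.

```python
# Hedged sketch of the merge pattern above, using plain pydantic.
# PrevOutput / MapperOutput are made-up stand-ins for the two schemas
# being combined; the point is that create_model only sees the fields
# it is handed, so a field absent from its inputs is absent from the result.
from pydantic import create_model

PrevOutput = create_model("PrevOutput", topic=(str, ...))   # 'joke'/'poem' absent
MapperOutput = create_model("MapperOutput", x=(int, ...))

merged_fields = {
    **{k: (f.annotation, ...) for k, f in PrevOutput.model_fields.items()},
    **{k: (f.annotation, ...) for k, f in MapperOutput.model_fields.items()},
}
Merged = create_model("RunnableSequenceOutput", **merged_fields)

print(sorted(Merged.model_fields))  # ['topic', 'x'] -- no 'joke', matching the bug
```

This mirrors the shape of the merged schema reported in the issue: whatever the previous-steps schema lacks, the combined RunnableSequenceOutput lacks too.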

I hope this helps! If you have any further questions or need additional clarification, please don't hesitate to ask.

lmurawsk commented 4 months ago

So in the case of this question, fixing the problem requires something like this:

# class that defines the expected output types
from pydantic import BaseModel

class OutputType(BaseModel):
    joke: str
    poem: str
    x: str  # note: test() above returns an int, so x: int may be more accurate

combined = (
    RunnableParallel(joke=chain1, poem=chain2)
    .assign(x=RunnableLambda(test))
    .with_types(output_type=OutputType)
)

Then combined.output_schema.schema() is as it should be:

{'title': 'OutputType',
 'type': 'object',
 'properties': {'joke': {'title': 'Joke', 'type': 'string'},
  'poem': {'title': 'Poem', 'type': 'string'},
  'x': {'title': 'X', 'type': 'string'}},
 'required': ['joke', 'poem', 'x']}

and, most importantly, chain responses served via LangServe will be correct!

I spent too much time trying to figure this out. This should be mentioned on the main docs page: https://python.langchain.com/docs/expression_language/primitives/assign/