langchain-ai / langchain


New mypy type error from PydanticOutputParser #20634

Open rpgoldman opened 3 months ago

rpgoldman commented 3 months ago


Example Code

The following code passed mypy checks until I updated langchain_core from 0.1.31 to 0.1.44:

from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.language_models import BaseChatModel

class InputSourceResponse(BaseModel):
    """
    Response for Input Source Query
    """
    input_sources: dict[str, str] = Field(
        description=('Input source dictionary for the provided function where '
                     'key is <SUPERTYPE> and value is <SUBTYPE> '
                     '(set <SUBTYPE> to NONE if no subtype exists).'
                     )
    )
    explanation: str = Field(
        description='Explanation for input sources for the provided function')

def do_parser(llm: BaseChatModel, input: str) -> InputSourceResponse:
    parser = PydanticOutputParser(pydantic_object=InputSourceResponse)
    res: InputSourceResponse = (llm | parser).invoke(input)
    return res

Now I get this mypy error message:

lacrosse_llm/langchain_bug.py:19: error: Value of type variable "TBaseModel" of "PydanticOutputParser" cannot be "InputSourceResponse"  [type-var]
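
For readers unfamiliar with this class of error: mypy reports `type-var` when the type it infers for a bounded `TypeVar` does not satisfy the bound. The sketch below reproduces the same kind of message without LangChain. The classes `Base`, `Good`, `Unrelated`, and `Parser` are hypothetical stand-ins, and the mypy outcome noted in the comments is an expectation, not captured output.

```python
from typing import Generic, TypeVar


class Base:
    """Stand-in for the model base class the parser accepts."""


class Good(Base):
    """Derives from Base, so it satisfies the bound."""


class Unrelated:
    """Stand-in for a model class that does not derive from Base."""


# The bound restricts T to Base and its subclasses.
T = TypeVar("T", bound=Base)


class Parser(Generic[T]):
    """Hypothetical parser parameterized by the model class it produces."""

    def __init__(self, model: type[T]) -> None:
        self.model = model


# Expected to pass mypy: Good is a subclass of Base.
ok = Parser(model=Good)

# Runs fine at runtime, but mypy is expected to flag this line with a
# [type-var] error because the inferred T (Unrelated) violates the bound.
bad = Parser(model=Unrelated)
```

The bound is only enforced by the type checker; nothing fails at runtime, which matches the report that the code itself still works.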

Error Message and Stack Trace (if applicable)

langchain_bug.py:19: error: Value of type variable "TBaseModel" of "PydanticOutputParser" cannot be "InputSourceResponse"  [type-var]


System Info

langchain==0.1.16
langchain-anthropic==0.1.11
langchain-community==0.0.33
langchain-core==0.1.44
langchain-google-vertexai==1.0.1
langchain-openai==0.0.8
langchain-text-splitters==0.0.1

macOS 14.4.1

Python version: 3.12.2

rpgoldman commented 3 months ago

I'm not sure this is correct, but it might be related to the merge of #18811, in which case the cause is not the update to langchain-core but the update to langchain 0.1.16.

rpgoldman commented 3 months ago

Interestingly, the following code snippet does not cause mypy to report a type error:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

However, the following, which should be semantically identical, gives the same error:

def make_parser() -> PydanticOutputParser:
    parser = PydanticOutputParser(pydantic_object=InputSourceResponse)
    return parser

On the other hand, this does not:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

parser = make_parser()

The following is also OK:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

def doit():
    parser = make_parser()
    return parser

Somehow the assignment from the constructor makes all the difference!
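
One possible explanation, not confirmed in this thread: mypy can use the expected type of an expression when inferring type arguments. A bare `PydanticOutputParser` return annotation is treated as `PydanticOutputParser[Any]`, which may let mypy choose `Any` for the type variable inside a `return` expression, whereas an assignment to an unannotated local variable provides no such context. The sketch below uses hypothetical stand-in classes (`Base`, `Unrelated`, `Parser`) rather than LangChain, and the mypy outcomes in the comments are expectations, not verified output.

```python
from typing import Generic, TypeVar


class Base:
    """Stand-in for the accepted model base class."""


class Unrelated:
    """Stand-in for a model class outside the bound."""


T = TypeVar("T", bound=Base)


class Parser(Generic[T]):
    """Hypothetical generic parser with a bounded type variable."""

    def __init__(self, model: type[T]) -> None:
        self.model = model


def direct_return() -> Parser:
    # The bare return annotation means Parser[Any]; with that context
    # mypy may infer T as Any here and report nothing.
    return Parser(model=Unrelated)


def via_local() -> Parser:
    # No expected type is available for the assignment, so T is inferred
    # from the argument alone and the bound violation is expected to show.
    parser = Parser(model=Unrelated)
    return parser
```

Both functions behave identically at runtime; only the information available to the type checker differs.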

rpgoldman commented 3 months ago

I believe I have found the problem:

import langchain_core.output_parsers.pydantic

print(isinstance(InputSourceResponse, langchain_core.output_parsers.pydantic.PydanticBaseModel))

prints False.
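
A caveat about this check: `isinstance` applied to a class object asks whether the class itself is an instance of the second argument, which is false even for a genuine subclass. `issubclass` is the built-in that compares classes. A minimal sketch with hypothetical stand-in classes (`Base`, `Model`), independent of Pydantic and LangChain:

```python
class Base:
    """Stand-in for a model base class."""


class Model(Base):
    """Stand-in for a concrete model class."""


# A class object is an instance of its metaclass (here `type`), not of Base.
class_is_instance = isinstance(Model, Base)

# issubclass is the check that compares two classes.
class_is_subclass = issubclass(Model, Base)

# isinstance applies to objects created from the class.
object_is_instance = isinstance(Model(), Base)

print(class_is_instance, class_is_subclass, object_is_instance)
# prints: False True True
```

So a `False` result from `isinstance` on the class does not by itself show that the model falls outside the bound.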

Digging further, this is because PYDANTIC_MAJOR_VERSION == 2, even though I am importing my models from langchain_core.pydantic_v1!
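
The name `PYDANTIC_MAJOR_VERSION` suggests a constant that describes the installed `pydantic` distribution, which would be independent of whether models are imported from `langchain_core.pydantic_v1`. How langchain-core actually computes it is not shown in this thread. The sketch below is only an assumption about how such a value can be derived with the standard library, and `installed_major_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_major_version(package: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None
    # "2.7.0" -> 2; assumes a plain dotted version string.
    return int(version.split(".")[0])


# Reports what is installed, regardless of which namespace the models
# are imported from.
print(installed_major_version("pydantic"))
```

If the printed value is 2, Pydantic 2 is installed even when the models themselves are built on the v1 compatibility namespace.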

This suggests that either

I'm not familiar enough with the architecture to know which of these alternatives is correct, but I believe that it was not the intent to remove support for Pydantic v1, was it?