langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

PromptTemplate.input_types is ignored on validation #22668

Open gsimon75 opened 1 month ago

gsimon75 commented 1 month ago

Example Code

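        from typing import Any

        from langchain_core.prompts import PromptTemplate

        # Constructing the template already fails validation (see the trace below)
        document_prompt = PromptTemplate(
            input_variables=["page_content", "metadata"],
            input_types={
                "page_content": str,
                "metadata": dict[str, Any],
            },
            output_parser=None,
            partial_variables={},
            template="{metadata['source']}: {page_content}",
            template_format="f-string",
            validate_template=True,
        )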

Error Message and Stack Trace (if applicable)

  File "/home/fules/src/ChatPDF/streamlitui.py", line 90, in <module>
    main()
  File "/home/fules/src/ChatPDF/streamlitui.py", line 51, in main
    st.session_state["pdfquery"] = PDFQuery(st.session_state["OPENAI_API_KEY"])
  File "/home/fules/src/ChatPDF/pdfquery.py", line 32, in __init__
    document_prompt = PromptTemplate(
  File "/home/fules/src/ChatPDF/_venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for PromptTemplate
__root__
  string indices must be integers (type=type_error)

Description

At this point, input_types is not passed on to check_valid_template, so the type information is lost from there onward, and the validator could not take the types into account even if it tried to.

At this point the validator validate_input_variables tries to resolve the template by assigning the string "foo" to all input variables, and this is where the exception is raised.
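
For illustration, the failure can be reproduced without LangChain at all. The snippet below is only a sketch: it assumes the validator behaves as described above (every input variable replaced by the string "foo" and the template then formatted with str.format). The helper try_format and the sample values are hypothetical names made up for this example, not part of any library.

```python
# Plain-Python sketch; it does not import or call LangChain.
template = "{metadata['source']}: {page_content}"
input_variables = ["page_content", "metadata"]

# Assumption: mirrors the dummy-value check described above,
# where every variable is replaced by the string "foo".
dummy_inputs = {name: "foo" for name in input_variables}


def try_format(tmpl, values):
    """Return the formatted string, or the exception str.format raised."""
    try:
        return tmpl.format(**values)
    except (TypeError, KeyError) as exc:
        return exc


# Indexing the str "foo" with a non-integer key raises TypeError,
# which matches the kind of error shown in the stack trace.
dummy_result = try_format(template, dummy_inputs)
print(type(dummy_result).__name__)

# With a real dict, str.format's unquoted index syntax resolves the key.
real_inputs = {"metadata": {"source": "a.pdf"}, "page_content": "hello"}
unquoted_result = try_format("{metadata[source]}: {page_content}", real_inputs)
print(unquoted_result)

# The quoted form looks up the key "'source'" (quotes included), so it
# raises KeyError even when metadata really is a dict.
quoted_result = try_format(template, real_inputs)
print(type(quoted_result).__name__)
```

Note that str.format takes index keys literally, so {metadata[source]} is the form that reads the "source" key from a dict. With only "foo" dummy values, though, either spelling would still fail validation, because a string cannot be indexed by a string key.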

The documentation of PromptTemplate.input_types states that

A dictionary of the types of the variables the prompt template expects. If not provided, all variables are assumed to be strings.

If this behaviour (input_types is ignored and all variables are always assumed to be strings) is the intended one, then it might be good to reflect this in the documentation too.

System Info

$ pip freeze | grep langchain
langchain==0.2.2
langchain-community==0.2.3
langchain-core==0.2.4
langchain-openai==0.1.8
langchain-text-splitters==0.2.1

$ uname -a
Linux Lya 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ python --version
Python 3.10.6

spike-spiegel-21 commented 1 week ago

input_types is a member inherited from the BasePromptTemplate class, which is why its docstring is visible in the documentation. Is there any way we can exclude it from the Sphinx documentation generation? @eyurtsev