langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.5k stars, 15.28k forks

1 validation error for FewShotChatMessagePromptTemplate input_variables field required (type=value_error.missing) #24108

Closed. krisnagita closed this issue 3 months ago

krisnagita commented 3 months ago

Example Code

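from datetime import date

from langchain.chains import create_sql_query_chain
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

# ChatGPT() and Gemini() are not shown in the issue; presumably the poster's own model wrappers.
def chat_to_sql_validator(input_prompt, chat_history, database, model_type):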

    print(f"Database used: {database.dialect}")
    print(f"Usable table: {database.get_usable_table_names()[0]}\n\n")

    if model_type == "gpt-4o":
        model = ChatGPT()

    elif model_type == "gemini-pro":
        model = Gemini()

    toolkit = SQLDatabaseToolkit(db = database,
                                 llm = model,
                                 agent_type = "tool-calling",
                                 verbose = False)

    snippet_data = toolkit.get_context()["table_info"]

    current_date = date.today().strftime("%Y-%m-%d")

    examples = [
        {
            "input": "Example 1",
            "output": "Example 1"
        },
        {
            "input": "Example 2",
            "output": "Example 2"
        },
        {
            "input": "Example 3",
            "output": "Example 3"
        }
    ]

    system = """
    You are a {dialect} expert. Given a human chat history and question, your task is to create a syntactically correct {dialect} query to run. 
    Unless the user specifies in the question a specific number of examples to obtain, query for at most {top_k} results using the LIMIT clause as per {dialect}. 
    You can order the results to return the most informative data in the database.
    Never query for all columns from a table. You must query only the columns that are needed to answer the question.
    Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
    Pay attention to use date('now') function to get the current date, if the question involves "today".

    Only use the following tables:
    {table_info}

    There are some rules defined by a human for generating a syntactically correct {dialect} query:
    1. Text Search: 
        1.1. For partial matches in the following columns, use the LIKE operator:
            1.1.1. Branch Name
            1.1.2. Store Name
            1.1.3. Product Name
            1.1.4. Principal Name
            1.1.5. Product Type
            1.1.6. Product Brand Name
            1.1.7. Product Division Name
            1.1.8. Product Department Name
            1.1.9. Product Category Name
        1.2. Ensure case-insensitive search using the UPPER function:
               UPPER(column_name) LIKE UPPER('%search_term%')
        1.3. Replace spaces in the search term with '%' for wildcard matching.

    2. Counting Distinct Values:
        2.1. Use the COUNT DISTINCT function to calculate the total number of unique values in the following columns:
            2.1.1. Branch Code or Branch Name
            2.1.2. Store Code or Store Name
            2.1.3. Product Code or Product Name
            2.1.4. Principal Code or Principal Name
            2.1.5. Product Type
            2.1.6. Product Brand Name
            2.1.7. Product Division Name
            2.1.8. Product Department Name
            2.1.9. Product Category Name

    3. Summing Values:
        3.1. Transactions and Sales Gross must use the SUM function
        3.2. Quantity Purchase Order and Amount Purchase Order must use the SUM function

    4. Data Aggregation:
        4.1. Perform appropriate data aggregation based on the user's question. This could include:
           4.1.1. SUM: To calculate the total value.
           4.1.2. AVG: To calculate the average value.
           4.1.3. MIN, MAX: To find the minimum or maximum value.

    5. Sorting:
        If the result is a list, sort the values according to the user's question. 
        Specify the column and sorting order (ASC for ascending, DESC for descending).

    6. Data Structure Awareness:
        Understand that 'Branch' and 'Store' are not equivalent entities within the data. 
        This means that queries should be structured to differentiate between these entities when necessary.

    Following the human-defined rules, write a draft query. Then double check the {dialect} query for common mistakes, including:
    - Always make column data from datetime or date cast into string
    - Not using GROUP BY for aggregating data
    - Using NOT IN with NULL values
    - Using UNION when UNION ALL should have been used
    - Using BETWEEN for exclusive ranges
    - Data type mismatch in predicates
    - Properly quoting identifiers
    - Using the correct number of arguments for functions
    - Casting to the correct data type
    - Using the proper columns for joins

    You must use this format:
    First draft: <<FIRST_DRAFT_QUERY>>
    Final answer: <<FINAL_ANSWER_QUERY>>

    Here is the history of questions from the human, to help you understand the context:
    {chat_history}

    Here is the snippet of the data to help you understand more about the table:
    {snippet_data}

    Here is the current date if asked about today, the format date is in YYYY-MM-DD:
    {current_date}

    Your query answer must align with the question from the human. If the question asks for 10 rows, show 10 rows.
    """

    try:
        example_prompt = ChatPromptTemplate.from_messages(
            [
                ("human", "{input}"),
                ("ai", "{output}"),
            ]
        )

        few_shot_prompt = FewShotChatMessagePromptTemplate(example_prompt = example_prompt, examples = examples)

        prompt = ChatPromptTemplate.from_messages([("system", system), few_shot_prompt, ("human", "{input}")]).partial(dialect=database.dialect, chat_history=chat_history, snippet_data=snippet_data, current_date=current_date)

        chain = create_sql_query_chain(model, database, prompt=prompt, k=10)
        output_query = chain.invoke({"question": input_prompt})

    except Exception as Error:
        output_query = ""
        prompt = ""

    output_prompt = {
        "output_prompt" : prompt,
        "output_script" : output_query,
    }

    return output_prompt

Error Message and Stack Trace (if applicable)

The error message was:

File "/workspace/main.py", line 359, in chat_to_sql_validator
    few_shot_prompt = FewShotChatMessagePromptTemplate(example_prompt=ChatPromptTemplate.from_messages([("human", "{input}"), ("ai", "{output}")]), examples = examples)
  File "/layers/google.python.pip/pip/lib/python3.8/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for FewShotChatMessagePromptTemplate
input_variables
  field required (type=value_error.missing)

System Info

functions-framework==3.*
google-cloud-aiplatform
google-cloud-storage
google-cloud-bigquery-storage
google-api-python-client
google-auth
langchain
langchain-openai
langchain-community
langchain-google-vertexai
langchain-openai
tiktoken
nest-asyncio
bs4
faiss-cpu
langchain_experimental
tabulate
pandas-gbq
sqlalchemy
sqlalchemy-bigquery
flask

I've also tried different versions of langchain, and those didn't work either.

Selroy46 commented 3 months ago

I've also tried different versions of langchain, and those didn't work either.

Did you try langchain 0.1.*? I had the same problem today when I rebuilt my project, because langchain~=0.1.20 in my requirements.txt installed langchain==0.2.7, which was not compatible.
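
To confirm which version a rebuild actually installed, a small check like the one below may help. It is only a sketch: `parse_release`, `satisfies_compatible_release`, and `installed_version` are hypothetical helper names, and the comparison only approximates the `~=` operator for plain `X.Y.Z` versions (no pre-release or post-release suffixes). Under the standard specifier rules, `~=0.1.20` should match only 0.1.x releases, so a 0.2.x install suggests the pin was not the one applied.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(text: str) -> tuple:
    """Turn '0.2.7' into (0, 2, 7). Only plain numeric versions are supported."""
    return tuple(int(part) for part in text.split("."))


def satisfies_compatible_release(installed: str, pinned: str) -> bool:
    """Approximate `~=pinned` for plain X.Y.Z versions."""
    got, want = parse_release(installed), parse_release(pinned)
    # `~=0.1.20` means: at least 0.1.20, with the same leading components (0.1).
    return got >= want and got[: len(want) - 1] == want[:-1]


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(satisfies_compatible_release("0.1.20", "0.1.20"))  # True
print(satisfies_compatible_release("0.2.7", "0.1.20"))   # False
print(installed_version("langchain"))  # depends on the environment
```

Running `installed_version("langchain")` inside the deployed environment shows what the build resolved to, which can then be compared against the pin in requirements.txt.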

krisnagita commented 3 months ago

I tried your solution and it worked for me. Thank you @Selroy46

siddhantoon commented 3 months ago

I have 0.2.18 installed. I ran the example in the documentation here and I am getting the same error:

ValidationError                           Traceback (most recent call last)
Cell In[11], line 15
      6 examples = [
      7     {"input": "2+2", "output": "4"},
      8     {"input": "2+3", "output": "5"},
      9 ]
     11 example_prompt = ChatPromptTemplate.from_messages(
     12     [('human', '{input}'), ('ai', '{output}')]
     13 )
---> 15 few_shot_prompt = FewShotChatMessagePromptTemplate(
     16     examples=examples,
     17     # This is a prompt template used to format each individual example.
     18     example_prompt=example_prompt,
     19 )
     21 final_prompt = ChatPromptTemplate.from_messages(
     22     [
     23         ('system', 'You are a helpful AI Assistant'),
   (...)
     26     ]
     27 )
     28 final_prompt.format(input="What is 4+4?")

File c:\ProgramData\Anaconda3\envs\grag\Lib\site-packages\pydantic\v1\main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:
    343     object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for FewShotChatMessagePromptTemplate
input_variables
  field required (type=value_error.missing)

AndyEverything commented 3 months ago

I also ran the example from the documentation and got the same error. I'm not sure whether it is a problem with LangChain or an error in the documentation.

langchain==0.2.7

Minato252 commented 3 months ago

I encountered the same error but resolved it by adding the input_variables=[] parameter in FewShotChatMessagePromptTemplate().

from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

examples = [
    {"input": "2+2", "output": "4"},
    {"input": "2+3", "output": "5"},
]

example_prompt = ChatPromptTemplate.from_messages(
    [('human', '{input}'), ('ai', '{output}')]
)

few_shot_prompt = FewShotChatMessagePromptTemplate(
    examples=examples,
    # This is a prompt template used to format each individual example.
    example_prompt=example_prompt,
    input_variables=[]
)

print(few_shot_prompt.invoke({}).to_messages())

krisnagita commented 3 months ago

I tried adding input_variables=[] in my case, and it works. My script now looks like this:

few_shot_prompt = FewShotChatMessagePromptTemplate(example_prompt = example_prompt, examples = examples, input_variables=[])
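
For code that has to run against both affected and unaffected langchain versions, one possible pattern is to try the constructor as documented and retry with `input_variables=[]` only when validation fails. The sketch below is not from the thread: `build_few_shot_prompt` is a hypothetical helper, and `StrictTemplate` is a stand-in class that mimics the failure so the example runs without langchain installed. It assumes the validation error is a `ValueError` subclass, which I believe holds for pydantic's `ValidationError`.

```python
def build_few_shot_prompt(factory, **kwargs):
    """Call factory(**kwargs); on a validation failure, retry with input_variables=[]."""
    try:
        return factory(**kwargs)
    except ValueError:
        # Assumption: the "field required" error is raised as a ValueError subclass.
        kwargs.setdefault("input_variables", [])
        return factory(**kwargs)


class StrictTemplate:
    """Stand-in that, like the affected versions, insists on input_variables."""

    def __init__(self, examples, example_prompt, input_variables=None):
        if input_variables is None:
            raise ValueError("input_variables\n  field required")
        self.examples = examples
        self.example_prompt = example_prompt
        self.input_variables = input_variables


template = build_few_shot_prompt(
    StrictTemplate,
    examples=[{"input": "2+2", "output": "4"}],
    example_prompt="placeholder prompt",
)
print(template.input_variables)  # []
```

With langchain installed, `FewShotChatMessagePromptTemplate` would be passed as `factory` in place of the stand-in. Passing `input_variables=[]` directly, as in the comments above, is simpler when only one version needs to be supported.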