langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Encountering KeyError: {'format_instructions'} while using StructuredOutputParser #15801

Closed plaban1981 closed 4 months ago

plaban1981 commented 9 months ago

System Info

langchain==0.0.350 python==3.9.2rc1

Who can help?

@agola11

Sample code

import os

from langchain.llms import LlamaCpp
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
from langchain.prompts import PromptTemplate

response_schemas = [
    ResponseSchema(name="result", description="answer to the user's question"),
    ResponseSchema(
        name="source_documents",
        description="source used to answer the user's question",
    ),
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
#
format_instructions = output_parser.get_format_instructions()
#
llms = LlamaCpp(
    streaming=True,
    model_path=r"C:\Users\PLNAYAK\Documents\Local_LLM_Inference\zephyr-7b-alpha.Q4_K_M.gguf",
    max_tokens=500,
    temperature=0.75,
    top_p=1,
    model_kwargs={"gpu_layers": 0, "stream": True},
    verbose=True,
    n_threads=int(os.cpu_count() / 2),
    n_ctx=4096,
)
#
prompt = PromptTemplate(
    template="Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\nHelpful Answer:",
    input_variables=["context","question"],
    partial_variables={"format_instructions": format_instructions},
    output_parser=output_parser
)
#
chain = prompt | llms | output_parser
# query and complete_context are defined earlier in the notebook
chain.invoke({"question": query, "context": complete_context})

Error Log

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[41], line 1
----> 1 chain.invoke({"question": query, "context": complete_context})

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\runnables\base.py:1514, in RunnableSequence.invoke(self, input, config)
   1512 try:
   1513     for i, step in enumerate(self.steps):
-> 1514         input = step.invoke(
   1515             input,
   1516             # mark each step as a child run
   1517             patch_config(
   1518                 config, callbacks=run_manager.get_child(f"seq:step:{i+1}")
   1519             ),
   1520         )
   1521     # finish the root run
   1522 except BaseException as e:

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\prompts\base.py:94, in BasePromptTemplate.invoke(self, input, config)
     91 def invoke(
     92     self, input: Dict, config: Optional[RunnableConfig] = None
     93 ) -> PromptValue:
---> 94     return self._call_with_config(
     95         self._format_prompt_with_error_handling,
     96         input,
     97         config,
     98         run_type="prompt",
     99     )

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\runnables\base.py:886, in Runnable._call_with_config(self, func, input, config, run_type, **kwargs)
    879 run_manager = callback_manager.on_chain_start(
    880     dumpd(self),
    881     input,
    882     run_type=run_type,
    883     name=config.get("run_name"),
    884 )
    885 try:
--> 886     output = call_func_with_variable_args(
    887         func, input, config, run_manager, **kwargs
    888     )
    889 except BaseException as e:
    890     run_manager.on_chain_error(e)

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\runnables\config.py:308, in call_func_with_variable_args(func, input, config, run_manager, **kwargs)
    306 if run_manager is not None and accepts_run_manager(func):
    307     kwargs["run_manager"] = run_manager
--> 308 return func(input, **kwargs)

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\prompts\base.py:89, in BasePromptTemplate._format_prompt_with_error_handling(self, inner_input)
     83 except KeyError as e:
     84     raise KeyError(
     85         f"Input to {self.__class__.__name__} is missing variable {e}. "
     86         f" Expected: {self.input_variables}"
     87         f" Received: {list(inner_input.keys())}"
     88     ) from e
---> 89 return self.format_prompt(**input_dict)

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\prompts\string.py:161, in StringPromptTemplate.format_prompt(self, **kwargs)
    159 def format_prompt(self, **kwargs: Any) -> PromptValue:
    160     """Create Chat Messages."""
--> 161     return StringPromptValue(text=self.format(**kwargs))

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\prompts\prompt.py:132, in PromptTemplate.format(self, **kwargs)
    117 """Format the prompt with the inputs.
    118
    119 Args:
   (...)
    129     prompt.format(variable1="foo")
    130 """
    131 kwargs = self._merge_partial_and_user_variables(**kwargs)
--> 132 return DEFAULT_FORMATTER_MAPPING[self.template_format](self.template, **kwargs)

File C:\Program Files\Python39\lib\string.py:161, in Formatter.format(self, format_string, *args, **kwargs)
    160 def format(self, format_string, /, *args, **kwargs):
--> 161     return self.vformat(format_string, args, kwargs)

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\utils\formatting.py:29, in StrictFormatter.vformat(self, format_string, args, kwargs)
     24 if len(args) > 0:
     25     raise ValueError(
     26         "No arguments should be provided, "
     27         "everything should be passed as keyword arguments."
     28     )
---> 29 return super().vformat(format_string, args, kwargs)

File C:\Program Files\Python39\lib\string.py:166, in Formatter.vformat(self, format_string, args, kwargs)
    164 used_args = set()
    165 result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
--> 166 self.check_unused_args(used_args, args, kwargs)
    167 return result

File c:\Users\PLNAYAK\Documents\Local_LLM_Inference\llms\lib\site-packages\langchain_core\utils\formatting.py:18, in StrictFormatter.check_unused_args(self, used_args, args, kwargs)
     16 extra = set(kwargs).difference(used_args)
     17 if extra:
---> 18     raise KeyError(extra)

KeyError: {'format_instructions'}

Reproduction

The sample code above (unchanged) reproduces the error.

Expected behavior

The chain should return the answer in structured form, i.e. a dict with the result and source_documents keys defined by the response schemas.
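
For reference, a sketch of the expected final output (illustrative values; the real ones depend on the context and question passed in):

# StructuredOutputParser asks the model for a fenced JSON block matching the
# two response schemas and parses it into a plain dict like this:
{
    "result": "The answer extracted from the provided context.",
    "source_documents": "The document(s) the answer was drawn from.",
}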

dosubot[bot] commented 9 months ago

🤖

Hey @plaban1981! Good to see you back. Hope you're doing well! Let's dive into this issue you're experiencing.

Based on the code you've provided, the problem is not with get_format_instructions() itself: it returns a string, and passing it under the "format_instructions" key in partial_variables is a perfectly valid way to pre-fill a template variable.

The actual cause of the KeyError is that your template string never contains a {format_instructions} placeholder. When the prompt is formatted, PromptTemplate merges partial_variables with your runtime inputs and passes everything through LangChain's StrictFormatter, which, unlike Python's default string.Formatter, raises a KeyError for any variable that is supplied but never used by the template. That is exactly the KeyError: {'format_instructions'} at the bottom of your traceback.
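
You can reproduce the mechanism in isolation with just the standard library. This trimmed-down sketch mirrors the check_unused_args override shown in your traceback (the real StrictFormatter also rejects positional arguments):

from string import Formatter

class StrictFormatter(Formatter):
    # Same idea as langchain_core.utils.formatting.StrictFormatter:
    # any keyword argument the template never consumes raises a KeyError.
    def check_unused_args(self, used_args, args, kwargs):
        extra = set(kwargs).difference(used_args)
        if extra:
            raise KeyError(extra)

formatter = StrictFormatter()
formatter.format(
    "Question: {question}",
    question="What is LangChain?",
    format_instructions="...",  # supplied but never used by the template
)
# -> KeyError: {'format_instructions'}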

To resolve this, make sure every key in partial_variables actually appears as a placeholder in the template. Since you want the format instructions to reach the model (otherwise it has no reason to answer in JSON at all), add a {format_instructions} placeholder to the template and keep the partial_variables entry as it is. Here's an example:

prompt = PromptTemplate(
    template="Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\nHelpful Answer: {format_instructions}",
    input_variables=["context","question"],
    partial_variables={"format_instructions": format_instructions},
    output_parser=output_parser
)

In this example, the template includes a placeholder for format_instructions, and the partial_variables dictionary includes a key-value pair for format_instructions.
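
Going the other way is also possible. If you decide the format instructions don't belong in this prompt, the sketch below drops both the unused partial variable and the structured parser, since without the instructions the model is unlikely to emit the JSON that StructuredOutputParser expects:

from langchain.schema import StrOutputParser

prompt = PromptTemplate(
    template=(
        "Use the following pieces of context to answer the question at the end. "
        "If you don't know the answer, just say that you don't know, "
        "don't try to make up an answer.\n\n"
        "{context}\n\nQuestion: {question}\nHelpful Answer:"
    ),
    input_variables=["context", "question"],
)
chain = prompt | llms | StrOutputParser()  # plain-text answer, no JSON parsing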

Please note that this is a general suggestion and might need to be adjusted based on your specific use case and the structure of your PromptTemplate and StructuredOutputParser.

For more information about the PromptTemplate and StructuredOutputParser classes, you can refer to the LangChain documentation and the source code.

I hope this helps! If you have any further questions or if something is unclear, please let me know.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

okobboko commented 7 months ago

I'm hitting the same kind of error: "Got invalid JSON object. Error"
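
That message comes from StructuredOutputParser itself: the model's reply wasn't the fenced JSON block the format instructions ask for, which is what happens, for instance, when the instructions never make it into the prompt as described above. One mitigation, sketched here with the variable names from this thread, is to wrap the parser in OutputFixingParser, which sends unparseable output back through an LLM for repair before parsing again:

from langchain.output_parsers import OutputFixingParser

# On a parse failure, OutputFixingParser asks the given LLM to rewrite the
# malformed output to match the expected schema, then retries the parse.
fixing_parser = OutputFixingParser.from_llm(parser=output_parser, llm=llms)

chain = prompt | llms | fixing_parser
chain.invoke({"question": query, "context": complete_context})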