langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

MRKL agent is passing "Observation" text to tools when using non-OpenAI LLMs and breaks structured input tools #12645

Closed thoraxe closed 5 months ago

thoraxe commented 1 year ago

System Info

Python 3.9.16
Langchain 0.0.326
Linux (Fedora 37)

Who can help?

No response

Information

Related Components

Reproduction

from langchain.utilities import SerpAPIWrapper
from langchain.agents import AgentType, initialize_agent
from langchain.tools import Tool

import langchain
langchain.debug = True

from langchain.llms import OpenAI
llm = OpenAI(temperature=0, verbose=True)

search = SerpAPIWrapper()

events_tool = Tool(
    name="events_tool_serp",
    description="A tool to look up current events",
    func=search.run,
)
tools = [events_tool]

agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

agent.run("What happened yesterday?")

When the above code is run with OpenAI, things work OK. However, when swapped for a different LLM endpoint:

# using IBM watsonx.ai
from model_context import get_watsonx_predictor
llm = get_watsonx_predictor(
    model="codellama/codellama-34b-instruct", min_new_tokens=5, verbose=True
)

Odd things happen:

[llm/start] [1:chain:AgentExecutor > 2:chain:LLMChain > 3:llm:LangChainInterface] Entering LLM run with input:
{
  "prompts": [
    "Answer the following questions as best you can. You have access to the following tools:\n\nevents_tool_serp: A tool to look up current events\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [events_tool_serp]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: What happened yesterday?\nThought:"
  ]
}
[llm/end] [1:chain:AgentExecutor > 2:chain:LLMChain > 3:llm:LangChainInterface] [2.65s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": " I should look up current events\nAction: events_tool_serp\nAction Input: yesterday\nObservation:",
        "generation_info": {
          "generated_token_count": 25,
          "input_token_count": 160,
          "stop_reason": "STOP_SEQUENCE",
          "stop_sequence": "\nObservation:",
          "input_text": "Answer the following questions as best you can. You have access to the following tools:\n\nevents_tool_serp: A tool to look up current events\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [events_tool_serp]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: What happened yesterday?\nThought:"
        },
        "type": "Generation"
      }
    ]
  ],
  "llm_output": {
    "token_usage": {
      "generated_token_count": 25,
      "input_token_count": 160
    }
  },
  "run": null
}
[chain/end] [1:chain:AgentExecutor > 2:chain:LLMChain] [2.65s] Exiting Chain run with output:
{
  "text": " I should look up current events\nAction: events_tool_serp\nAction Input: yesterday\nObservation:"
}
[tool/start] [1:chain:AgentExecutor > 4:tool:events_tool_serp] Entering Tool run with input:
"yesterday
Observation:"

The Observation: text is being passed into the tool. For simple tools like search engines, this often is not a problem. However, when your tool expects a structured input, you sometimes end up with tool input like the following:

"foo,bar
Observation:"

The tool then attempts something like splitting on ",", and the second element of the resulting array becomes:

bar
Observation:

Which definitely breaks the tool.
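
For illustration, here is a minimal standalone sketch of how that trailing text corrupts a structured input. The comma-splitting tool below is hypothetical (it is not the SerpAPI tool from the reproducer):

def lookup_flights(tool_input: str) -> str:
    # Hypothetical structured-input tool that expects "origin,destination".
    origin, destination = tool_input.split(",", 1)
    return f"searching flights from {origin!r} to {destination!r}"

# What the agent actually hands the tool when the LLM echoes the stop sequence:
print(lookup_flights("foo,bar\nObservation:"))
# -> searching flights from 'foo' to 'bar\nObservation:'
#    the second field now carries the stray Observation: text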

Expected behavior

The Observation: text should definitely not be passed as part of the tool input.

The MRKL output parser is here: https://github.com/langchain-ai/langchain/blob/v0.0.327/libs/langchain/langchain/agents/mrkl/output_parser.py#L26-L77

The regex does appear to match the groups correctly: https://regex101.com/r/v7Tg1n/1

It's unclear, though, why the Observation: text ends up being passed into the input.

However, when the string contains literal \n escape sequences rather than actual newlines, the Observation part does get caught in the capture group: https://regex101.com/r/MMCvYn/1

I tried to spelunk into the code to figure out where the agent "base" calls parse, to see why the Observation text is passed into parse in the first place. It seems one possible fix would be to modify the MRKL parse regex so that it does NOT capture the Observation component.

Changing the RegEx in the MRKL agent to the following:

            r"Action\s*\d*\s*:[\s]*(.*?)[\s]*Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)Observation:"

And then modifying the tool input strip:

        if action_match:
            action = action_match.group(1).strip()
            action_input = action_match.group(2)
            tool_input = action_input.strip()

This fixes the issue when "Observation" gets passed in, but I assume parse shouldn't have received it in the first place.
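
As a quick standalone sanity check (a sketch only, not a patch against the actual parser code), the modified regex does drop the trailing Observation: from the captured Action Input for output like the trace above:

import re

# Sketch of the proposed regex, tested against the LLM output from the debug
# trace above; this is not the parser's real surrounding code.
regex = (
    r"Action\s*\d*\s*:[\s]*(.*?)[\s]*"
    r"Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)Observation:"
)
llm_output = (
    " I should look up current events\n"
    "Action: events_tool_serp\n"
    "Action Input: yesterday\n"
    "Observation:"
)
action_match = re.search(regex, llm_output, re.DOTALL)
if action_match:
    action = action_match.group(1).strip()
    tool_input = action_match.group(2).strip()
    print(repr(action))      # 'events_tool_serp'
    print(repr(tool_input))  # 'yesterday' -- no trailing "Observation:"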

dosubot[bot] commented 1 year ago

🤖

Hi @thoraxe,

It's good to see you again and thank you for the detailed issue report.

Based on the context provided, it seems like the issue you're experiencing is related to the way the parse function in the MRKL agent is handling the output text from the language model. The parse function is responsible for interpreting the output from the language model and is called whenever the language model produces an output.

The issue seems to arise when the "Observation" text is included in the tool input. This is likely due to the way the HUMAN_MESSAGE is formatted in the prompt.py file. The HUMAN_MESSAGE is currently set to include both the user's input and the agent's scratchpad, which includes the "Observation" text.

To resolve this issue, you can modify the HUMAN_MESSAGE to only include the user's input. This can be done by removing the {agent_scratchpad} from the HUMAN_MESSAGE. Here's how you can do it:

HUMAN_MESSAGE = "{input}"

This change will ensure that only the user's input is passed as part of the tool input when using non-OpenAI LLMs in the LangChain framework.

In addition, there have been similar issues in the past where the regex used in the agent was not handling multiline Action Inputs correctly. The solution was to switch the regex used in the conversational agent to the regex used in the mrkl agent. You can find more details about this in the following issues:

  1. Stop logic should be optimezed to be compatible with "Conversation 1:"
  2. conversational/agent has a regex not good for multiline Action Inputs coming from the LLM

I hope this helps! If you have any other questions or need further clarification, please don't hesitate to ask.

Best, Dosu

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

thoraxe commented 1 year ago

@dosu-bot Unfortunately, your first suggestion doesn't quite work.

The MRKL agent doesn't use HUMAN_MESSAGE: https://github.com/langchain-ai/langchain/blob/v0.0.327/libs/langchain/langchain/agents/mrkl/prompt.py

HUMAN_MESSAGE is explicitly part of the chat agent: https://github.com/langchain-ai/langchain/blob/v0.0.327/libs/langchain/langchain/agents/chat/prompt.py#L30

Changing the HUMAN_MESSAGE in the reproducer I posted in the OP did not change the behavior -- the Observation is still passed to the tool.

Your second suggestion doesn't make sense:

The solution was to switch the regex used in the conversational agent to the regex used in the mrkl agent.

But I am using the MRKL agent and not the conversational agent. And the MRKL agent regex is what's broken.

dosubot[bot] commented 1 year ago

🤖

Based on the information you've provided and the context I've found, it seems like the "Observation" text is being passed to tools when using non-OpenAI LLMs in the LangChain Python framework version 0.0.326 because of the way the _generate and _agenerate methods are implemented in the PromptLayerOpenAI and PromptLayerOpenAIChat classes.

In these methods, the promptlayer_api_request and promptlayer_api_request_async functions are called with the prompts list as an argument. This list contains the "Observation" text.

To resolve this issue, you can modify the _generate and _agenerate methods to exclude the "Observation" text when calling the promptlayer_api_request and promptlayer_api_request_async functions.

Here is an example of how you can modify the _generate method in the PromptLayerOpenAI class:

def _generate(
    self,
    prompts: List[str],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> LLMResult:
    from promptlayer.utils import get_api_key, promptlayer_api_request
    request_start_time = datetime.datetime.now().timestamp()
    generated_responses = super()._generate(prompts, stop, run_manager)
    request_end_time = datetime.datetime.now().timestamp()
    for i in range(len(prompts)):
        prompt = prompts[i]
        generation = generated_responses.generations[i][0]
        resp = {
            "text": generation.text,
            "llm_output": generated_responses.llm_output,
        }
        params = {**self._identifying_params, **kwargs}
        # Exclude the "Observation" text from the prompts list
        pl_request_id = promptlayer_api_request(
            "langchain.PromptLayerOpenAI",
            "langchain",
            [prompt.replace("Observation: ", "")],
            params,
            self.pl_tags,
            resp,
            request_start_time,
            request_end_time,
            get_api_key(),
            return_pl_id=self.return_pl_id,
        )
        if self.return_pl_id:
            if generation.generation_info is None or not isinstance(
                generation.generation_info, dict
            ):
                generation.generation_info = {}
            generation.generation_info["pl_request_id"] = pl_request_id
    return generated_responses

In this modification, the "Observation: " string is removed from the prompt before it is passed to the promptlayer_api_request function. You can apply a similar modification to the _agenerate method and to the PromptLayerOpenAIChat class.

You can find the original implementation of these methods in the libs/langchain/langchain/llms/promptlayer_openai.py file in the LangChain repository.

I also found a similar issue in the LangChain repository: LLM making its own observation when a tool should be used. This issue was closed by a pull request: Strip trailing whitespaces in agent's stop sequences. You might find the discussion and solution in this issue and pull request helpful.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources



thoraxe commented 1 year ago

@dosu-bot

I am not using OpenAI, so promptlayer_openai.py is not relevant. In fact, I'm doing the opposite - using non-OpenAI LLMs.

dosubot[bot] commented 1 year ago

🤖

Based on the information you've provided, it seems like the issue you're experiencing is related to the observation_prefix property in the StructuredChatAgent class. This property is set to return "Observation: " as a string, which is likely used to prepend the observation text before it is passed to the tools.

Here's the relevant code snippet:

@property
def observation_prefix(self) -> str:
    """Prefix to append the observation with."""
    return "Observation: "

You can find this in the libs/langchain/langchain/agents/structured_chat/base.py file.

To resolve this issue, you could override this property in a subclass or modify the StructuredChatAgent class directly to return an empty string or a different prefix, depending on your requirements. Here's an example of how you can override this property in a subclass:

class CustomStructuredChatAgent(StructuredChatAgent):
    @property
    def observation_prefix(self) -> str:
        """Prefix to append the observation with."""
        return ""  # return an empty string or a different prefix

Please note that this change might affect other parts of your code that rely on this prefix. Make sure to test your application thoroughly after making this change.

Without additional context or usage examples, it's hard to provide a more detailed explanation of how the observation_prefix property is used in the LangChain codebase. If you have more information or specific examples, it would be helpful in providing a more accurate solution.

Sources



imfht commented 10 months ago

@dosu-bot

I am not using OpenAI, so promptlayer_openai.py is not relevant. In fact, I'm doing the opposite - using non-OpenAI LLMs.

I ran into the same problem when using a non-OpenAI model. Have you found a solution yet? Any information would be appreciated!

deanchanter commented 8 months ago

I came across this today while beating my head against the wall on a similar issue trying to build a ReAct agent. @dosu-bot was close. The problem for me was that the templates assume models will drop the stop sequence during tool-input generation. So, to make my life easier, I changed the prompt template to use a different stop sequence and adjusted my LLM accordingly. If you still want "Observation" printed out in the scratchpad, add it to the tool output like @dosu-bot shared from the old agents.

from typing import Optional

from dotenv import load_dotenv
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import JSONAgentOutputParser
from langchain.memory import ConversationBufferMemory
from langchain.tools.render import render_text_description_and_args
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
from langchain_core.tools import BaseTool

from genai import Client, Credentials
from genai.extensions.langchain import LangChainChatInterface
from genai.schema import TextGenerationParameters

load_dotenv()

# ReActSingleInputOutputParser is what the chain below actually uses
from langchain.agents.output_parsers import ReActSingleInputOutputParser

from langchain_core.prompts import PromptTemplate

prompt_template = """Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.

TOOLS:
------

Assistant has access to the following tools:

{tools}

To use a tool, please use the following format:

Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Run

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

Thought: Do I need to use a tool? No
Final Answer: [your response here]

Begin!

Previous conversation history:
{chat_history}

New input: {input}
{agent_scratchpad}"""

prompt = PromptTemplate.from_template(prompt_template)

class WordLengthTool(BaseTool):
    name = "GetWordLength"
    description = "Returns the length of a word."

    def _run(self, word: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> int:
        return len(word)

tools: list[BaseTool] = [WordLengthTool()]

client = Client(credentials=Credentials.from_env())
llm = LangChainChatInterface(
    client=client,
    model_id='meta-llama/llama-2-70b-chat',
    parameters=TextGenerationParameters(
        max_new_tokens=250, 
        #min_new_tokens=20, 
        #temperature=0, 
        #decoding_method="sample",
        temperature = 0,
        #top_p = .85,
        #top_k = 50,
        #repetition_penalty = 1.05,
        #max_new_tokens = 2048,
        stop_sequences=["\nRun"],
        include_stop_sequence = False
    ),
)
prompt = prompt.partial(
    # get tools with their descriptions and args in plain text
    tools=render_text_description_and_args(list(tools)),
    tool_names=", ".join([t.name for t in tools]),
)

memory = ConversationBufferMemory()

def get_mem(input):
    return memory.buffer_as_str

agent = (
    RunnablePassthrough.assign(
        # format the agent's scratchpad to a string
        agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"]),
        # pass the memory as the chat history
        chat_history=get_mem,
    )
    | prompt
    | llm
    | ReActSingleInputOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True, verbose=True, memory=memory)
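
To exercise it (a hypothetical invocation, not part of the original snippet):

agent_executor.invoke({"input": "How many letters are in the word 'education'?"})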

dosubot[bot] commented 5 months ago

Hi, @thoraxe

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, the issue involves the MRKL agent passing "Observation" text to tools when using non-OpenAI LLMs, causing structured input tools to break. There have been suggestions from users dosubot and deanchanter on potential solutions, but it seems that none of them fully addressed the issue as per my clarification.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, kindly let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

lotrdinesh commented 4 months ago

Adding the below in case it is useful for someone else. I am not sure if it is relevant to the case in point (it seemed like a similar issue).

I had a similar issue while using langchain.agents.create_react_agent. Passing stop_sequence=False worked for me. If you need to avoid hallucinations, you can instead use a different stop_sequence string (the default is "Observation:"; this issue appears when the prompt contains the "\nObservation:" string).

I am not sure why this issue does not come for OpenAI LLMs.
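
For anyone who lands here, a minimal sketch of that workaround. The tool, prompt, and FakeListLLM stand-in below are placeholders; the relevant part is the stop_sequence argument of create_react_agent:

from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.llms.fake import FakeListLLM  # stand-in; swap for your non-OpenAI LLM
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import tool

@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)

tools = [get_word_length]

# Condensed ReAct prompt; any prompt exposing {tools}, {tool_names}, {input}
# and {agent_scratchpad} will do (e.g. hub.pull("hwchase17/react")).
prompt = PromptTemplate.from_template(
    "Answer the question using the tools below.\n\n{tools}\n\n"
    "Use this format:\n"
    "Question: the input question\n"
    "Thought: what to do next\n"
    "Action: one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the result of the action\n"
    "... (repeat Thought/Action/Action Input/Observation as needed)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the answer\n\n"
    "Question: {input}\nThought:{agent_scratchpad}"
)

llm = FakeListLLM(responses=["I can answer directly.\nFinal Answer: 5"])

# stop_sequence=False: do not bind "\nObservation" as a stop token, so a model
# that echoes the stop sequence no longer leaks "Observation:" into the tool
# input; alternatively pass a custom list, e.g. stop_sequence=["\nRun"].
agent = create_react_agent(llm, tools, prompt, stop_sequence=False)
agent_executor = AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True, verbose=True)

print(agent_executor.invoke({"input": "How many letters are in 'hello'?"}))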