langchain-ai / langchain

STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION finishes chain BEFORE using a tool #6637

Closed FerAtTheFringe closed 4 months ago

FerAtTheFringe commented 1 year ago

System Info

I recently switched to this agent after I started getting errors with chat-conversational-react-description (about it not being able to use multi-input tools). I've noticed that it often finishes the chain telling the user that it will make a search / use a tool, but it never does (because the chain has already finished).

Who can help?

@hwchase17 @agola11


Reproduction

This is how the agent is set up:

from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.prompts import MessagesPlaceholder

from agent_tools.comparables_tool import ComparablesTool
from agent_tools.python_repl_tool import PythonREPL

tools = [PythonREPL(), ComparablesTool()]

chat_history = MessagesPlaceholder(variable_name="chat_history")
memory = ConversationBufferMemory(
    memory_key="chat_history", return_messages=True)

gpt = ChatOpenAI(
    temperature=0.2,
    model_name='gpt-3.5-turbo-16k',
    verbose=True
)

conversational_agent = initialize_agent(
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    tools=tools,
    llm=gpt,
    verbose=True,
    max_iterations=10,
    memory=memory,
    agent_kwargs={
        "memory_prompts": [chat_history],
        "input_variables": ["input", "agent_scratchpad", "chat_history"]
    }
)

async def get_response(user_message: str) -> str:
    return await conversational_agent.arun(user_message)

And this is what's on the terminal:

FerAtTheFringe#1080 said: "Hey I need to find apartments in madrid with at least 3 rooms" (general)
> Entering new  chain...
Sure! I can help you find apartments in Madrid with at least 3 rooms. Let me search for some options for you.
> Finished chain.

Expected behavior

FerAtTheFringe#1080 said: "Hey I need to find apartments in madrid with at least 3 rooms" (general)
> Entering new  chain...
{
  "action": "get_comparables",
  "action_input": {
    "latitude": "38.9921979",
    "longitude": "-1.878099",
    "rooms": "5",
    "nresults": "10"
  }
}
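
For context, the structured-chat output parser treats any reply without a fenced JSON action block as a final answer, which is why the chain ends before any tool runs. A minimal sketch of that behavior, assuming langchain 0.0.x's StructuredChatOutputParser:

from langchain.agents.structured_chat.output_parser import StructuredChatOutputParser

parser = StructuredChatOutputParser()

# A plain-text reply (no fenced JSON action block) parses as AgentFinish,
# so the chain stops before any tool runs.
result = parser.parse("Sure! Let me search for some options for you.")
print(type(result).__name__)  # AgentFinish

# A fenced JSON action block parses as AgentAction, so the tool is invoked.
result = parser.parse(
    'Action:\n```\n{"action": "get_comparables", "action_input": {"rooms": "3"}}\n```'
)
print(type(result).__name__)  # AgentAction
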
okradze commented 1 year ago

I have the same issue.

> Entering new  chain...
To list collections for the game "Fortnite", I will use the list_collections_tool. Let's fetch the data.

> Finished chain.

And it never runs the tool.

okradze commented 1 year ago

@FerAtTheFringe Did you find any workaround for this problem?

FerAtTheFringe commented 1 year ago

@okradze No, I didn't. I went back to using chat-conversational-react-description and modified my tool to take just one parameter; then I changed the prompt to tell the agent to pass an object inside a string, so the tool doesn't detect that there is more than one param.

import json
import os

import requests
from langchain.tools import BaseTool

class ComparablesTool(BaseTool):
    name = "get_comparables"
    description = """...
    Action input should always be an object inside a string with the mentioned parameters
    """

    def _run(self, filters_string):
        filters = json.loads(filters_string)

        url = 'URL'
        params = {
            'operation': filters.get('operation', '2'),
            'latitude': filters.get('latitude'),
            'longitude': filters.get('longitude'),
            ...
        }
        headers = {
            'x-api-key': f"{os.getenv('API_KEY')}"
        }
        response = requests.get(url, params=params, headers=headers)
        return response.content
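
For completeness, a minimal sketch of how this single-input tool wires back into the agent (mirroring the setup from the issue body, with only the agent type changed; the rest is assumption):

from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
gpt = ChatOpenAI(temperature=0.2, model_name="gpt-3.5-turbo-16k")

conversational_agent = initialize_agent(
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    tools=[ComparablesTool()],  # single string input, json.loads'd inside _run
    llm=gpt,
    verbose=True,
    memory=memory,
)
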
pretidav commented 1 year ago

Same problem here. The output is JSON, but no tool is used.

Wildhammer commented 11 months ago

I have the exact same problem with the exact same usage of initialize_agent. The agent properly asks follow-up questions to get the required input for the tool, but it never uses the tool once it has collected all the necessary info.

from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain.prompts import MessagesPlaceholder
from langchain.tools import StructuredTool

def find_locations(zipcode: str, type_of_cuisine: str) -> str:
    """Find the nearest restaurants for type_of_cuisine."""
    return ",".join(["Restaurant 1 at Columbia SC", "Restaurant 2 at Columbia SC"])

find_locations_tool = StructuredTool.from_function(find_locations)

tools = [find_locations_tool]

chat_history = MessagesPlaceholder(variable_name="chat_history")
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
agent = initialize_agent(
    tools,
    llm,  # llm is a chat model defined elsewhere, e.g. ChatOpenAI(...)
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
    agent_kwargs={
        "memory_prompts": [chat_history],
        "input_variables": ["input", "agent_scratchpad", "chat_history"]
    }
)

Then, starting the conversation:

agent.run(input="I am looking for restaurants in my area")
# 'Sure, I can help you with that. Could you please provide me with your zipcode?'
agent.run(input="29205")
# 'Great! I can find restaurants in your area. What type of cuisine are you interested in?'
agent.run(input="Italian")
# 'Alright! Let me find the nearest Italian restaurants in your area. Please give me a moment.'

You can see how it asks the right follow-up questions to collect the info for using the tool, but it never actually uses the tool at the end.

If I change the tool to a simple multiply function (from the documentation), it actually does use the tool:

def multiplier(a: float, b: float) -> float:
    """Multiply the provided floats."""
    return a * b

multiplier_tool = StructuredTool.from_function(multiplier)

... 

agent.run(input="I have a math problem")
# "Sure, I'm here to help! What math problem do you need assistance with?"
agent.run(input="I want to multiply two numbers. The first number is 34")
# 'Got it! And what is the second number you would like to multiply with 34?'
agent.run(input="The second number is 35.23")
  Great! To multiply the numbers 34 and 35.23, you can use the multiplier tool. Here's the action to perform the multiplication:
  {
    "action": "multiplier",
    "action_input": {
      "a": 34,
      "b": 35.23
    }
  }
  Observation: 1197.82
  Thought:The result of multiplying 34 and 35.23 is 1197.82. Is there anything else I can assist you with?

Am I missing something? @hwchase17 @agola11

Appreciate your time in advance

torkian commented 11 months ago

Same problem here.

iwm911 commented 11 months ago

Same here :(

anotherbuginthecode commented 11 months ago

I also ran into this problem and adopted a somewhat crude but effective workaround; I hope it will be useful to you.

Here is an example of how I defined my ComparisonTool:

import os

from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.tools import BaseTool
from llama_index import download_loader  # assumed source of download_loader

ELASTICSEARCH_URL = os.getenv("ELASTICSEARCH_URL")  # assumed; configured elsewhere

class ComparisonTool(BaseTool):

    name = "comparison"
    description = ("""\
        Useful when you need to identify the key differences between two or more items.
        The input should be like
        {{
            Action: comparison,
            Action Input: 
                "index": <the index to use to search the documents>, 
                "query": <the full query>, 
                "language": <language of the query>
        }}
        """)

    return_direct = True

    def _run(self,
            query: str,
            index: str,
            language: str = "english"):

        ElasticsearchReader = download_loader("ElasticsearchReader")
        reader = ElasticsearchReader(
            ELASTICSEARCH_URL,
            index,
        )
        query_dict = {"query": {"match": {"text": {"query": query}}}}
        documents = reader.load_data(
            "text", query=query_dict, embedding_field="vector"
        )

        docs_ctx = [f"content:{d.text}\nsource:{d.metadata['metadata']['source']}\n" for d in documents]

        prompt_template = """\
        Your task is to write a list key points and a list of main differences between two or more items.

        Use the context provided between <ctx> tag.

        For each item write a list of key features with pros and cons.

        Based on the previous extractions, compare the two list and return a final list with the main differences between the items.

        Return always the sources where you find the answer using the following format:
        ```filename [the pages where you find the answer]```

        As last task, return only the answer translated in {language}

        CONTEXT:
        <ctx>
        {text}
        <ctx>

        QUESTION: {query}
        ANSWER TRANSLATED IN {language}
        """

        PROMPT = PromptTemplate.from_template(template=prompt_template)

        chat = ChatOpenAI(model='gpt-4', temperature=0.5)

        return chat(PROMPT.format_prompt(text='\n\n'.join(docs_ctx), query=query, language=language).to_messages()).content

    async def _arun(self,
            query: str,
            index: str,
            language: str = "english"):
        raise NotImplementedError("This tool does not support async")

Then I defined my agent, passing PREFIX, FORMAT_INSTRUCTIONS, and SUFFIX to force it to use a specific format that I can parse when the chain exits at the first iteration.

PREFIX = '''Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.'''

FORMAT_INSTRUCTIONS = """To use a tool, please use the following format:

\```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
\```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the following format (the "Thought: " and "AI: " prefixes must be included):

\```
Thought: Do I need to use a tool? No
AI: [your response here]
\```"""

SUFFIX = '''

Begin!

Previous conversation history:
{chat_history}

Instructions: {input}
{agent_scratchpad}
'''

memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=3,
    return_messages=True
)

tools = [ComparisonTool(), Summarization(), QA()]
turbo_llm = ChatOpenAI(model="gpt-4", temperature=0, verbose=True)
conversational_agent = initialize_agent(
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION, 
    tools=tools, 
    llm=turbo_llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    handle_parsing_errors="Check your output and make sure it conforms!",
    memory=memory,
    agent_kwargs={
        'prefix': PREFIX, 
        'format_instructions': FORMAT_INSTRUCTIONS,
        'suffix': SUFFIX,
        "input_variables": [
            "input",
            "agent_scratchpad",
            "chat_history"
        ],
        "verbose": True
    }
)

Here is my "try_catch" function to handle the problem:

import json
import re

def retry_on_fail(agent_output):
    try:
        # this regex will capture the original payload
        payload_pattern = r'Action Input:\s*({[^}]*})'
        # this regex will match the original action
        action_pattern = r'Action:\s*(.+)'

        # Find the match using re.search
        match_payload = re.search(payload_pattern, agent_output)
        match_action = re.search(action_pattern, agent_output)

        # Check if the pattern is found and extract the captured group
        if match_payload and match_action:
            payload = json.loads(match_payload.group(1))
            action = match_action.group(1)
        else:
            return agent_output

        index = payload['index']
        query = payload['query']
        language = payload['language']

        if action == 'summarize':

            return Summarization()._run(
                query=query, 
                index=index,
                language=language)
        elif action == 'comparison':
            return ComparisonTool()._run(
                query=query, 
                index=index,
                language=language)
        else:
            return QA()._run(
                query=query, 
                index=index,
                language=language)

    except Exception as e:
        print(e)
        return agent_output

Execution example:

payload = {
    "index": "chatto-esopo-2",
    "query": "Quale è la differenza tra la favola del 'il corvo e la volpe' e 'il leone e il topo'?"
}
answer = conversational_agent.run(json.dumps(payload))
print(retry_on_fail(answer))

Then the output will be:

> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: comparison
Action Input: {"index": "index-12345", "query": "Quale è la differenza tra la favola del 'il corvo e la volpe' e 'il leone e il topo'?", "language": "italian"}

> Finished chain.

And retry_on_fail(answer) will parse the Action and the Action Input payload to identify the correct function and provide the necessary parameters, giving the expected output:

Main differences between the fables 'Il corvo e la volpe' (The Crow and the Fox) and 'Il leone e il topo' (The Lion and the Mouse):
- In 'Il corvo e la volpe', the crow is tricked by the fox, who praises its feathers and persuades it to sing, making it fall into the trap and lose its piece of meat. In 'Il leone e il topo', the lion spares the mouse's life and the mouse later returns the favor by freeing him from a net.
- 'Il corvo e la volpe' deals with deception and manipulation, while 'Il leone e il topo' deals with gratitude and reward.
- The moral of 'Il corvo e la volpe' is not to be taken in by others' flattery, while the moral of 'Il leone e il topo' is that even small actions can lead to great rewards.
- 'Il corvo e la volpe' is a story about animals, while 'Il leone e il topo' is a story about animals interacting with humans.
- 'Il corvo e la volpe' is a shorter, simpler story, while 'Il leone e il topo' is a longer, more complex one.

Sources:
[1] p. 2 (fable 'Il corvo e la volpe')
[2] pp. 2-3 (fable 'Il leone e il topo')
annieetang commented 10 months ago

I also ran into this problem with the tool not executing... The only workaround I've come up with is falling back to the ZERO_SHOT_REACT_DESCRIPTION agent and using string parsing (from this part of the docs) with a regular Tool class. The structured-chat zero-shot ReAct agent doesn't seem reliable enough yet: sometimes it would execute the tools, but most of the time it wouldn't.
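
A minimal sketch of that fallback, assuming a regular Tool that takes a single JSON string (the tool name and fields here are illustrative, not from the thread):

import json

from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI

def find_locations(tool_input: str) -> str:
    """Expects a JSON string like '{"zipcode": "29205", "cuisine": "Italian"}'."""
    args = json.loads(tool_input)
    return f"Restaurants near {args['zipcode']} serving {args['cuisine']} food"

tools = [
    Tool(
        name="find_locations",
        func=find_locations,
        description=(
            "Find restaurants. Input must be a JSON string with the keys "
            "'zipcode' and 'cuisine'."
        ),
    )
]

agent = initialize_agent(
    tools,
    ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("Find Italian restaurants near zipcode 29205")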

ghost commented 10 months ago

I have the same issue with STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION. Somewhere between ~8 and 12 minutes into an agent execution, the chain just finishes, sometimes halfway through an Observation or Thought. For example:

...

        // Assert that the result is as expected
        assertEquals(responseBody, result, "The result should be the response body");
    }
}
...
Thought:The code has been corrected to create a successful Response object instead of mocking it. Now, I need to save the corrected test class to disk again.

> Finished chain.
Agent Execution took: 818.8932797908783

No errors or exceptions are reported. I have not tried a workaround beyond splitting up the tasks so that agents complete within that ~8-12 minute limit, which is not always entirely successful.

KevKibe commented 8 months ago

Ran into a similar problem, but the agent does use the tool when I name the tool in the prompt, e.g. "From 'x' data (referring to the tool for fetching 'x' data), find out what ..."
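
In other words, something along these lines (a sketch; 'fetch_sales_data' is a hypothetical tool name):

# Naming the tool explicitly in the user message nudges the agent to call it.
agent.run(
    "From the 'sales' data (i.e. the fetch_sales_data tool), "
    "find out which product sold best last month."
)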

JoffreyLemery commented 8 months ago

I have exactly the same issue.

Does anyone have an idea whether STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION will be improved?

Valdanitooooo commented 7 months ago

Same here :(

JoffreyLemery commented 7 months ago

@Valdanitooooo let me know if you find an effective workaround.