[Bug]: Chain-of-Abstraction Failed at very simple request due to parsing in the output_parser

ryann-sportsbet commented 1 month ago

Bug Description

Description:

The Chain-of-Abstraction (CoA) agent fails on a very simple task. When given two basic math tools (addition and multiplication) and asked to perform an initial calculation followed by a multiplication that uses its result, the agent crashes with a JSONDecodeError while parsing its own plan.

Observed Behavior:

The agent successfully outputs a reasonable plan for the task.

First, we need to calculate the result of 4 + 6. This can be done using the add function: [FUNC add(4, 6) = y1]

Next, we need to multiply the result of the first calculation (y1) by 100 to find out how much money you will have. This can be done using the multiply function: [FUNC multiply(y1, 100) = y2]

Therefore, the final result is y2.

It seems like the line inputs = json.loads("[" + inputs + "]") inside output_parser causes the issue:

https://github.com/run-llama/llama_index/blob/a620a2661faabb49ba2f257bff7ae2ac04d0c12b/llama-index-integrations/agent/llama-index-agent-coa/llama_index/agent/coa/output_parser.py

I printed the inputs, and here is what I see:

inputs: 4, 6
inputs: y1, 100

It can parse the first input but not the second one.
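The difference between the two inputs can be reproduced outside the agent. The snippet below is a minimal sketch that only mimics the json.loads call quoted above; try_parse is a hypothetical helper name, not part of llama_index.

```python
import json


def try_parse(inputs: str):
    """Mimic the parser's call: wrap the raw argument string in brackets and load it as JSON."""
    try:
        return json.loads("[" + inputs + "]")
    except json.JSONDecodeError as exc:
        # Return the error text instead of raising, so both cases can be printed side by side.
        return f"JSONDecodeError: {exc}"


print(try_parse("4, 6"))     # [4, 6]
print(try_parse("y1, 100"))  # JSONDecodeError: Expecting value: line 1 column 2 (char 1)
```

The second message matches the last line of the traceback: y1 is a bare name, which is not a valid JSON value.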

Version

0.11.14

Steps to Reproduce

from llama_index.agent.coa import CoAAgentWorker
from llama_index.core.tools import FunctionTool

def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

tools = [
    FunctionTool.from_defaults(add, name='add', description='Add two numbers'),
    FunctionTool.from_defaults(multiply, name='multiply', description='Multiply two numbers'),
]

# llm_haiku is not defined in this report; it is presumably a Claude Haiku LLM instance created elsewhere.
agent_worker = CoAAgentWorker.from_tools(
    tools=tools,
    llm=llm_haiku,
    verbose=True,
    allow_parallel_tool_calls=True,
)

agent = agent_worker.as_agent()
response = agent.chat("what is 4+6 and if I have 100 dollars how much will I have if I multiply it from the result of the first calculation")

Relevant Logs/Tracebacks


  File "/Users/ryann/work/apollo-repo/pd-apollo-poc/.venv/lib/python3.12/site-packages/llama_index/agent/coa/step.py", line 205, in _arun_step
    parsed_response, tool_sources = await self.output_parser.aparse(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ryann/work/apollo-repo/pd-apollo-poc/.venv/lib/python3.12/site-packages/llama_index/agent/coa/output_parser.py", line 46, in aparse
    inputs = json.loads("[" + inputs + "]")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

dosubot[bot] commented 1 month ago

The ChainOfAbstractionParser fails to parse the second input in the output_parser.py file, resulting in a JSONDecodeError because the inputs string extracted from the function calls is not properly formatted as a JSON array. The code attempts to load the inputs using json.loads("[" + inputs + "]"), which can fail if inputs is not a valid JSON string.

Here is the relevant code snippet:

func_calls = re.findall(r"\[FUNC (\w+)\((.*?)\) = (\w+)\]", solution)

# ...

for func_name, inputs, output in func_calls:
    inputs = json.loads("[" + inputs + "]")  # This line can cause JSONDecodeError

If inputs contains invalid JSON, json.loads will raise a JSONDecodeError. To fix this, ensure that inputs is a valid JSON string before attempting to parse it. Additionally, the ChainOfAbstractionParser class is designed to parse outputs using a specific prompt format defined in prompts.py. If the prompt formatting changes, the parser may not work correctly and should be updated accordingly ^[1]^[2].

To continue talking to Dosu, mention @dosu.

ryann-sportsbet commented 1 month ago

@dosubot It is able to parse the first inputs (4, 6) but not the second inputs (y1, 100).

The original prompt remains the same; y1 is the structured output produced by the original prompt.

dosubot[bot] commented 1 month ago

The ChainOfAbstractionParser is able to parse the first inputs 4, 6 but not the second input y1, 100 because the parser relies on extracting function calls and placeholders from the solution string using regular expressions. The function calls are expected to be in the format [FUNC function_name(inputs) = output].

The parser uses the following regular expression to extract function calls:

func_calls = re.findall(r"\[FUNC (\w+)\((.*?)\) = (\w+)\]", solution)

This regex captures function calls that match the pattern [FUNC function_name(inputs) = output]. The inputs are then parsed as JSON arrays:

inputs = json.loads("[" + inputs + "]")

For the first inputs 4, 6, if they are part of a function call like [FUNC some_function(4, 6) = y1], the parser can correctly extract and process them.

However, for the second input y1, 100, if y1 is a structured output of the original prompt, it must be part of a function call in the same expected format. If the function call involving y1 does not match the expected pattern or if y1 is not correctly formatted as an input in the solution string, the parser will not be able to extract and process it correctly.

Additionally, the parser creates a dependency graph to manage the execution order of functions based on their dependencies. If y1 is not correctly identified as a node in this graph or if its dependencies are not properly resolved, the parser will fail to process it. Therefore, the issue likely arises because the second input y1, 100 does not conform to the expected function call format or is not correctly structured in the solution string, leading to a failure in the dependency graph creation or execution level determination ^[1]^[2].
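Whether the reported plan matches the expected format can be checked directly. The sketch below applies the regular expression quoted above to the plan text from the bug description; the variable names are only illustrative.

```python
import re

# Plan text as reported in the bug description.
solution = (
    "First, we need to calculate the result of 4 + 6. "
    "This can be done using the add function: [FUNC add(4, 6) = y1]\n"
    "Next, we need to multiply the result of the first calculation (y1) by 100 "
    "to find out how much money you will have. "
    "This can be done using the multiply function: [FUNC multiply(y1, 100) = y2]\n"
    "Therefore, the final result is y2."
)

# Same pattern as the one quoted from output_parser.py.
FUNC_PATTERN = r"\[FUNC (\w+)\((.*?)\) = (\w+)\]"

func_calls = re.findall(FUNC_PATTERN, solution)
print(func_calls)
# [('add', '4, 6', 'y1'), ('multiply', 'y1, 100', 'y2')]
```

Both calls are extracted, which suggests the extraction step is not where this particular plan goes wrong; the traceback points at the later json.loads call on the extracted inputs.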


ryann-sportsbet commented 1 month ago

@dosu but y1 is the placeholder for the result of the add function. It must execute this function to get the result, then use this result for the multiply function.

dosubot[bot] commented 1 month ago

The ChainOfAbstractionParser is unable to parse the second input y1, 100 because the inputs are being parsed using json.loads("[" + inputs + "]"). This method expects the inputs to be in a valid JSON format, which means that each input should be enclosed in double quotes if they are strings.

In the case of y1, 100, y1 is not enclosed in double quotes, causing json.loads to fail. The parser expects something like "y1", 100 instead.

Here is the relevant code snippet:

for func_name, inputs, output in func_calls:
    inputs = json.loads("[" + inputs + "]")
    graph.add_node(output, func_name=func_name, inputs=inputs)
    for inp in inputs:
        graph.add_edge(inp, output)
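One possible workaround is to make the input parsing tolerant of bare placeholder names. The following is a rough sketch, not the library's implementation: parse_inputs is a hypothetical helper, and it assumes arguments are simple comma-separated values with no commas inside strings or nested structures.

```python
import json


def parse_inputs(inputs: str) -> list:
    """Parse a FUNC argument string, treating bare names such as y1 as placeholder strings."""
    try:
        # Fast path: the arguments are already valid JSON, e.g. 4, 6 or "y1", 100.
        return json.loads("[" + inputs + "]")
    except json.JSONDecodeError:
        pass

    parsed = []
    for token in inputs.split(","):
        token = token.strip()
        try:
            parsed.append(json.loads(token))
        except json.JSONDecodeError:
            # Not valid JSON on its own: keep the bare name as a string placeholder.
            parsed.append(token)
    return parsed


print(parse_inputs("4, 6"))       # [4, 6]
print(parse_inputs("y1, 100"))    # ['y1', 100]
print(parse_inputs('"y1", 100'))  # ['y1', 100]
```

With this kind of fallback, y1 ends up as the string "y1" either way, so it could still be used as a node name when adding graph edges.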

To fix this issue, ensure that the inputs are in a valid JSON format before they are parsed. Additionally, the parser should replace placeholders in the solution text with their corresponding values after execution, as shown in the following snippet:

# Replace placeholders in the solution text
for placeholder, value in results.items():
    solution = solution.replace(f"{placeholder}", '"' + str(value) + '"')

return solution, tool_outputs

This ensures that placeholders like y1 are correctly replaced with their computed values before being used as inputs for subsequent function calls ^[1]^[2].
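To illustrate the overall idea of resolving placeholders before later calls, here is a simplified, self-contained sketch. It is not the ChainOfAbstractionParser: run_plan, resolve, and TOOLS are hypothetical names, it handles only integer literals and previously computed placeholders, and it assumes the calls appear in dependency order instead of building a dependency graph.

```python
import re

FUNC_PATTERN = re.compile(r"\[FUNC (\w+)\((.*?)\) = (\w+)\]")


def add(a: int, b: int) -> int:
    return a + b


def multiply(a: int, b: int) -> int:
    return a * b


# Hypothetical tool registry standing in for the agent's tools.
TOOLS = {"add": add, "multiply": multiply}


def resolve(token: str, results: dict) -> int:
    """Return a previously computed placeholder value, or parse an integer literal."""
    token = token.strip()
    if token in results:
        return results[token]
    return int(token)


def run_plan(solution: str):
    """Execute each FUNC call in order, then substitute results into the text."""
    results = {}
    for func_name, inputs, output in FUNC_PATTERN.findall(solution):
        args = [resolve(tok, results) for tok in inputs.split(",")]
        results[output] = TOOLS[func_name](*args)
    for placeholder, value in results.items():
        solution = solution.replace(placeholder, str(value))
    return solution, results


plan = "[FUNC add(4, 6) = y1] [FUNC multiply(y1, 100) = y2] The final result is y2."
text, results = run_plan(plan)
print(results)  # {'y1': 10, 'y2': 1000}
print(text)     # [FUNC add(4, 6) = 10] [FUNC multiply(10, 100) = 1000] The final result is 1000.
```

The key point is that y1 is looked up in the results of earlier calls at execution time, so it never needs to be a valid JSON value on its own.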


ryann-sportsbet commented 1 month ago

@logan-markewich do you have any insight into this error, or a potential solution or workaround?

logan-markewich commented 1 month ago

@ryann-sportsbet haiku is probably not smart enough to follow the prompt perfectly.

The output needs to be parsable, so if it's missing the brackets, there's not much we can do.

Try using sonnet

ryann-sportsbet commented 1 month ago

@logan-markewich thank you. I've tried with sonnet and have the same issue.

logan-markewich commented 1 month ago

Seems like anthropic just isn't good with the approach -- probably takes some prompt engineering (and maybe modifying the parsing step) to get this to work well with anthropic 🤷🏻
