run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: `response_synthesizer` should allow to set `tool_choice` #15823

Open matthiaskern opened 2 months ago

matthiaskern commented 2 months ago

Bug Description

We use a response synthesizer that returns a pydantic class:

    response_synthesizer = get_response_synthesizer(
        llm=llm,
        output_cls=output_cls,
    )

The LLM (Anthropic Claude 3.5 Sonnet) sometimes returns its reasoning as text alongside the pydantic object. When that happens, the astructured_predict call fails.

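For context, output_cls in the snippet above is an application-specific pydantic model. A minimal hypothetical stand-in (the Summary class here is illustrative, not from our codebase) looks like:

    from pydantic import BaseModel, Field

    class Summary(BaseModel):
        """Hypothetical output class; the real one is application-specific."""

        title: str = Field(description="Short title for the answer")
        summary: str = Field(description="Concise summary of the retrieved context")

    output_cls = Summary
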
I believe one fix would be to allow setting tool_choice (https://docs.anthropic.com/en/docs/build-with-claude/tool-use).

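For reference, the Anthropic tool-use API linked above lets the caller force a specific tool via tool_choice. A rough sketch with the raw SDK (model name and tool schema are illustrative, not taken from our code):

    import anthropic

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        tools=[
            {
                "name": "Summary",
                "description": "Record a structured summary of the context",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "summary": {"type": "string"},
                    },
                    "required": ["title", "summary"],
                },
            }
        ],
        # Forces the model to call the named tool rather than reply with free text.
        tool_choice={"type": "tool", "name": "Summary"},
        messages=[{"role": "user", "content": "Summarize the retrieved context."}],
    )
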
Version

0.10.63

Steps to Reproduce

Relevant Logs/Tracebacks

File "/usr/local/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 290, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 64, in aquery
    query_result = await self._aquery(str_or_query_bundle)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 290, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/query_engine/multistep_query_engine.py", line 104, in _aquery
    final_response = await self._response_synthesizer.asynthesize(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 290, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/response_synthesizers/base.py", line 316, in asynthesize
    response_str = await self.aget_response(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 290, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/response_synthesizers/tree_summarize.py", line 99, in aget_response
    response = await self._llm.astructured_predict(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 290, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 401, in astructured_predict
    dispatcher.event(LLMStructuredPredictEndEvent(output=result))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
    raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for LLMStructuredPredictEndEvent
output
  value is not a valid dict (type=type_error.dict)
dosubot[bot] commented 2 months ago

Hey @matthiaskern! I'm here to assist you with any bugs, questions, or contributions you have. Let's work together to solve the issue you're facing.

To address the issue with the response_synthesizer and the astructured_predict call failure when using an Anthropic LLM with function calling, you need to ensure that the tool_choice is properly set within the get_tool_calls_from_response method. This method is responsible for extracting tool calls from the response and should be implemented to handle the tool_choice.

Here is an example of how you can modify the FunctionCallingLLM class to set tool_choice:

import json
from typing import Any, List

# Imports needed for this excerpt (paths assumed from llama-index-core 0.10.x).
from llama_index.core.base.llms.types import ChatResponse
from llama_index.core.llms.llm import LLM
from llama_index.core.tools import ToolSelection


class FunctionCallingLLM(LLM):
    # ... other methods ...

    def get_tool_calls_from_response(
        self,
        response: ChatResponse,
        error_on_no_tool_call: bool = True,
        **kwargs: Any,
    ) -> List[ToolSelection]:
        """Predict and call the tool."""
        tool_calls = response.message.additional_kwargs.get("tool_calls", [])

        if len(tool_calls) < 1:
            if error_on_no_tool_call:
                raise ValueError(
                    f"Expected at least one tool call, but got {len(tool_calls)} tool calls."
                )
            else:
                return []

        tool_selections = []
        for tool_call in tool_calls:
            if (
                "input" not in tool_call
                or "id" not in tool_call
                or "name" not in tool_call
            ):
                raise ValueError("Invalid tool call.")
            if tool_call["type"] != "tool_use":
                raise ValueError("Invalid tool type. Unsupported by Anthropic")
            argument_dict = (
                json.loads(tool_call["input"])
                if isinstance(tool_call["input"], str)
                else tool_call["input"]
            )
            tool_selections.append(
                ToolSelection(
                    tool_id=tool_call["id"],
                    tool_name=tool_call["name"],
                    tool_kwargs=argument_dict,
                )
            )

        return tool_selections

    # ... other methods ...

This implementation ensures that the tool_choice is set correctly, which should help in fixing the astructured_predict call failure. The method also includes validation to ensure that the tool calls are properly formatted and of the correct type, which is necessary for compatibility with the Anthropic LLM [1].

Additionally, the current implementation of the get_response_synthesizer function does not support setting tool_choice. You may need to extend this function to include the tool_choice parameter if required [2].

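One way to picture that extension (purely hypothetical; get_response_synthesizer does not accept such a parameter today) would be:

    # Hypothetical sketch only: tool_choice is NOT a current parameter of
    # get_response_synthesizer; this is what the requested extension could look like.
    response_synthesizer = get_response_synthesizer(
        llm=llm,
        output_cls=output_cls,
        tool_choice={"type": "tool", "name": output_cls.__name__},
    )
    # The synthesizer would then need to forward tool_choice to the underlying
    # llm.astructured_predict(...) call it makes internally.
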
To continue talking to Dosu, mention @dosu.

logan-markewich commented 2 months ago

@matthiaskern Actually, this method is slightly outdated; I would recommend using

sllm = llm.as_structured_llm(output_cls)
query_engine = index.as_query_engine(llm=sllm)

This way, every time the LLM runs, it calls the output class as a tool.

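Putting the suggestion together with the hypothetical Summary class sketched above (documents, llm, and index construction assumed from the existing setup), the structured-LLM route looks roughly like:

    from llama_index.core import VectorStoreIndex

    # documents and llm come from the existing application setup; Summary is the
    # hypothetical output class from earlier in the thread.
    index = VectorStoreIndex.from_documents(documents)
    sllm = llm.as_structured_llm(Summary)
    query_engine = index.as_query_engine(llm=sllm)

    response = query_engine.query("Summarize the document")
    print(response)  # the structured output, serialized by the query engine
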
matthiaskern commented 2 months ago

Thanks @logan-markewich! This does simplify our code quite a bit. However, as far as I understand, I would still run into the same issue, right? Is there a way to set tool_choice with this approach?

logan-markewich commented 2 months ago

Yeah, there's currently no easy way to pass it.

There is this PR that would just force it: https://github.com/run-llama/llama_index/pull/15706