Confusing function calling with `AnthropicChatModel`

mnicstruwig commented 5 months ago

Hi @jackmpcollins 👋 ,

I'm running into a weird issue with the AnthropicChatModel. I'm unsure how to capture function calls that occur when the model also outputs text inside of <thinking></thinking> tags (which Anthropic do in order to use chain-of-thought style prompting to improve accuracy with function calls).

How do I get access to the underlying FunctionCall when both text and function calling output is provided by the LLM?

The following example illustrates what I mean:

from magentic import AsyncParallelFunctionCall, AsyncStreamedStr, prompt
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel

def get_weather(city: str) -> str:
    return f"The weather in {city} is 20°C."

@prompt(
    "What is the weather in Cape town and San Francisco?",
    functions=[get_weather],
    model=AnthropicChatModel(
        model="claude-3-opus-20240229",
        temperature=0.2,
    )
)
async def _llm() -> AsyncParallelFunctionCall | AsyncStreamedStr: ...

response = await _llm()
async for chunk in response:
    print(chunk, end="", flush=True)

Which produces the following output:

<thinking>
The user has requested the weather for two cities: Cape Town and San Francisco. 

The get_weather tool is relevant for answering this request. It requires a "city" parameter.

The user has directly provided the names of two cities in their request: "Cape town" and "San Francisco". 

Since the get_weather tool only takes a single city as input, we will need to call it twice, once for each city.

No other tools are needed, as the get_weather tool directly provides the requested information.

All required parameters are available to make the necessary tool calls.
</thinking>

But no function call.

If I only type decorate with the FunctionCall, then the function call is returned. But I don't want to force the LLM into a function call if it isn't necessary.

Thanks!

jackmpcollins commented 5 months ago

With https://github.com/jackmpcollins/magentic/releases/tag/v0.24.0 or earlier this should be working as expected because the response is not streamed so can be viewed in full when parsing. But with the new streaming approach it breaks as you describe because based on the first chunks this looks like a string response.

It looks like the <thinking> section is always present in the response when tools are provided, so it could simply be skipped by magentic to get to the actual answer. I started on a PR for this https://github.com/jackmpcollins/magentic/pull/226 - it would be great if you can test that out.

In future maybe a way to expose this is to allow AssistantMessage to be used as a return type in prompt-functions, and then have an AnthropicAssistantMessage subclass of that with an additional thinking: str attribute.

mnicstruwig commented 5 months ago

I like the idea of extending AssistantMessage with AnthropicAssistantMessage, since it seems to be something only Anthropic currently does. Perhaps it'll become more commonplace in the future (or perhaps omitted from the response entirely).

I'll give some feedback on #226 once I'm able to try it out (should be the next few days). Thanks for the fast turnaround!

jackmpcollins / magentic

Confusing function calling with `AnthropicChatModel` #220