livekit / agents

Build real-time multimodal AI applications 🤖🎙️📹
https://docs.livekit.io/agents
Apache License 2.0
4.04k stars 421 forks source link

Anthropic parallel tool call issue #1059

Closed jayeshp19 closed 1 week ago

jayeshp19 commented 2 weeks ago

anthropic breaks if there are multiple tool calls,

steps to reproduce:

stream logs:

2024-11-06 21:42:20,352 - DEBUG livekit.agents.pipeline - committed agent speech {"agent_transcript": " Hello from the weather station. Would you like to know the weather? If so, tell me your location.", "interrupted": false, "speech_id": "9c011146ed84"}
2024-11-06 21:42:25,055 - DEBUG livekit.agents.pipeline - received user transcript {"user_transcript": "London and Delhi."}
2024-11-06 21:42:26,158 - DEBUG livekit.agents.pipeline - validated agent reply {"speech_id": "5f174005ba8b"}
2024-11-06 21:42:27,532 - DEBUG livekit.plugins.anthropic - received event RawMessageStartEvent(message=Message(id='msg_013uZ4hUmwy9anaKhr6TLiEU', content=[], model='claude-3-5-sonnet-20241022', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=435, output_tokens=1)), type='message_start')   
2024-11-06 21:42:27,533 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
2024-11-06 21:42:27,535 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=TextDelta(text='I', type='text_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:27,667 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=TextDelta(text="'ll check the weather for both London and Delhi for", type='text_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:27,857 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=TextDelta(text=' you.', type='text_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:28,185 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStopEvent(index=0, type='content_block_stop') 
2024-11-06 21:42:28,187 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01XUPbticW5Qpun2wiL8SePc', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
2024-11-06 21:42:28,189 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
2024-11-06 21:42:28,658 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='{"locatio', type='input_json_delta'), index=1, type='content_block_delta')
2024-11-06 21:42:28,660 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='n": "L', type='input_json_delta'), index=1, type='content_block_delta')
2024-11-06 21:42:28,661 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='ondon"}', type='input_json_delta'), index=1, type='content_block_delta')
2024-11-06 21:42:28,662 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStopEvent(index=1, type='content_block_stop')
2024-11-06 21:42:28,664 - DEBUG livekit.plugins.anthropic - received event RawMessageDeltaEvent(delta=Delta(stop_reason='tool_use', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=74))
2024-11-06 21:42:28,668 - DEBUG livekit.plugins.anthropic - received event RawMessageStopEvent(type='message_stop')
2024-11-06 21:42:30,890 - DEBUG livekit.agents.pipeline - speech playout started {"speech_id": "5f174005ba8b"}
2024-11-06 21:42:34,299 - DEBUG livekit.agents.pipeline - speech playout finished {"speech_id": "5f174005ba8b", "interrupted": false}
2024-11-06 21:42:34,299 - DEBUG livekit.agents.pipeline - executing ai function {"function": "get_weather", "speech_id": "5f174005ba8b"}
2024-11-06 21:42:34,300 - INFO weather-demo - getting weather for London
2024-11-06 21:42:36,573 - DEBUG livekit.plugins.anthropic - received event RawMessageStartEvent(message=Message(id='msg_01SpC76mCmGsRkDVeHG8VAFt', content=[], model='claude-3-5-sonnet-20241022', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=526, output_tokens=35)), type='messart')
2024-11-06 21:42:36,576 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01T3fPNb5Lstu4uX7c17KUEx', input={}, name='get_weather', type='tool_use'), index=0, type='content_block_start')
2024-11-06 21:42:36,577 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='', type='input_json_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:36,853 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='{"', type='input_json_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:36,855 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='loca', type='input_json_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:36,855 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='tion": "De', type='input_json_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:36,861 - DEBUG livekit.plugins.anthropic - received event RawContentBlockDeltaEvent(delta=InputJSONDelta(partial_json='lhi"}', type='input_json_delta'), index=0, type='content_block_delta')
2024-11-06 21:42:37,090 - DEBUG livekit.plugins.anthropic - received event RawContentBlockStopEvent(index=0, type='content_block_stop')
2024-11-06 21:42:37,090 - DEBUG livekit.plugins.anthropic - received event RawMessageDeltaEvent(delta=Delta(stop_reason='tool_use', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=55))
2024-11-06 21:42:37,091 - DEBUG livekit.plugins.anthropic - received event RawMessageStopEvent(type='message_stop')
2024-11-06 21:42:37,092 - DEBUG livekit.agents.pipeline - speech playout finished {"speech_id": "5f174005ba8b", "interrupted": false}
2024-11-06 21:42:37,092 - DEBUG livekit.agents.pipeline - committed agent speech {"agent_transcript": "", "interrupted": false, "speech_id": "5f174005ba8b"}        

CC : @theomonnom @davidzhao

jayeshp19 commented 1 week ago

It seems there's bug on anthropic side. we have to explicitly specify in the system prompt to use parallel tool calls.

Resolution

  1. use latest models provided by Anthropic
  2. explicitly specify in the system prompt to use multiple tool calls

Example

result = anthropic.Client(api_key=api_key).messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a weather assistant. You will provide weather information for a given location. You can use multiple tool calls at once.",
    messages=messages,
    tools=tools,
    tool_choice={
        "type": "auto",
        "disable_parallel_tool_use": False
    },
)

gives

Message(id='msg_01XM7G2QLkqjcjBvbFms2Z3i', content=[TextBlock(text="I'll check the weather for both London and New York for you.", type='text')], 
ToolUseBlock(id='toolu_01THRx1ZePpp0tadzPGQG74', input={'location': 'London'}, name='get_weather', type='tool_use'), 
ToolUseBlock(id='toolu_01Tt6GahcRDXeKXedH2YHr9', input={'location': 'New York'}, name='get_weather', type='tool_use'),
model='claude-3-5-sonnet-20241022', role='assistant', stop_reason='tool_use', stop_sequence=None, type='message', usage=Usage(input_tokens=432, output_tokens=105))

cc: @davidzhao

davidzhao commented 1 week ago

Thanks @jayeshp19 !