letta-ai / letta

Letta (formerly MemGPT) is a framework for creating LLM services with memory.
https://letta.com
Apache License 2.0
12.88k stars 1.41k forks

Unpacking inner thoughts from more than one tool call (2) is not supported #2071

Open victorserbu2709 opened 1 day ago

victorserbu2709 commented 1 day ago

Is your feature request related to a problem? Please describe. Hello. I tried to use Letta with vLLM serving the Qwen2.5 72B model. The model returned two tool calls in a single response, and Letta does not support this:

Response status code: 200
Response JSON:
{'id': 'chatcmpl-394567d21b574a9f9217b1c0e3a05dcc',
 'object': 'chat.completion',
 'created': 1732114754,
 'model': 'Qwen2.5 72B',
 'choices': [{'index': 0,
              'message': {'role': 'assistant',
                          'content': None,
                          'tool_calls': [{'id': 'chatcmpl-tool-e63a076a83ed4f41843f711a2f1cc9aa',
                                          'type': 'function',
                                          'function': {'name': 'send_message',
                                                       'arguments': '{"message": "Found issues #2 and #3. Fetching details now.", "inner_thoughts": "Issues fetched. Now to get the details of each one."}'}},
                                         {'id': 'chatcmpl-tool-20c56801bf534f5a837c0387452034d7',
                                          'type': 'function',
                                          'function': {'name': 'get_issue_tool',
                                                       'arguments': '{"repo_name": "org/zen_test", "issue_number": 2, "request_heartbeat": true, "inner_thoughts": "Fetching details for issue #2."}'}}]},
              'logprobs': None,
              'finish_reason': 'tool_calls',
              'stop_reason': None}],
 'usage': {'prompt_tokens': 3729, 'total_tokens': 3835, 'completion_tokens': 106, 'prompt_tokens_details': None},
 'prompt_logprobs': None}
/root/stash/git/letta/letta/llm_api/helpers.py:161: UserWarning: Unpacking inner thoughts from more than one tool call (2) is not supported
  warnings.warn(f"Unpacking inner thoughts from more than one tool call ({len(message.tool_calls)}) is not supported")
>1 tool call not supported, using index=0 only
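As the warning shows, Letta currently unpacks `inner_thoughts` only from the tool call at index 0 and discards the rest. A minimal sketch of what a fix could look like (hypothetical helper; Letta's actual code in `llm_api/helpers.py` is structured differently) would pop the `inner_thoughts` argument from every tool call instead of just the first:

```python
import json

# Assumed kwarg name, matching the "inner_thoughts" key visible in the
# vLLM response above.
INNER_THOUGHTS_KWARG = "inner_thoughts"

def unpack_all_inner_thoughts(tool_calls):
    """Pop the inner_thoughts argument from every tool call (not only index 0).

    Returns (cleaned_tool_calls, inner_thoughts_list). Hypothetical sketch of a
    fix; Letta today warns and keeps only tool_calls[0].
    """
    thoughts = []
    cleaned = []
    for call in tool_calls:
        # Each tool call carries its arguments as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        thought = args.pop(INNER_THOUGHTS_KWARG, None)
        if thought is not None:
            thoughts.append(thought)
        # Rebuild the call with inner_thoughts stripped from the arguments.
        cleaned.append({
            **call,
            "function": {**call["function"], "arguments": json.dumps(args)},
        })
    return cleaned, thoughts
```

With the two tool calls from the response above, this would yield two inner-thought strings and two cleaned calls, rather than dropping the second call entirely.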

Describe the solution you'd like Letta should handle responses that contain more than one tool call, instead of discarding all but the first.


victorserbu2709 commented 1 day ago

From https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#api-reference

Note: parallel_tool_calls and user parameters are ignored.

sarahwooders commented 1 day ago

Sorry I'm a bit confused - do you expect vLLM to return multiple tool calls? Unfortunately letta doesn't support parallel tool calling at the moment.