Describe alternatives you've considered
Letta configured with the vllm endpoint type works, but it doesn't support streaming.
I tried configuring Letta with the openai endpoint type instead, pointed at vLLM's OpenAI-compatible server.
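For reference, this is roughly the setup I mean, written as Letta LLMConfig objects. Treat it as an illustrative sketch: the import path, field names, and values are my approximation of Letta's config schema and of my local endpoints, not code copied from the repo.

# Illustrative sketch of the two configurations being compared; the import path
# and field names are assumptions, and the model name / URLs are placeholders.
from letta.schemas.llm_config import LLMConfig

# Works, but without streaming:
vllm_type_config = LLMConfig(
    model="my-model",
    model_endpoint_type="vllm",
    model_endpoint="http://localhost:8000",
    context_window=8192,
)

# What I'm trying instead: the openai endpoint type pointed at vLLM's
# OpenAI-compatible server, which advertises OpenAI-style tool calling.
openai_type_config = LLMConfig(
    model="my-model",
    model_endpoint_type="openai",
    model_endpoint="http://localhost:8000/v1",
    context_window=8192,
)

With the second configuration, the agent's first streamed tool call (conversation_search) fails with the output below: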
messages = [Message(id='message-56f62ff2-55cb-4d77-8efe-abe5644e0286', role=<MessageRole.user: 'user'>, text='{\n "type": "user_message",\n "message": "test",\n "time": "2024-11-13 01:36:41 PM UTC+0000"\n}', user_id='user-00000000-0000-4000-8000-000000000000', agent_id='agent-4eed09c6-7911-45be-840a-73d4b7cd696b', model=None, name='human', created_at=datetime.datetime(2024, 11, 13, 13, 36, 41, 406980, tzinfo=datetime.timezone.utc), tool_calls=None, tool_call_id=None)]
error = 1 validation error for ChatCompletionChunkResponse
choices.0.delta.tool_calls.0.function.arguments
Field required [type=missing, input_value={'name': 'conversation_search'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
step() failed with an unrecognized exception: '1 validation error for ChatCompletionChunkResponse
choices.0.delta.tool_calls.0.function.arguments
Field required [type=missing, input_value={'name': 'conversation_search'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing'
Letta.letta.server.server - ERROR - Error in server._step: 1 validation error for ChatCompletionChunkResponse
choices.0.delta.tool_calls.0.function.arguments
Field required [type=missing, input_value={'name': 'conversation_search'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
Traceback (most recent call last):
File "/root/stash/git/letta/letta/server/server.py", line 448, in _step
usage_stats = letta_agent.step(
^^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/letta/agent.py", line 825, in step
step_response = self.inner_step(
^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/letta/agent.py", line 1034, in inner_step
raise e
File "/root/stash/git/letta/letta/agent.py", line 950, in inner_step
response = self._get_ai_reply(
^^^^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/letta/agent.py", line 568, in _get_ai_reply
raise e
File "/root/stash/git/letta/letta/agent.py", line 531, in _get_ai_reply
response = create(
^^^^^^^
File "/root/stash/git/letta/letta/llm_api/llm_api_tools.py", line 97, in wrapper
raise e
File "/root/stash/git/letta/letta/llm_api/llm_api_tools.py", line 66, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/letta/llm_api/llm_api_tools.py", line 148, in create
response = openai_chat_completions_process_stream(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/letta/llm_api/openai.py", line 354, in openai_chat_completions_process_stream
raise e
File "/root/stash/git/letta/letta/llm_api/openai.py", line 247, in openai_chat_completions_process_stream
for chunk_idx, chat_completion_chunk in enumerate(
File "/root/stash/git/letta/letta/llm_api/openai.py", line 455, in _sse_post
raise e
File "/root/stash/git/letta/letta/llm_api/openai.py", line 420, in _sse_post
chunk_object = ChatCompletionChunkResponse(**chunk_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/stash/git/letta/.venv/lib64/python3.12/site-packages/pydantic/main.py", line 212, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for ChatCompletionChunkResponse
choices.0.delta.tool_calls.0.function.arguments
Field required [type=missing, input_value={'name': 'conversation_search'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
None
/root/stash/git/letta/letta/server/rest_api/utils.py:64: UserWarning: Error getting usage data: 1 validation error for ChatCompletionChunkResponse
choices.0.delta.tool_calls.0.function.arguments
Field required [type=missing, input_value={'name': 'conversation_search'}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
warnings.warn(f"Error getting usage data: {e}")
Is your feature request related to a problem? Please describe.
I would like to use the vLLM server with streaming support. The vLLM docs say its tool-call output is OpenAI-compatible; see:
https://github.com/vercel/ai/issues/2231
https://docs.vllm.ai/en/v0.6.3/serving/openai_compatible_server.html#tool-calling-in-the-chat-completion-api
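That compatibility claim seems to break down on the very first streamed tool-call delta. Roughly, and only as an illustration (abridged shapes, not verbatim server output; the OpenAI-style delta is what I would expect from OpenAI's streaming format, and the vLLM delta is inferred from the input_value in the validation error above):

# Abridged, illustrative shapes -- not verbatim output from either server.
# First streamed tool-call delta as OpenAI typically emits it: the function name
# arrives together with an empty arguments string that later chunks extend.
openai_style_first_delta = {
    "tool_calls": [
        {
            "index": 0,
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "conversation_search", "arguments": ""},
        }
    ]
}

# What vLLM streamed here, going by the error above: a function delta with a
# name but no arguments key at all, which Letta's chunk model rejects.
vllm_first_delta = {
    "tool_calls": [
        {"function": {"name": "conversation_search"}}
    ]
}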
Describe the solution you'd like
It should work: streaming, including tool calls, with Letta configured against vLLM's OpenAI-compatible endpoint.
Describe alternatives you've considered
I also commented out strict: true in
but then I receive the validation error shown above.
This is the response from vllm
If I leave Letta configured with vLLM as an openai endpoint type and disable streaming, it works.
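For completeness, this is roughly the non-streaming request that does work against vLLM's OpenAI-compatible server, sketched with the openai Python client; the base URL, model name, and tool definition are placeholders, not what Letta actually sends.

from openai import OpenAI

# Placeholders: adjust base_url and model to the running vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="my-model",
    messages=[{"role": "user", "content": "test"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "conversation_search",
            "description": "Search prior conversation history.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
            },
        },
    }],
    stream=False,  # with stream=True, the chunked tool call triggers the error above
)
print(resp.choices[0].message.tool_calls)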