datastax / astra-assistants-api

Drop in replacement for the OpenAI Assistants API
Apache License 2.0

ollama completions stream when streaming is set to false #51

Closed: phact closed this issue 3 months ago

phact commented 3 months ago

We've been testing assistants running against ollama. This is of interest to some folks using agency-swarm with assistants, but it may also be useful to others who want to run the assistants API with local models.

Currently the chat completions endpoint seems to stream even though the stream argument is not set to true:

response = client.chat.completions.create(
    model="ollama/phi3",
    messages=[{"role": "user", "content": "respond in 20 words who are you"}]
)

sample response:

ChatCompletion(id='chatcmpl-b6dc85b5-ecb4-4c20-bc74-7006aee33f15', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='assistant are', role='assistant', function_call=None, tool_calls=None))], created=1719329075, model='ollama/dolphin-llama3', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=3, prompt_tokens=10, total_tokens=13))

Related discord chat here: https://discord.com/channels/1245465949679915008/1245478014737977344/1255181994317451346
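
For reference, here is a minimal self-contained sketch of the expected contrast between the two modes, assuming a client created with astra_assistants' patch(OpenAI()) helper (the model name and prompt are just illustrative). The truncated content in the sample response above ('assistant are', 3 completion tokens) looks consistent with the non-streaming path returning only an early chunk instead of the aggregated message.

from openai import OpenAI
from astra_assistants import patch

# Assumption: an OpenAI client patched to route through astra-assistants.
client = patch(OpenAI())

messages = [{"role": "user", "content": "respond in 20 words who are you"}]

# With stream unset (it defaults to False), a single complete ChatCompletion
# should come back, with the whole reply in choices[0].message.content.
completion = client.chat.completions.create(model="ollama/phi3", messages=messages)
print(completion.choices[0].message.content)

# Streaming has to be requested explicitly; chunks then carry incremental deltas.
stream = client.chat.completions.create(model="ollama/phi3", messages=messages, stream=True)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")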

phact commented 3 months ago

fixed in v0.2.11
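
A quick way to sanity-check the non-streaming path after upgrading (a minimal sketch, assuming the same patched-client setup as above):

from openai import OpenAI
from openai.types.chat import ChatCompletion
from astra_assistants import patch

client = patch(OpenAI())  # assumed astra-assistants patched client

response = client.chat.completions.create(
    model="ollama/phi3",
    messages=[{"role": "user", "content": "respond in 20 words who are you"}],
)

# Without stream=True this should be a plain ChatCompletion (not an iterator
# of chunks), and the content should be the full reply rather than a fragment.
assert isinstance(response, ChatCompletion)
print(response.choices[0].message.content)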