BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: azure streaming returning delta + message chunk at the end of stream (and beginning of stream in v.11.1) #1081

Closed: krrishdholakia closed this issue 11 months ago

krrishdholakia commented 11 months ago

What happened?

Code to reproduce:

```python
from litellm import completion
import os

## set ENV variables
os.environ["AZURE_API_KEY"] = ""
os.environ["AZURE_API_BASE"] = ""
os.environ["AZURE_API_VERSION"] = ""

# azure call with streaming enabled
response = completion(
    model="azure/chatgpt-v-2",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)

for chunk in response:
    print(chunk)
```

Relevant log output

Last chunk for azure streaming: `ModelResponse(id='chatcmpl-17c4885d-9eef-480c-a936-f550042bc9ae', choices=[Choices(delta=Delta(content=None, role=None), finish_reason='stop', index=0, message=Message(content=None, role='assistant'))], created=1702306161, model='chatgpt-v-2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())`

First chunk for azure streaming: `ModelResponse(id='chatcmpl-8UbsuJEqi5aJvE4je4LaTjm3fBd93', choices=[Choices(delta=Delta(tool_calls=None, function_call=None, content='As', role='assistant'), finish_reason=None, index=0, message=Message(content=None, role='assistant'))], created=1702306161, model='chatgpt-v-2', object='chat.completion.chunk', system_fingerprint=None, usage=Usage())`
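For context, the OpenAI chat-completion streaming format carries incremental text only in `delta`; the populated `message=Message(...)` field on these chunks is the anomaly. A minimal consumer sketch that reads only the delta, and therefore tolerates the stray `message` attribute, might look like this (assuming the same env vars and deployment name as the repro above):

```python
from litellm import completion

response = completion(
    model="azure/chatgpt-v-2",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)

# Accumulate only the incremental delta content; ignore any
# extra `message` attribute present on the chunks.
full_text = ""
for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content:  # the final chunk has content=None with finish_reason='stop'
        full_text += delta.content

print(full_text)
```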

Twitter / LinkedIn details

No response

krrishdholakia commented 11 months ago

this is fixed