AmineDiro / cria

OpenAI compatible API for serving LLAMA-2 model
MIT License
215 stars 13 forks source link

Support for Chat streaming #20

Closed AmineDiro closed 1 year ago

AmineDiro commented 1 year ago

fixes #19.

Example using OpenAPI :

import openai

openai.organization = "YOUR_ORG_ID"
openai.api_key = "test"
openai.api_base = "http://localhost:3000/v1"

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    max_tokens=512,
    stream=True,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "Hello! Can you give informations about morocco ? ",
        },
    ],
)

# Streaming
for chunk in completion:
    print(chunk.choices[0].delta)

Result

{
  "role": "assistant",
  "content": " Of"
}
{
  "role": "assistant",
  "content": " course"
}
{
  "role": "assistant",
  "content": ","
}
{
  "role": "assistant",
  "content": " I"
}
{
  "role": "assistant",
  "content": "'"
}
{
  "role": "assistant",
  "content": "d"
}
{
  "role": "assistant",
  "content": " be"
}
...