mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

Streaming messages not compliant with openAI spec #1194

Open Karrq opened 1 year ago

Karrq commented 1 year ago

LocalAI version: 274ace2

Describe the bug: When the stream API is enabled, every choice object is missing the finish_reason property, except in the last message.

To Reproduce: Example response from LocalAI:

data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"role":"assistant"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"content":"1"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"content":" 2"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"content":" 3"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

Expected behavior: Each message should contain the finish_reason property set to null, instead of omitting it entirely.

Example:

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":"1"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":" "},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-blablabla","object":"chat.completion.chunk","created":1684665888,"model":"gpt-4-0314","choices":[{"delta":{"content":"2"},"index":0,"finish_reason":null}]}
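A quick way to check a captured stream for this problem is to parse each `data:` line and look for choices that lack the key. The following is only a minimal sketch, not LocalAI or OpenAI code: the function name `chunks_missing_finish_reason` is hypothetical, the sample payloads are shortened versions of the ones above, and skipping a `[DONE]` sentinel line is an assumption about how the stream ends.

```python
import json


def chunks_missing_finish_reason(sse_text):
    """Return 0-based positions of chunks in which any choice lacks
    the "finish_reason" key (hypothetical helper, for illustration)."""
    missing = []
    position = 0
    for line in sse_text.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            continue  # assumed end-of-stream sentinel, not a JSON chunk
        chunk = json.loads(payload)
        # A key that is present with value null counts as present.
        if any("finish_reason" not in choice for choice in chunk.get("choices", [])):
            missing.append(position)
        position += 1
    return missing


# Shortened copies of the two kinds of chunk shown in this issue.
localai_stream = """data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"role":"assistant"}}]}

data: {"object":"chat.completion.chunk","model":"ggml-gpt4all-j","choices":[{"delta":{"content":"1"}}]}
"""

expected_stream = """data: {"object":"chat.completion.chunk","model":"gpt-4-0314","choices":[{"delta":{"content":"1"},"index":0,"finish_reason":null}]}
"""

print(chunks_missing_finish_reason(localai_stream))   # [0, 1]
print(chunks_missing_finish_reason(expected_stream))  # []
```

Running this against the raw bytes of a response, rather than a pretty-printed view, avoids any doubt about whether a tool in between has altered the JSON.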

Additional context: Previously partially addressed by #341

jespino commented 1 year ago

I think this is a good first issue for somebody; it sounds relatively straightforward to solve.

localai-bot commented 1 year ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI (a crazy experiment of @mudler). Please beware that I might hallucinate sometimes!

But I can also be funny or helpful :smilecat:, and I can generally provide good tips or point to places to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me, I don't support replying (yet)!

:warning::warning::warning::warning::warning:

I see that the issue is related to the stream-api and the LocalAI version is 274ace2. To reproduce the bug, you can enable the stream API and send messages with the prompt "1 2 3". The response should contain a JSON object for each message, where the choices array contains an object for each choice. Each object should have a delta property with a content field and a finish_reason field set to null. However, it seems that the finish_reason field is missing except for the last message. To fix this issue, you can check the code related to the stream API and the completion function in the LocalAI project. You may need to update the code to ensure that the finish_reason field is set correctly for each choice object. Additionally, you can try reproducing the bug with a different version of LocalAI or a different model to see if the issue persists.


LeonSijiaLu commented 6 months ago

Hello, I wasn't able to replicate this issue on my end. I made an HTTP call to the v1/chat/completions endpoint; below is the request I sent.

{
     "model": "gpt-4",
     "messages": [{"role": "user", "content": "How are you?"}],
     "temperature": 0.9,
     "stream": true
}

I have also attached the responses I received. Every chunk contains finish_reason: it is "" in the intermediate chunks and "stop" in the last one.

The last one:

{
    "created": 1714529246,
    "object": "chat.completion.chunk",
    "id": "6b2eb69d-eb26-41a6-a2be-eecaa1f663a2",
    "model": "gpt-4",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "delta": {
                "content": ""
            }
        }
    ],
    "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
    }
}

The one before the last one:

{
    "created": 1714529246,
    "object": "chat.completion.chunk",
    "id": "6b2eb69d-eb26-41a6-a2be-eecaa1f663a2",
    "model": "gpt-4",
    "choices": [
        {
            "index": 0,
            "finish_reason": "",
            "delta": {
                "content": "?"
            }
        }
    ],
    "usage": {
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0
    }
}

Could this be due to differences between clients on the receiving side? I wonder whether some terminals silently strip empty fields.
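The three cases discussed in this thread (key omitted, key set to null, key set to an empty string) are distinct in JSON, and a client can tell them apart after parsing. The sketch below illustrates that with Python's standard `json` module; `classify_finish_reason` is a hypothetical helper, and the three sample choices are trimmed from the payloads quoted in this issue.

```python
import json


def classify_finish_reason(choice):
    """Describe how finish_reason appears in one parsed choice object
    (hypothetical helper, for illustration only)."""
    if "finish_reason" not in choice:
        return "missing"
    value = choice["finish_reason"]
    if value is None:
        return "null"
    if value == "":
        return "empty"
    return value


# Trimmed from the payloads quoted in this issue.
original_report = json.loads('{"delta":{"content":"1"}}')
expected = json.loads('{"delta":{"content":"1"},"index":0,"finish_reason":null}')
later_report = json.loads('{"index":0,"finish_reason":"","delta":{"content":"?"}}')
final_chunk = json.loads('{"index":0,"finish_reason":"stop","delta":{"content":""}}')

print(classify_finish_reason(original_report))  # missing
print(classify_finish_reason(expected))         # null
print(classify_finish_reason(later_report))     # empty
print(classify_finish_reason(final_chunk))      # stop
```

Under this reading, the chunks in the later report do carry the key, but with an empty string rather than the null shown in the expected output, so a strict client may still treat the two differently.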