neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Let OpenAI ChatCompletionRequest accept List[Dict] messages #1497

Closed mgoin closed 8 months ago

mgoin commented 8 months ago

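The OpenAI v1/chat/completions endpoint failed to parse a request whose request.messages uses the List[Dict] format and returned a "Message role not recognized" error instead. This change lets ChatCompletionRequest accept List[Dict] messages.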
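
To illustrate the kind of handling involved, here is a minimal sketch of turning either a plain string or a List[Dict] of messages into a single prompt. It is not the code from this PR: the helper name messages_to_prompt, the KNOWN_ROLES tuple, and the "role: content" prompt layout are all assumptions made for illustration. Only the error text is taken from the response shown under "Before".

```python
from typing import Dict, List, Union

# Hypothetical set of accepted roles; the real server may accept others.
KNOWN_ROLES = ("system", "user", "assistant")


def messages_to_prompt(messages: Union[str, List[Dict[str, str]]]) -> str:
    """Flatten chat messages into one prompt string (illustrative only)."""
    # A bare string is already a prompt, so pass it through unchanged.
    if isinstance(messages, str):
        return messages

    lines = []
    for message in messages:
        # Each message is a plain dict, so read its fields by key.
        role = message.get("role")
        if role not in KNOWN_ROLES:
            raise ValueError("Message role not recognized")
        lines.append(f"{role}: {message.get('content', '')}")
    return "\n".join(lines)
```

With the two messages from the client command in the Test section, this sketch would produce a two-line prompt starting with the system message; a message with an unknown role raises ValueError.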

Test

Server command:

deepsparse.server --integration openai --task text-generation --model_path hf:mgoin/TinyStories-1M-ds

Client command:

curl http://localhost:5543/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy" \
  -d '{
    "model": "hf:mgoin/TinyStories-1M-ds",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
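
For anyone who prefers to reproduce this from Python rather than curl, the same request can be sketched with only the standard library. The function names build_chat_request and send_chat_request are hypothetical; the URL, headers, model name, and messages are copied from the curl command above.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str = "http://localhost:5543",
    model: str = "hf:mgoin/TinyStories-1M-ds",
) -> urllib.request.Request:
    """Build the same POST request as the curl command (illustrative only)."""
    payload = {
        "model": model,
        # messages in the List[Dict] format that this PR is about
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer dummy",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body; needs the server running."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send_chat_request(build_chat_request()) while the server command above is running should return the parsed JSON response as a dict.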

Before:

{"object":"error","message":"Message role not recognized","type":"invalid_request_error","param":null,"code":null}

After:

{"id":"cmpl-e3bd5431cc384dab8782254ac0a2f862","object":"chat.completion","created":1703185938,"model":"hf:mgoin/TinyStories-1M-ds","choices":[{"message":{"role":"assistant","content":" \"Welcome to the best friends! I have a special place to you. I"},"finish_reason":"length"}],"usage":{"prompt_tokens":453,"total_tokens":469,"completion_tokens":16}}