Closed tobegit3hub closed 3 days ago
Use the OpenAI compatibility endpoint.
Thanks for replying. We are actually using this compatibility endpoint, but the response is not compatible with the official OpenAI API, which vllm also supports. Here is the expected response data. Refer to https://platform.openai.com/docs/api-reference/streaming .
What is incompatible about the ollama response?
The expected response should be like this.
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-4o-mini",
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nThis is a test!"
      },
      "logprobs": null,
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
But the actual response from ollama is like this.
{
  "model": "codellama:code",
  "created_at": "2024-07-22T20:47:51.147561Z",
  "response": "\n if a == 0:\n return b\n else:\n return compute_gcd(b % a, a)\n\ndef compute_lcm(a, b):\n result = (a * b) / compute_gcd(a, b)\n",
  "done": true,
  "done_reason": "stop",
  "context": [...],
  "total_duration": 1162761250,
  "load_duration": 6683708,
  "prompt_eval_count": 17,
  "prompt_eval_duration": 201222000,
  "eval_count": 63,
  "eval_duration": 953997000
}
Streaming mode has the same issue.
Our applications will use both a public model service that is fully compatible with the OpenAI API and ollama model services. The differences between their responses require extra work for applications to handle.
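To illustrate the kind of extra work meant here, below is a rough sketch of an adapter that reshapes the /api/generate response shown above into the OpenAI-style layout. The function name, the field mapping, and the placeholder "id" value are my own assumptions for illustration and are not part of either API.

```python
import time
from datetime import datetime


def generate_to_chat_completion(resp: dict) -> dict:
    """Reshape an ollama /api/generate response into an OpenAI-style dict.

    The mapping below is an assumption made for illustration only.
    """
    try:
        # "Z" is rewritten so that older Python versions can parse the timestamp.
        parsed = datetime.fromisoformat(resp["created_at"].replace("Z", "+00:00"))
        created = int(parsed.timestamp())
    except (KeyError, ValueError):
        # Fall back to the current time if the timestamp is missing or unparsable.
        created = int(time.time())

    prompt_tokens = resp.get("prompt_eval_count", 0)
    completion_tokens = resp.get("eval_count", 0)

    return {
        "id": "chatcmpl-local",  # hypothetical placeholder; ollama returns no such id here
        "object": "chat.completion",
        "created": created,
        "model": resp.get("model"),
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
        "choices": [
            {
                "message": {"role": "assistant", "content": resp.get("response", "")},
                "logprobs": None,
                "finish_reason": resp.get("done_reason"),
                "index": 0,
            }
        ],
    }
```

Every client would need to carry something like this if the two services really returned different shapes, which is why a compatible response format matters.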
That response is from the ollama API endpoint, /api/generate. Use the OpenAI compatibility endpoint, /v1/chat/completions.
Thanks, you are right. We were using the incorrect API, and /v1/chat/completions works like a charm. I will close this issue.
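For anyone who lands here with the same problem, this is roughly what a call to the compatibility endpoint could look like using only the Python standard library. The helper names are made up for this sketch, and the base URL assumes a local ollama server on its default port; adjust both the URL and the model to your setup.

```python
import json
import urllib.request


def build_chat_request(prompt, model="codellama:code", base_url="http://localhost:11434"):
    """Build a POST request for the OpenAI compatibility endpoint.

    base_url and model are assumptions for a local setup, not fixed values.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant text from the OpenAI-style body."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   print(send_chat_request(build_chat_request("Write a gcd function")))
```

The response is read via choices[0].message.content, the same path as in the expected OpenAI response quoted earlier in this thread, so the same parsing code should work against both services.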
Refer to the API docs at https://github.com/ollama/ollama/blob/main/docs/api.md . Currently the response data format is not compatible with the OpenAI API.
It is important to be compatible with the OpenAI API for not only the request data but also the response data. Is there any plan to change the response data?