openvinotoolkit / model_server

A scalable inference server for models optimized with OpenVINO™
https://docs.openvino.ai/2024/ovms_what_is_openvino_model_server.html
Apache License 2.0

LLM Serving Incompatible with OPENAI API: /v1/chat/completions #2760

Open alexgang opened 1 month ago

alexgang commented 1 month ago

Describe the bug: The OpenAI API endpoint is "/v1/chat/completions", but the OVMS endpoint is "/v3/chat/completions". Most existing applications don't allow the user to change the prefix from "v1" to "v3", so does this mean OVMS won't work with those OpenAI-compatible applications?
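To make the mismatch concrete, here is a minimal sketch of how such an application might build its request URL. `client_request_url` is a hypothetical stand-in for a client that accepts only a base address and appends the OpenAI path itself; it is not code from any of the applications discussed in this issue.

```python
from urllib.parse import urljoin

# Path hard-coded by OpenAI-compatible clients (per this report).
OPENAI_CHAT_PATH = "/v1/chat/completions"
# Path actually served by OVMS (per this report).
OVMS_CHAT_PATH = "/v3/chat/completions"


def client_request_url(base_address: str) -> str:
    """Hypothetical client: the user supplies only the base address,
    and the client appends the fixed OpenAI path."""
    return urljoin(base_address, OPENAI_CHAT_PATH)


url = client_request_url("http://localhost:8000")
print(url)
# The resulting URL does not end with the path OVMS serves.
print(url.endswith(OVMS_CHAT_PATH))
```

Because the user controls only the base address, no value they enter makes the final path match what OVMS serves.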

To Reproduce: Use the "QuickStart" demo below as an example, with one minor modification: https://docs.openvino.ai/nightly/openvino-workflow/model-server/ovms_docs_llm_quickstart.html In step 6, change only the first line of the demo script, as follows: curl -s http://localhost:8000/v1/chat/completions \

Expected behavior:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "OpenVINO is a software toolkit developed by Intel that enables developers to accelerate the training and deployment of deep learning models on Intel hardware.",
        "role": "assistant"
      }
    }
  ],
  "created": 1718607923,
  "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
  "object": "chat.completion"
}

Logs: { "error": "Invalid request URL" }

Configuration

  1. OVMS version: 2024.4.28219825c
  2. OVMS config.json file: same as in https://docs.openvino.ai/nightly/openvino-workflow/model-server/ovms_docs_llm_quickstart.html
  3. CPU, accelerator's versions if applicable: MTL CPU
  4. Model repository directory structure: same as in https://docs.openvino.ai/nightly/openvino-workflow/model-server/ovms_docs_llm_quickstart.html
  5. Model or publicly available similar model that reproduces the issue: same as in https://docs.openvino.ai/nightly/openvino-workflow/model-server/ovms_docs_llm_quickstart.html

Additional context

atobiszei commented 1 month ago

Can you say where you were unable to change the version prefix?

alexgang commented 4 weeks ago

@atobiszei For example, the ChatBox application (see the URL below) only allows the user to modify the web address, not the path or the version prefix: https://github.com/Bin-Huang/chatbox/releases

There are also other applications and plugins, such as ChatGPT Sidebar or one-api.
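For clients like these, which always send "/v1/...", one conceivable stopgap is a small shim in front of the server that rewrites the prefix before forwarding. The sketch below shows only the rewrite step; `rewrite_v1_to_v3` is a hypothetical helper, not an OVMS feature or option, and the forwarding itself is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_v1_to_v3(url: str) -> str:
    """Hypothetical helper: replace a leading /v1 path segment with /v3,
    leaving host, query string, and all other paths untouched."""
    parts = urlsplit(url)
    path = parts.path
    # Match only the exact "/v1" segment, not e.g. "/v10/...".
    if path == "/v1" or path.startswith("/v1/"):
        path = "/v3" + path[len("/v1"):]
    return urlunsplit(parts._replace(path=path))


print(rewrite_v1_to_v3("http://localhost:8000/v1/chat/completions"))
```

This only illustrates the path mapping that the affected applications would need; whether the server should accept "/v1" directly is the question this issue raises.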