runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

OpenAI API: API errors have wrong HTTP code #57

Open lucasavila00 opened 4 months ago

lucasavila00 commented 4 months ago

Requesting a model that does not exist returns HTTP status 200, even though the JSON body contains an error message.

alpayariyak commented 4 months ago

Are you using the endpoint regularly or through OpenAI compatibility?

alpayariyak commented 3 months ago

@lucasavila00

lucasavila00 commented 3 months ago

OpenAI compatibility:

$ curl -i https://api.runpod.ai/v2/yyyyy/openai/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer xxxxxx" -d '{
"model": "a model that does not exist",
"messages": [
  {
    "role": "user",
    "content": "Why is RunPod the best platform?"
  }
],
"temperature": 0,
"max_tokens": 100
}'

HTTP/2 200 
date: Thu, 21 Mar 2024 01:29:58 GMT
content-type: application/json; charset=utf-8
content-length: 133
cf-cache-status: DYNAMIC
set-cookie: __cflb=zzzzz; SameSite=None; Secure; path=/; expires=Fri, 22-Mar-24 00:29:58 GMT; HttpOnly
server: cloudflare
cf-ray: uuuu-GRU

{"code":404,"message":"The model `a model that does not exist` does not exist.","object":"error","param":null,"type":"NotFoundError"}

Notice the HTTP/2 200 status line, even though the JSON body reports "code":404.
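Until the status code is fixed, a client could work around this by inspecting the body instead of trusting the HTTP status. The following is only a minimal sketch: `effective_status` is a hypothetical helper, not part of worker-vllm or the OpenAI client, and it assumes error payloads always have the shape shown in the response above ("object":"error" plus an integer "code").

```python
import json


def effective_status(http_status: int, body: str) -> int:
    """Return the status a client should act on.

    Assumption: an error payload is a JSON object with "object": "error"
    and an integer "code" field, as in the response pasted in this issue.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Body is not JSON, so fall back to the transport-level status.
        return http_status

    if isinstance(payload, dict) and payload.get("object") == "error":
        code = payload.get("code")
        # Only trust the embedded code if it looks like an HTTP error status.
        if isinstance(code, int) and 400 <= code <= 599:
            return code

    return http_status


# The exact body returned in the transcript above, received with HTTP 200.
body = (
    '{"code":404,"message":"The model `a model that does not exist` '
    'does not exist.","object":"error","param":null,"type":"NotFoundError"}'
)
print(effective_status(200, body))  # prints 404
```

This only hides the symptom on the client side. Libraries that rely on the HTTP status to raise exceptions would still treat the response as a success.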