samos123 opened this issue 1 month ago
Can you get the benchmark to log HTTP requests?
I got the request info, and it turns out that we crap out when the URL is:
DEBUG:aiohttp.client:Starting request <TraceRequestStartParams(method='POST', url=URL('http://localhost:8000/openai//v1/completions'), headers=<CIMultiDict('Authorization': 'Bearer None')>)>
Notice the extra slash after `openai`.
I will see if I can reproduce in a test case.
I do think we need to fix this, btw. We should simply strip the extra `/` in the request URL so others don't hit similar issues. It's very easy to mess up, because some frameworks require setting the extra `/` at the end.
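Something along these lines is what I mean — a rough sketch, assuming the proxy is a plain Go net/http server (the route and handler here are made up for illustration):

```go
package main

import (
	"log"
	"net/http"
	"regexp"
)

// multiSlash matches runs of two or more slashes in a path.
var multiSlash = regexp.MustCompile(`/{2,}`)

// stripExtraSlashes is a hypothetical middleware: it collapses
// duplicate slashes in the request path before routing, so a client
// configured with a trailing "/" on the base URL never triggers the
// mux's redirect in the first place.
func stripExtraSlashes(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.URL.Path = multiSlash.ReplaceAllString(r.URL.Path, "/")
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/openai/v1/completions", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok\n"))
	})
	// POST /openai//v1/completions now reaches the handler directly
	// instead of being redirected.
	log.Fatal(http.ListenAndServe(":8000", stripExtraSlashes(mux)))
}
```

Rewriting the path before the mux sees it means the implicit redirect never fires, so client redirect behavior stops mattering entirely.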
I can reproduce with a simple curl command as well:
curl -v http://localhost:8000/openai//v1/completions \
-H "Content-Type: application/json" \
-d '{"model": "qwen2-500m-cpu", "prompt": "Who was the first president of the United States?", "max_tokens": 40}'
* Host localhost:8000 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8000...
* Immediate connect fail for ::1: Network is unreachable
* Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000
> POST /openai//v1/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/8.9.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 108
>
* upload completely sent off: 108 bytes
< HTTP/1.1 301 Moved Permanently
< Location: /openai/v1/completions
< Date: Sun, 29 Sep 2024 21:52:13 GMT
< Content-Length: 0
<
* Connection #0 to host localhost left intact
The error from above indicates a 400, but curl is showing a 301.

Using `-L` with curl doesn't work either and returns the 400 error. I will rewrite the integration test to use `-L` as well. Good catch, all!
Full output:
curl -v -L http://localhost:8000/openai//v1/completions \
-H "Content-Type: application/json" \
-d '{"model": "qwen2-500m-cpu", "prompt": "Who was the first president of the United States?", "max_tokens": 40}'
* Host localhost:8000 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8000...
* Immediate connect fail for ::1: Network is unreachable
* Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000
> POST /openai//v1/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/8.9.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 108
>
* upload completely sent off: 108 bytes
< HTTP/1.1 301 Moved Permanently
* Need to rewind upload for next request
< Location: /openai/v1/completions
< Date: Mon, 30 Sep 2024 13:49:10 GMT
< Content-Length: 0
* Ignoring the response-body
<
* Connection #0 to host localhost left intact
* Issue another request to this URL: 'http://localhost:8000/openai/v1/completions'
* Switch from POST to GET
* Found bundle for host: 0x61f011443460 [serially]
* Can not multiplex, even if we wanted to
* Re-using existing connection with host localhost
> GET /openai/v1/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/8.9.1
> Accept: */*
> Content-Type: application/json
>
* Request completely sent off
< HTTP/1.1 400 Bad Request
< X-Proxy: lingo
< Date: Mon, 30 Sep 2024 13:49:10 GMT
< Content-Length: 80
< Content-Type: text/plain; charset=utf-8
<
{"error":"unable to parse model: unmarshal json: unexpected end of JSON input"}
The root cause seems to be that a POST gets automatically turned into a GET when following a 301: https://datatracker.ietf.org/doc/html/rfc7231#section-6.4.2
We should use an HTTP 307 or 308 to keep it as a POST request.
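If we'd rather keep a redirect than rewrite the path silently, something like this could work — a sketch only, not tested against the actual proxy code (`redirectClean` is a made-up name):

```go
package proxy

import (
	"net/http"
	"path"
)

// redirectClean answers requests for an unclean path (e.g. one
// containing "//") with a 308 Permanent Redirect instead of letting
// http.ServeMux issue its implicit 301. A 308 tells clients to repeat
// the request with the same method and body, so the POST survives.
// Caveat: path.Clean also drops trailing slashes, which may or may
// not be desired here.
func redirectClean(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if cleaned := path.Clean(r.URL.Path); cleaned != r.URL.Path {
			http.Redirect(w, r, cleaned, http.StatusPermanentRedirect) // 308
			return
		}
		next.ServeHTTP(w, r)
	})
}
```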
References:
I think the redirect is done in Go's http.ServeMux.
I see only 301 redirects there. I wonder if the benchmark suite can be fixed instead?
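For reference, the implicit 301 is easy to reproduce with a bare http.ServeMux and nothing else — a minimal sketch:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// http.ServeMux cleans the request path before matching, and answers
// any request for an unclean path (such as one containing "//") with
// a 301 to the cleaned path, before any handler runs.
func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/openai/v1/completions", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "handler reached")
	})
	// curl -i -X POST http://localhost:8000/openai//v1/completions
	// -> HTTP/1.1 301 Moved Permanently
	//    Location: /openai/v1/completions
	log.Fatal(http.ListenAndServe(":8000", mux))
}
```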
It's an issue with curl as well. The weird thing is that my integration test doesn't reproduce it, but I can very much reproduce the 400 error in my local env using curl.
It seems all clients behave this way when receiving a 301 on a POST request:
* Issue another request to this URL: 'http://localhost:8000/openai/v1/completions'
* Switch from POST to GET
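For what it's worth, Go's own http.Client behaves the same way: it rewrites POST to GET when following a 301 (or 302/303) and only keeps the method for 307/308. A quick sketch, assuming a local server like the repro above:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	client := &http.Client{
		// CheckRedirect sees the follow-up request: after a 301 on a
		// POST, its method has already been rewritten to GET.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			fmt.Printf("following redirect as: %s %s\n", req.Method, req.URL)
			return nil
		},
	}
	resp, err := client.Post(
		"http://localhost:8000/openai//v1/completions",
		"application/json",
		strings.NewReader(`{"model": "qwen2-500m-cpu", "prompt": "hi"}`),
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("final status:", resp.Status)
}
```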
I was able to reproduce in automated testing as well: https://github.com/substratusai/kubeai/actions/runs/11127386385/job/30919572183?pr=259#step:6:593
KubeAI logs:
Request JSON:
Note I'm encountering this while running the vLLM benchmark suite: