pkaczynski opened this issue 1 month ago
@pkaczynski Thanks, we're taking a look at this.
The latest docker compose now allows API calls.
Can you let us know if that's working for you again?
Thanks
Nope. I made this v1 change earlier myself, but it does not help. And I believe it should be /v1*.
Any hints on this? Please note that the error comes from the proxy. As far as I debugged, the LLM is queried successfully (there are logs in my Ollama), but then the response preparation fails on the Bionic side. Am I preparing the request in the wrong way?
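For reference, this is how the upstream side can be confirmed (a sketch assuming the Ollama container runs in the same docker compose under the service name ollama; adjust the name to your setup):

```bash
# Tail the Ollama container's logs while sending a request; a logged
# completion call confirms the LLM itself is reached and responds,
# which narrows the failure down to the proxy's response handling.
docker compose logs -f ollama
```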
I've updated the docker compose to use /v1*.
I created a key and used the following to test it:
```bash
curl -v -N http://localhost:3000/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer API_KEY" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Write a small program in Rust"}],
    "temperature": 0.7,
    "stream": true
  }'
```
That worked for me with the model that comes out of the box in the docker compose.
Are you using a different model, or can you try the above so we can start to get more clarity on the issue?
Re-open if the issue persists.
I believe I have found the issue. The LLM proxy server from the stack cannot handle responses to requests that do not set "stream": true.
That is why the error I mentioned in the description occurs:
```
InvalidContentType("application/json", Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("gaisandbox.spl.sas.com")), port: None, path: "/llm/v1/chat/completions", query: None, fragment: None }, status: 200, headers: {"content-length": "632", "content-type": "application/json", "date": "Mon, 27 May 2024 17:27:45 GMT"} })
```
Note that if I change it to "stream": true, it works, as the content type of the answer is text/event-stream.
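A minimal sketch of the failing case, assuming the same local endpoint and key as the test above; it is identical to the working call except that "stream" is false:

```bash
# Fails with InvalidContentType("application/json", ...): the upstream
# answers a non-streaming request with a plain application/json body,
# while the proxy's SSE enricher only accepts text/event-stream.
curl -v http://localhost:3000/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer API_KEY" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Write a small program in Rust"}],
    "stream": false
  }'
```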
But still, please reopen and fix if possible.
Describe your issue
There are some issues with the API after the update. I am using version 1.7.22, and calling the API, which in turn calls the LLM, gets me the following error:
```
ERROR web_server::llm_reverse_proxy::sse_chat_enricher: InvalidContentType("application/json", Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("gaisandbox.spl.sas.com")), port: None, path: "/llm/v1/chat/completions", query: None, fragment: None }, status: 200, headers: {"content-length": "632", "content-type": "application/json", "date": "Mon, 27 May 2024 17:27:45 GMT"} })
```
It seems like a JSON response is not acceptable? And the content is generated by the LLM, as you can see the content length is non-zero. Also, it seems that the new Keycloak replacement does not allow API calls, as it returns the login screen.
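A call like the one below makes the second symptom visible (a sketch assuming the default port from the docker compose; API_KEY is a placeholder for a key created in the UI):

```bash
# With -i the response headers are printed: a JSON API response means
# the key was accepted, while content-type: text/html and a login page
# body mean the API call was redirected to the web UI's login screen.
curl -i http://localhost:3000/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer API_KEY" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "ping"}], "stream": true}'
```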
Describe your setup/startup scripts
Generate
Steps to reproduce
What was the expected result?
Not an empty response
Any additional information
No response