bionic-gpt / bionic-gpt

BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality
https://bionic-gpt.com
Apache License 2.0

Issues with API #500

Open pkaczynski opened 1 month ago

pkaczynski commented 1 month ago

Describe your issue

There are some issues with the API after the update. I am using version 1.7.22, and calling the API (which in turn calls the LLM) gives me the following error:

ERROR web_server::llm_reverse_proxy::sse_chat_enricher: InvalidContentType("application/json", Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("gaisandbox.spl.sas.com")), port: None, path: "/llm/v1/chat/completions", query: None, fragment: None }, status: 200, headers: {"content-length": "632", "content-type": "application/json", "date": "Mon, 27 May 2024 17:27:45 GMT"} })

It seems like a JSON response is not acceptable? The content is actually generated by the LLM, as you can see from the non-zero content length. Also, it seems that the new Keycloak replacement does not allow API calls, as it returns the login screen.

Describe your setup/startup scripts

Generate

Steps to reproduce

  1. Expose the API with a key
  2. Execute a POST to 'http://bionic/v1/chat/completions' with the appropriate key; example below in Python:
    import json
    import requests

    # api_url and api_key correspond to the API key exposed in step 1
    query = {
        'model': 'llama2',
        'messages': [{
            'role': 'user',
            'content': 'Why is the sky blue?'
        }],
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer " + api_key
    }
    response = requests.post(
        api_url, data=json.dumps(query), headers=headers
    )
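
To see what actually comes back, a quick check like the one below can be used (a minimal sketch, assuming the response object from the snippet above):

    # Quick inspection of the proxy's reply (sketch)
    print(response.status_code)
    print(response.headers.get("content-type"))
    print(response.text or "<empty body>")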

What was the expected result?

Not an empty response

Any additional information

No response

9876691 commented 1 month ago

@pkaczynski Thanks, we're taking a look at this.

9876691 commented 1 month ago

The latest docker compose now allows API calls.

Can you let us know if that's working for you again?

Thanks

pkaczynski commented 1 month ago

Nope. I made this v1 change earlier myself, but it does not help. And I believe it should be /v1*.

pkaczynski commented 1 month ago

Any hints on this? Please note the error is from the proxy. As far as I debugged, the LLM call completes successfully (there are logs in my ollama), but then the response preparation fails on the bionic side. Am I preparing the request in a wrong way?

9876691 commented 1 month ago

I've updated the docker compose to use /v1*

I created a key and used the following to test it

curl -v -N http://localhost:3000/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer API_KEY" \
  -d '{
     "model": "llama3",
     "messages": [{"role": "user", "content": "Write a small program in Rust"}],
     "temperature": 0.7,
     "stream": true
   }'

That worked for me with the model that comes out of the box in the docker compose.

Are you using a different model? Or can you try the above, and perhaps we can start to get more clarity on the issue.
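
If it is easier to test from Python, the following is roughly the same request using requests; a sketch that assumes a local deployment on port 3000 and that API_KEY holds the key you created:

import requests

# Rough Python equivalent of the curl test above (sketch; API_KEY is your key)
response = requests.post(
    "http://localhost:3000/v1/chat/completions",
    headers={
        "content-type": "application/json",
        "authorization": "Bearer " + API_KEY,
    },
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Write a small program in Rust"}],
        "temperature": 0.7,
        "stream": True,
    },
    stream=True,
)
for line in response.iter_lines():
    if line:
        print(line.decode())  # raw SSE lines ("data: {...}" chunks)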

9876691 commented 1 month ago

Re-open if issue persists.

pkaczynski commented 3 weeks ago

I believe I have found the issue. The LLM proxy server from the stack is not "compliant" with responses to requests that do not set "stream": true, i.e. it cannot handle non-streaming JSON responses.

That is why the error I mentioned in the description occurs:

InvalidContentType("application/json", Response { url: Url { scheme: "http", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("gaisandbox.spl.sas.com")), port: None, path: "/llm/v1/chat/completions", query: None, fragment: None }, status: 200, headers: {"content-length": "632", "content-type": "application/json", "date": "Mon, 27 May 2024 17:27:45 GMT"} })

Note that if I change the request to "stream": true, it works, as the content type of the answer is then text/event-stream.
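
As a workaround for now, I just ask for a streamed response and read the SSE chunks; a minimal sketch, assuming the same api_url and api_key as in my original snippet:

import json
import requests

# Workaround sketch: request a streamed response so the proxy sees text/event-stream
query = {
    'model': 'llama2',
    'messages': [{'role': 'user', 'content': 'Why is the sky blue?'}],
    'stream': True,  # without this the upstream answer is application/json and the proxy rejects it
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + api_key,
}
response = requests.post(api_url, data=json.dumps(query), headers=headers, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode())  # "data: {...}" chunks from the event stream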

But still, please reopen and fix if possible.