Closed: ttthy closed this issue 1 month ago
Hi @ttthy!
It seems that sermas-api is not able to reach the Ollama server at 127.0.0.1:11434.
From my tests and analysis, the problem may have two (possibly concurrent) causes:
1- The sermas-api Docker container is not able to properly resolve 127.0.0.1. On macOS, this is solved by replacing the host in the Ollama URL with host.docker.internal:
OLLAMA_URL=http://host.docker.internal:11434
(reference here)
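A quick way to confirm that the container can actually reach Ollama through that hostname (assuming your compose service is named api, as in the reproduction steps below, and that curl is available inside the image):
docker compose exec api curl http://host.docker.internal:11434/api/tags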
2- Ollama only listens on host localhost and IP 127.0.0.1. This is the default behaviour when installing Ollama as a normal service (not a Docker container) following the standard installation procedure. But, as mentioned above, the sermas-api Docker container will not be able to reach either of these two hosts. To change this, you must set the OLLAMA_HOST variable, like this:
launchctl setenv OLLAMA_HOST "0.0.0.0"
Then restart the Ollama service.
(reference here)
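To verify that Ollama is now bound to all interfaces rather than only to loopback, you can check the listening socket on the host (assuming the default port 11434):
lsof -nP -iTCP:11434 -sTCP:LISTEN
It should report *:11434 rather than 127.0.0.1:11434.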
Please let me know if this helps. I will correct the documentation accordingly.
Thanks,
Kanthavel
Hello @ttthy,
I have found another bug that may be causing the problem you describe.
When using the chat, sermas-api will try to use the default chat service provider and model, which is openai/gpt-4o, even if you set LLM_SERVICE=ollama.
I will look more deeply into this, and I am considering reducing and renaming the variables in the .env. Meanwhile, a quick workaround is to set the chat service provider and model explicitly. Please add the following line to your .env:
LLM_SERVICE_CHAT=ollama/phi3
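For reference, the Ollama-related part of the .env could then look like this (only a sketch, using the variables already mentioned in this thread):
OLLAMA_URL=http://host.docker.internal:11434
LLM_SERVICE=ollama
LLM_SERVICE_CHAT=ollama/phi3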
Please let me know the outcome.
Cheers,
Kanthavel
I've followed the tutorial from sermas-eu.github.io to configure all API calls to be forwarded to local Ollama LLMs. However, the toolkit is still not able to call the local models, as shown in the API log:
Here are the steps for reproducing this:
Pull the Ollama models as below:
ollama pull phi3
ollama pull nomic-embed-text
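Confirm that both models are available locally:
ollama list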
Verify the Ollama server is running:
curl http://127.0.0.1:11434/api/generate -d '{"model": "phi3", "prompt": "Why is the sky blue?", "stream": false}'
curl http://127.0.0.1:11434/api/embed -d '{"model": "nomic-embed-text", "input": "Why is the sky blue?"}'
Create and update the file "sermas-toolkit-api/data/api/.env"
Restart the api container with:
docker compose restart api
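To double-check that the restarted container actually picked up the new values, you can inspect its environment (this assumes the .env entries are exposed as environment variables in the api service):
docker compose exec api env | grep -iE 'ollama|llm'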
Update settings.yaml in sermas-toolkit-api/apps/myapp/settings.yaml
Save app
sermas-cli app save /apps/myapp
Try to chat with the app:
alias "sermas-cli=docker compose run --rm -it cli"
sermas-cli app chat [App-ID]
No response; only the input was repeated back.
Got the error below, also shown in the first screenshot:
api-1 | [Nest] 1 - 09/19/2024, 10:04:48 AM ERROR [OllamaEmbeddingProvider] Failed to generate embeddings: fetch failed
api-1 | [Nest] 1 - 09/19/2024, 10:04:49 AM ERROR [LLMProviderService] Provider openai error: 401 You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.
Do you have any idea how to fix this? Are there any missing configuration steps?
This was run on: Apple M2, macOS 14.6.1 (23G93), Docker version 27.2.0, build 3ab4256
git rev-parse HEAD: 8982e71d6e0714a353091d08aea8d73e2220f7b0