open-webui / open-webui

Local-AI: Empty model selections (when they shouldn't be) #2909

Closed senpro-ingwersenk closed 5 months ago

senpro-ingwersenk commented 5 months ago

Bug Report

Description

Bug Summary: I configured my LocalAI instance as the OpenAI API endpoint; when I use curl to verify, I can see the models just fine:

curl.exe http://192.168.28.174/localai/v1/models
{"object":"list","data":[{"id":"LocalAI-llama3-8b-function-call-v0.2","object":"model"},{"id":"bert-embeddings","object":"model"},{"id":"llama-3-sauerkrautlm-8b-instruct","object":"model"},{"id":"llama3-8b-instruct","object":"model"},{"id":"llava-1.6-vicuna","object":"model"},{"id":"mirai-nova-llama3-LocalAI-8b-v0.1","object":"model"},{"id":"mistral-7b-instruct-v0.3","object":"model"},{"id":"moondream2","object":"model"},{"id":"openvino-multilingual-e5-base","object":"model"},{"id":"openvino-phi3","object":"model"},{"id":"phi-3-mini-4k-instruct","object":"model"},{"id":"whisper-tiny","object":"model"},{"id":"mmproj-vicuna7b-f16.gguf","object":"model"},{"id":"moondream2-mmproj-f16.gguf","object":"model"}]}

However, pointing Open WebUI at the same URL (minus /models, of course) doesn't result in any models being loaded at all, even though the connection test still reports success.
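
To mimic what the backend presumably requests (a sketch; I'm assuming Open WebUI simply appends /models to the configured base URL and sends whatever key was entered in the UI as a Bearer token; "sk-anything" is a placeholder):

curl.exe -H "Authorization: Bearer sk-anything" http://192.168.28.174/localai/v1/models
# Expected: the same JSON model list as above; LocalAI ignores the key unless API keys are enforced.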

Steps to Reproduce: You may wish to borrow my Caddyfile for proper reproduction:

:80 {
  #handle_path /chroma* {}
  handle_path /localai* {
    reverse_proxy * http://localhost:8080 # Serves LocalAI
  }
  handle /* {
    reverse_proxy * http://localhost:3000 # Serves Open WebUI
  }
}
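
As a sanity check that handle_path strips the /localai prefix as intended, both of the following should return the same JSON (the first run on the Docker host itself):

curl http://localhost:8080/v1/models            # LocalAI directly
curl http://192.168.28.174/localai/v1/models    # via Caddy; handle_path strips /localai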

Launch the Docker containers accordingly. Here is a snippet from my docker-compose.yml:

  localai:
    image: quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-12
    restart: always
    ports: [ "0.0.0.0:8080:8080" ]
    volumes:
      - ./models:/build/models
    environment:
      LOCALAI_LOG_LEVEL: "debug"
      LOCALAI_THREADS: "20"
      LOCALAI_SINGLE_ACTIVE_BACKEND: "true"
      REBUILD: "true"
      BUILD_TYPE: "cublas"
      BUILD_PARALLELISM: "24"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  openwebui:
    #image: ghcr.io/open-webui/open-webui:main
    image: ghcr.io/open-webui/open-webui:ollama
    volumes:
      - ./openwebui:/app/backend/data
      - ./ollama:/root/.ollama
    extra_hosts:
      - "host.docker.internal:host-gateway"
    ports:
      - "0.0.0.0:3000:8080"
    environment:
      OPENAI_API_BASE_URL: http://192.168.28.174/localai/v1
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
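
Since the model list is fetched server-side by the Open WebUI backend, not by the browser, it is worth verifying reachability from inside the container. A sketch, assuming curl is available in the image:

docker compose exec openwebui curl -s http://192.168.28.174/localai/v1/models
# An empty response or a connection error here would explain the empty model list.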

Adjust this and the Caddyfile accordingly, then run. When you open the UI, there will be no available models to choose from (screenshot: the model selector is empty).

Expected Behavior: I expected to find and select models to chat with or run RAG against.

Actual Behavior: The model list is empty, even though the API clearly reports all of the models.

Logs and Screenshots

Browser Console Logs: (screenshot of the browser console)

Docker Container Logs:

openwebui_1  | user f54c42d4-6a26-454c-be87-ce3967387692 disconnected with session ID F2SE_IH1K53atWxoAAAL
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET / HTTP/1.1" 304 Not Modified
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/config HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /ws/socket.io/?EIO=4&transport=polling&t=O_n-lDW HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/auths/ HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | connect  NXQo8FR0wgUyiQ36AAAN
openwebui_1  | user Kevin Ingwersen(f54c42d4-6a26-454c-be87-ce3967387692) connected with session ID NXQo8FR0wgUyiQ36AAAN
openwebui_1  | 1
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     ('192.168.30.195', 0) - "WebSocket /ws/socket.io/?EIO=4&transport=websocket&sid=AHnf-DBGXoLh5XabAAAM" [accepted]
openwebui_1  | INFO:     connection open
openwebui_1  | Models in use: []
openwebui_1  | INFO:     192.168.30.195:0 - "POST /ws/socket.io/?EIO=4&transport=polling&t=O_n-lD_&sid=AHnf-DBGXoLh5XabAAAM HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /ws/socket.io/?EIO=4&transport=polling&t=O_n-lE0&sid=AHnf-DBGXoLh5XabAAAM HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/changelog HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/users/user/settings HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/prompts/ HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/configs/banners HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/models HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/documents/ HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/chats/tags/all HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.ollama.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /ollama/api/version HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/chats/ HTTP/1.1" 200 OK
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/v1/users/user/settings HTTP/1.1" 200 OK

This happens during the connection test:

openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:config:Saving 'OPENAI_API_BASE_URLS' to config.json
openwebui_1  | INFO:     192.168.30.195:0 - "POST /openai/urls/update HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "POST /openai/keys/update HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
localai_1    | 9:16AM INF Success ip=172.18.0.1 latency="309.172µs" method=GET status=200 url=/v1/models
openwebui_1  | INFO:     192.168.30.195:0 - "GET /openai/models/0 HTTP/1.1" 200 OK
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:apps.openai.main:get_all_models()
openwebui_1  | INFO:     192.168.30.195:0 - "GET /api/models HTTP/1.1" 200 OK
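
The GET /openai/models/0 request above is Open WebUI's own proxy for connection 0, so inspecting its response body shows what the backend actually parsed. A sketch; <openwebui-jwt> is a placeholder for the session token from the browser's localStorage:

curl -H "Authorization: Bearer <openwebui-jwt>" http://192.168.28.174/openai/models/0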

Screenshots (if applicable): See above.

Installation Method

Docker Compose; the service configurations are above.

Additional Information

I am running this as a temporary setup on an old IBM server with 2x Intel Xeon CPUs, 160 GB RAM, and an RTX 3060. Because of how the network is built and restricted, I would like to use Caddy to simplify access to the services. Given the abundance of RAM, it would be neat to take advantage of it at some point, which is why I use both LocalAI and Ollama: let one manage the GPU and the other the CPU. That is the plan, at least.

Note

If the bug report is incomplete or does not follow the provided instructions, it may not be addressed. Please ensure that you have followed the steps outlined in the README.md and troubleshooting.md documents, and provide all necessary information for us to reproduce and address the issue. Thank you!

senpro-ingwersenk commented 5 months ago

I also tried the non-Ollama image (the one without the embedded Ollama); same result, no models show up in the list.