Closed · redno2 closed this issue 2 months ago
I ran into a similar issue last week when working on a Minikube cluster without any internet access. The workaround that fixed it for me was to set `extraEnvVars.RAG_EMBEDDING_MODEL_AUTO_UPDATE` to `False`.
It's worth noting that according to the docs, that value should be the default in Open WebUI: https://docs.openwebui.com/getting-started/env-configuration#rag_embedding_model_auto_update
So we need to dig into why the pod isn't ending up with that value. Let me know if that fix works for you too, and we can look at adding an input value for offline installations that automatically sets that, or figuring out why/ where that value is being overwritten.
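For reference, the workaround can be expressed through the chart's `extraEnvVars` values. This is a minimal sketch, assuming the chart passes these entries straight through to the Deployment's container env:

```yaml
extraEnvVars:
  # Prevent Open WebUI from trying to (re-)download the embedding model at startup
  - name: RAG_EMBEDDING_MODEL_AUTO_UPDATE
    value: "False"
```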
I've already tried setting this variable in the deployment, but it doesn't change the behavior on my side. Tested with the latest, dev, and fixed tag versions.
My `values-test.yaml`:
```yaml
open-webui:
  nameOverride: ""
  ollama:
    # -- Automatically install Ollama Helm chart from https://otwld.github.io/ollama-helm/. Use [Helm Values](https://github.com/otwld/ollama-helm/#helm-values) to configure
    enabled: true
    # -- If enabling embedded Ollama, update fullnameOverride to your desired Ollama name value, or else it will use the default ollama.name value from the Ollama chart
    #fullnameOverride: "open-webui-ollama"
    # -- Example Ollama configuration with nvidia GPU enabled, automatically downloading a model, and deploying a PVC for model persistence
    # ollama:
    #   gpu:
    #     enabled: true
    #     type: 'nvidia'
    #     number: 1
    #   models:
    #     - llama3
    # runtimeClassName: nvidia
    # persistentVolume:
    #   enabled: true
    image:
      repository: repo.local/upstream/docker.io/ollama/ollama
      tag: '0.1.43'
  pipelines:
    # -- Automatically install Pipelines chart to extend Open WebUI functionality using Pipelines: https://github.com/open-webui/pipelines
    enabled: true
    extraEnvVars:
      # -- This is a default password that can and should be updated on your production deployment, and should be stored in a K8s secret
      - name: PIPELINES_API_KEY
        value: "0p3n-w3bu!"
        # valueFrom:
        #   secretKeyRef:
        #     name: pipelines-api-key
        #     key: api-key
    image:
      repository: repo.local/upstream/ghcr.io/open-webui/pipelines
  # -- A list of Ollama API endpoints. These can be added in lieu of automatically installing the Ollama Helm chart, or in addition to it.
  ollamaUrls: []
  # -- Value of cluster domain
  clusterDomain: localhost.localdomain
  annotations: {}
  podAnnotations: {}
  replicaCount: 1
  # -- Open WebUI image tags can be found here: https://github.com/open-webui/open-webui/pkgs/container/open-webui
  image:
    repository: repo.local/upstream/ghcr.io/open-webui/open-webui
    tag: "0.3.4"
    # tag: "dev"
    pullPolicy: Always
  resources: {}
  ingress:
    enabled: true
    class: ""
    # -- Use appropriate annotations for your Ingress controller, e.g., for NGINX:
    # nginx.ingress.kubernetes.io/rewrite-target: /
    annotations: {}
    host: "open-ui.localhost.localdomain"
    tls: false
    existingSecret: ""
  persistence:
    enabled: true
    size: 2Gi
    # -- Use existingClaim if you want to re-use an existing Open WebUI PVC instead of creating a new one
    existingClaim: ""
    # -- If using multiple replicas, you must update accessModes to ReadWriteMany
    accessModes:
      - ReadWriteOnce
    storageClass: ""
    selector: {}
    annotations: {}
  # -- Node labels for pod assignment.
  nodeSelector: {}
  # -- Tolerations for pod assignment
  tolerations: []
  # -- Affinity for pod assignment
  affinity: {}
  # -- Service values to expose Open WebUI pods to cluster
  service:
    type: ClusterIP
    annotations: {}
    port: 80
    containerPort: 8080
    nodePort: ""
    labels: {}
    loadBalancerClass: ""
  # -- OpenAI base API URL to use. Defaults to the Pipelines service endpoint when Pipelines are enabled, and "https://api.openai.com/v1" if Pipelines are not enabled and this value is blank
  openaiBaseApiUrl: ""
  # -- Additional environment variables on the output Deployment definition. Most up-to-date environment variables can be found here: https://docs.openwebui.com/getting-started/env-configuration/
  extraEnvVars:
    # - name: OPENAI_API_KEY
    #   valueFrom:
    #     secretKeyRef:
    #       name: openai-api-key
    #       key: api-key
    # - name: OLLAMA_DEBUG
    #   value: "1"
    - name: RAG_EMBEDDING_MODEL_AUTO_UPDATE
      value: "False"
    - name: LITELLM_LOCAL_MODEL_COST_MAP
      value: "True"
    # - name: RAG_EMBEDDING_MODEL
    #   value: "/app/backend/data/cache/embedding/models/models--sentence-transformers--all-MiniLM-L6-v2/snapshots/e4ce9877abf3edfe10b0d82785e83bdcb973e22e"

#ollama:
#  image:
#    repository: repo.local/upstream/docker.io/ollama/ollama
#    tag: '0.1.41'
#
#pipelines:
#  image:
#    repository: repo.local/upstream/ghcr.io/open-webui/pipelines
```
The `get all` status:

```console
$ kc get all -A
NAMESPACE            NAME                                                   READY   STATUS             RESTARTS      AGE
ingress-nginx        pod/ingress-nginx-admission-create-xbvqj               0/1     Completed          0             5m30s
ingress-nginx        pod/ingress-nginx-admission-patch-wq9j6                0/1     Completed          0             5m30s
ingress-nginx        pod/ingress-nginx-controller-774bbd759-fksfd           1/1     Running            0             5m30s
kube-system          pod/coredns-5dd5756b68-24jff                           1/1     Running            0             5m33s
kube-system          pod/coredns-5dd5756b68-b46zf                           1/1     Running            0             5m33s
kube-system          pod/etcd-open-webui-control-plane                      1/1     Running            0             5m46s
kube-system          pod/kindnet-m2sn7                                      1/1     Running            0             5m34s
kube-system          pod/kube-apiserver-open-webui-control-plane            1/1     Running            0             5m48s
kube-system          pod/kube-controller-manager-open-webui-control-plane   1/1     Running            0             5m46s
kube-system          pod/kube-proxy-qr2tx                                   1/1     Running            0             5m34s
kube-system          pod/kube-scheduler-open-webui-control-plane            1/1     Running            0             5m46s
local-path-storage   pod/local-path-provisioner-7577fdbbfb-ptnzv            1/1     Running            0             5m33s
qrs-playground       pod/open-webui-568bd98cdd-67cqp                        0/1     CrashLoopBackOff   4 (19s ago)   3m15s
qrs-playground       pod/open-webui-ollama-54679c7d95-j5w9s                 1/1     Running            0             3m15s
qrs-playground       pod/open-webui-pipelines-58ff76867f-5w4qs              1/1     Running            0             3m15s

NAMESPACE        NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
default          service/kubernetes                           ClusterIP   10.96.0.1       <none>        443/TCP                      5m48s
ingress-nginx    service/ingress-nginx-controller             NodePort    10.96.244.111   <none>        80:30186/TCP,443:31624/TCP   5m30s
ingress-nginx    service/ingress-nginx-controller-admission   ClusterIP   10.96.18.231    <none>        443/TCP                      5m30s
kube-system      service/kube-dns                             ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP       5m46s
qrs-playground   service/open-webui                           ClusterIP   10.96.73.175    <none>        80/TCP                       3m15s
qrs-playground   service/open-webui-ollama                    ClusterIP   10.96.8.214     <none>        11434/TCP                    3m15s
qrs-playground   service/open-webui-pipelines                 ClusterIP   10.96.59.41     <none>        9099/TCP                     3m15s

NAMESPACE     NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/kindnet      1         1         1       1            1           kubernetes.io/os=linux   5m45s
kube-system   daemonset.apps/kube-proxy   1         1         1       1            1           kubernetes.io/os=linux   5m46s

NAMESPACE            NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
ingress-nginx        deployment.apps/ingress-nginx-controller   1/1     1            1           5m30s
kube-system          deployment.apps/coredns                    2/2     2            2           5m46s
local-path-storage   deployment.apps/local-path-provisioner     1/1     1            1           5m45s
qrs-playground       deployment.apps/open-webui                 0/1     1            0           3m15s
qrs-playground       deployment.apps/open-webui-ollama          1/1     1            1           3m15s
qrs-playground       deployment.apps/open-webui-pipelines       1/1     1            1           3m15s

NAMESPACE            NAME                                                 DESIRED   CURRENT   READY   AGE
ingress-nginx        replicaset.apps/ingress-nginx-controller-774bbd759   1         1         1       5m30s
kube-system          replicaset.apps/coredns-5dd5756b68                   2         2         2       5m33s
local-path-storage   replicaset.apps/local-path-provisioner-7577fdbbfb    1         1         1       5m33s
qrs-playground       replicaset.apps/open-webui-568bd98cdd                1         1         0       3m15s
qrs-playground       replicaset.apps/open-webui-ollama-54679c7d95         1         1         1       3m15s
qrs-playground       replicaset.apps/open-webui-pipelines-58ff76867f      1         1         1       3m15s

NAMESPACE       NAME                                       COMPLETIONS   DURATION   AGE
ingress-nginx   job.batch/ingress-nginx-admission-create   1/1           5s         5m30s
ingress-nginx   job.batch/ingress-nginx-admission-patch    1/1           6s         5m30s
```
The variable is set correctly in the pod, but the behavior doesn't change:

```console
$ kc exec -ti -n qrs-playground open-webui-568bd98cdd-67cqp -- sh -c "env" | grep RAG
RAG_EMBEDDING_MODEL_AUTO_UPDATE=False
```
Same with the latest version, 3.0.4.
I found a solution: in my case the models--sentence-transformers--all-MiniLM-L6-v2 model could not be downloaded for network reasons. Adding `HF_ENDPOINT` with the value `"https://hf-mirror.com"` to `extraEnvVars` fixed it. @redno2
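As a sketch, that mirror workaround would look like this in the chart values (assuming, as with the other env vars in this thread, that `extraEnvVars` entries land directly in the container env; `HF_ENDPOINT` is the Hugging Face Hub endpoint override honored by `huggingface_hub`):

```yaml
extraEnvVars:
  # Point huggingface_hub at a reachable mirror instead of huggingface.co
  - name: HF_ENDPOINT
    value: "https://hf-mirror.com"
```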
Not trying to ignore this issue, I have a lot going on right now. As soon as I can find the time to dig deeper into this, I'll push a fix.
> I found a solution: in my case the models--sentence-transformers--all-MiniLM-L6-v2 model could not be downloaded for network reasons. Adding `HF_ENDPOINT` with the value `"https://hf-mirror.com"` to `extraEnvVars` fixed it. @redno2
Thanks, but it won't solve the problem because I'm in offline mode.
Any suggestions on this? I'm facing the same issue with an offline install.
```console
$ k logs -f open-webui-0 -n open-webui
Loading WEBUI_SECRET_KEY from file, not provided as an environment variable.
Generating WEBUI_SECRET_KEY
Loading WEBUI_SECRET_KEY from .webui_secret_key
/app
USER_AGENT environment variable not set, consider setting it to identify your requests.
Cannot determine model snapshot path: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
Traceback (most recent call last):
  File "/app/backend/apps/rag/utils.py", line 353, in get_model_path
    model_repo_path = snapshot_download(**snapshot_kwargs)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line 220, in snapshot_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
No sentence-transformers model found with name sentence-transformers/all-MiniLM-L6-v2. Creating a new one with mean pooling.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 196, in _new_conn
    sock = connection.create_connection(
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 490, in _make_request
    raise new_e
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 466, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 615, in connect
    self.sock = sock = self._new_conn()
  File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 211, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f411159f310>: Failed to establish a new connection: [Errno 101] Network is unreachable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f411159f310>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 395, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 66, in send
    return super().send(request, *args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f411159f310>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 9f548bfb-7d60-465b-8f6e-317111ce21d8)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/transformers/utils/hub.py", line 402, in cached_file
    resolved_file = hf_hub_download(
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1826, in _raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/uvicorn", line 8, in
```
I'm working with the maintainer to see if we can address this in Open WebUI itself, since the issue originates from the model download attempted at Open WebUI startup. I'll keep you all updated, thanks for your patience.
I got some good input from another maintainer:
The default embedding model is already preloaded, it just doesn't quite work with how Kubernetes and Docker volume mount semantics differ. On Docker, if an empty volume mounts a directory that already exists in the image, it makes a copy of the contents of the image into the mount, which is where the cache is located. On Kubernetes, this doesn't happen, so mounting /app/backend/data wipes anything that's precached.
Upon researching this further, it seems the most viable option is to create an init container that copies the existing contents of /app/backend/data (the location of the Kubernetes volume) from the Docker image to the resulting volume. This ensures the default model is present, so the HuggingFace download is never attempted.
I'm going to spend some time on this now, will see if I can get a working solution.
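The init-container approach described above could be sketched roughly like this in the Deployment spec; the container name, image tag, and staging mount path are illustrative assumptions, not the chart's actual template:

```yaml
initContainers:
  # Seed the (initially empty) PVC with the data directory baked into the image,
  # so the pre-cached embedding model is present and no HuggingFace download is tried.
  - name: copy-app-data
    image: ghcr.io/open-webui/open-webui:0.3.4
    command: ["sh", "-c", "cp -R -n /app/backend/data/* /tmp/app-data/"]
    volumeMounts:
      - name: data
        mountPath: /tmp/app-data
```

Mounting the volume at a staging path (rather than at `/app/backend/data` itself) is what lets the init container still see the image's original contents; the main container then mounts the same volume at `/app/backend/data`.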
I think I got this fixed here: https://github.com/open-webui/helm-charts/pull/53
I turned my internet off on my test machine to ensure that I replicated the offline environment that's been reported for this issue, and was able to get Open WebUI to start without the HuggingFace errors since the model is now correctly copied into the pod using the init container.
@redno2 @srik65 please try the offline install again with Chart v3.1.0 and let me know if that solves the issue for you, or what new issues you run into.
Yes, it works (tested with 3.1.1)! Thanks for the fix :+1:
Bug Report
Description
Bug Summary: Tried to deploy with both the Helm chart and the Kubernetes manifests; both fail in an offline setup with:
"Cannot determine model snapshot path: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input."
Tried these without success: https://github.com/open-webui/open-webui/discussions/918 https://github.com/open-webui/open-webui/pull/1419
Steps to Reproduce: Deploy in offline mode
Expected Behavior: Work without internet connection
Actual Behavior: Pod CrashLoopBackOff
Environment
Open WebUI Version: 0.3.4 and dev
Ollama (if applicable): 0.1.43
Operating System: Ubuntu 22.04
Reproduction Details
Confirmation:
Logs and Screenshots
Docker Container Logs:
Installation Method
Both methods here: https://docs.openwebui.com/getting-started/installation
Installed to a local kind cluster
Additional Information