Closed: hicotton02 closed this issue 3 months ago
Setting WEB_ENABLE_AUTH=false, if it's safe to do so, is the easiest method. The alternative is to coordinate WEB_TOKENs and pass them as Authorization (Bearer) headers. This isn't a use case I'd intended for the container, but I'd be interested in hearing about your results.
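For the token route, the client just needs to send the shared token with each request. A minimal sketch using Python requests, assuming WEB_TOKEN is set in the container and the WebUI was started with --api (the address and token below are placeholders):

import requests

BASE_URL = "http://192.168.1.36:7860"  # placeholder service address
TOKEN = "your-web-token"               # must match the container's WEB_TOKEN

resp = requests.post(
    f"{BASE_URL}/sdapi/v1/txt2img",    # AUTOMATIC1111 API endpoint
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"prompt": "a photo of a cat", "steps": 20},
)
resp.raise_for_status()
print(resp.json().keys())  # generated images are returned under "images"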
That worked. I am running it internally and do not need web auth.

auto1111-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: automatic1111
  namespace: ollama
  labels:
    app: automatic1111
spec:
  replicas: 1
  selector:
    matchLabels:
      app: automatic1111
  template:
    metadata:
      labels:
        app: automatic1111
    spec:
      containers:
        - name: automatic1111-container
          image: ghcr.io/ai-dock/stable-diffusion-webui:latest
          ports:
            - containerPort: 7860
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: "64Gi"
              cpu: "12"
            requests:
              nvidia.com/gpu: 1
              memory: "64Gi"
              cpu: "12"
          env:
            - name: WEB_ENABLE_AUTH
              value: "false"
            - name: PORT
              value: "7860"
            - name: WEBUI_ARGS
              value: "--api --listen"
          volumeMounts:
            - name: models-volume
              mountPath: /opt/stable-diffusion-webui/models
      volumes:
        - name: models-volume
          persistentVolumeClaim:
            claimName: automatic1111-models-pvc
      nodeSelector:
        ollama-gpu: "true"
auto1111-svc.yaml:
apiVersion: v1
kind: Service
metadata:
  name: automatic1111-service
  namespace: ollama
spec:
  selector:
    app: automatic1111
  ports:
    - protocol: TCP
      port: 7860        # External port you want to use
      targetPort: 7860  # Port on the pod where AUTOMATIC1111 is listening
  type: LoadBalancer
auto1111-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: automatic1111-models-pvc
  namespace: ollama
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi # Adjust based on your needs
I modified the deployment YAML to work with ollama and open-webui.
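As a quick sanity check that the LoadBalancer is serving the API with auth disabled, one of the read-only endpoints can be queried; a small sketch, assuming the service landed on the address used above:

import requests

# Assumed LoadBalancer address from the service manifest; adjust for your cluster.
resp = requests.get("http://192.168.1.36:7860/sdapi/v1/sd-models", timeout=10)
resp.raise_for_status()  # an auth redirect or 401 here means auth is still active
for model in resp.json():
    print(model["title"])  # lists the installed checkpoints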
@hicotton02 Thank you, Keven; I appreciate you taking the time. I'll play with this config when I get a chance and add a documentation section to the base image.
FWIW, if you need it you can apply the as-yet-undocumented SUPERVISOR_NO_AUTOSTART=jupyter,syncthing (any service) to trim the container down if you aren't using the extra parts I need for my single-image cloud deployment.
Edit: I can re-close this as it's not related to the opening issue, but I figured you might have a quick answer.
Thanks for adding that. I am currently having an issue where I get the following error when trying to generate anything:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
A: 1.94 GB, R: 3.74 GB, Sys: 4.1/15.7666 GB (26.0%)

I'm not sure what happened, but is there anything specific about this Docker image that could cause it?
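For context, that error class means one tensor in an operation lives on the CPU while another is on the GPU. A minimal reproduction, assuming PyTorch with a CUDA device available (the names here are illustrative, not from the WebUI code):

import torch

if torch.cuda.is_available():
    weights = torch.randn(4, 3, device="cuda:0")  # tensor on the GPU
    index = torch.tensor([0, 2])                  # created on the CPU by default
    try:
        torch.index_select(weights, 0, index)     # mixed devices -> RuntimeError
    except RuntimeError as err:
        print(err)  # "Expected all tensors to be on the same device..."
    # Moving both operands to the same device resolves it:
    print(torch.index_select(weights, 0, index.to("cuda:0")))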
Not certain, as I've never had the problem, but it's possibly a VRAM issue. I found some similar issues with a potential resolution at https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/2373
I saw this. Does COMMANDLINE_ARGS translate to WEBUI_ARGS, so I can throw those flags into WEBUI_ARGS?
Yes 👍
That didn't resolve it... hmm...
- name: WEBUI_ARGS
  value: "--api --listen --always-batch-cond-uncond --opt-split-attention --device-id=0"

This worked.
Original issue: I am using this Docker image to deploy into a k8s cluster. When I access the service through the LB (192.168.1.36:7860) I get redirected to http://localhost:1111/login, but since I am accessing externally, that doesn't work. I have an ollama implementation and an open-webui implementation in the same k8s cluster that I am trying to connect to. What am I missing?
Here is my deployment YAML (the auto1111-deployment.yaml above). I am setting the Service Portal IP manually in the deployment file, but when the pod starts:
I have also tried manually editing the caddy/share/config files to try to push the config change, to no avail.
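For debugging redirects like this without a browser, it can help to inspect the redirect response itself rather than follow it; a diagnostic sketch, again assuming the LoadBalancer address above:

import requests

resp = requests.get("http://192.168.1.36:7860", allow_redirects=False, timeout=10)
print(resp.status_code)              # a 3xx here means the proxy is redirecting
print(resp.headers.get("Location"))  # shows exactly where clients are being sent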