ai-dock / comfyui

ComfyUI docker images for use in GPU cloud and local environments. Includes AI-Dock base for authentication and improved user experience.

Comfyui keeps restarting on prompt #58

Closed TechanisYT closed 3 months ago

TechanisYT commented 8 months ago

I just freshly installed ComfyUI on a CPU-only Ubuntu Server VM within Proxmox. I had to fiddle around to get it to start with only the CPU, but I can't generate any images. If I start the prompt, the server uses all its RAM and CPU cores and does not respond for about 15 minutes. The docker-compose file I use:

version: "3.8"
# Compose file build variables set in .env
services:
  supervisor:
    restart: always
    platform: linux/amd64
    build:
      context: ./build
      args:
        IMAGE_BASE: ${IMAGE_BASE:-ghcr.io/ai-dock/jupyter-pytorch:2.2.0-py3.10-cpu-22.04}
      #tags:
        #- "ghcr.io/ai-dock/comfyui:${IMAGE_TAG:-jupyter-pytorch-2.2.0-py3.10-cpu-22.04}"
        #- "IMAGE_TAG:-jupyter-pytorch-2.2.0-py3.10-cpu-22.04"
    image: ghcr.io/ai-dock/comfyui:${IMAGE_TAG:-jupyter-pytorch-2.2.0-py3.10-cpu-22.04}
    ## For Nvidia GPU's - You probably want to uncomment this
    #deploy:
    #  resources:
    #    reservations:
    #      devices:
    #        - driver: nvidia
    #          count: all
    #          capabilities: [gpu]
    devices:
      - "/dev/dri:/dev/dri"
      ## For AMD GPU
      #- "/dev/kfd:/dev/kfd"
    volumes:
      ## Workspace
      - ./workspace:${WORKSPACE:-/workspace/}:rshared
      # You can share /workspace/storage with other non-ComfyUI containers. See README
      #- /path/to/common_storage:${WORKSPACE:-/workspace/}storage/:rshared
      # Will echo to root-owned authorized_keys file;
      # Avoids changing local file owner
      - ./config/authorized_keys:/root/.ssh/authorized_keys_mount
      - ./config/provisioning/default.sh:/opt/ai-dock/bin/provisioning.sh
    ports:
        # SSH available on host machine port 2222 to avoid conflict. Change to suit
        - ${SSH_PORT_HOST:-2222}:22
        # Caddy port for service portal
        - ${SERVICEPORTAL_PORT_HOST:-1111}:${SERVICEPORTAL_PORT_HOST:-1111}
        # ComfyUI web interface
        - ${COMFYUI_PORT_HOST:-8188}:${COMFYUI_PORT_HOST:-8188}
        # Jupyter server
        - ${JUPYTER_PORT_HOST:-8888}:${JUPYTER_PORT_HOST:-8888}
        # Rclone webserver for interactive configuration
        - ${RCLONE_PORT_HOST:-53682}:${RCLONE_PORT_HOST:-53682}
    environment:
        # Don't enclose values in quotes
        - DIRECT_ADDRESS=${DIRECT_ADDRESS:-192.168.1.101}
        - DIRECT_ADDRESS_GET_WAN=${DIRECT_ADDRESS_GET_WAN:-false}
        - WORKSPACE=${WORKSPACE:-/workspace}
        - WORKSPACE_SYNC=${WORKSPACE_SYNC:-false}
        #- CF_TUNNEL_TOKEN=${CF_TUNNEL_TOKEN:-}
        - CF_QUICK_TUNNELS=${CF_QUICK_TUNNELS:-false}
        - WEB_ENABLE_AUTH=${WEB_ENABLE_AUTH:-true}
        - WEB_USER=${WEB_USER:-user}
        - WEB_PASSWORD=${WEB_PASSWORD:-password}
        - SSH_PORT_HOST=${SSH_PORT_HOST:-2222}
        - SERVICEPORTAL_PORT_HOST=${SERVICEPORTAL_PORT_HOST:-1111}
        - SERVICEPORTAL_METRICS_PORT=${SERVICEPORTAL_METRICS_PORT:-21111}
        - COMFYUI_FLAGS=${COMFYUI_FLAGS:-}
        - COMFYUI_PORT_HOST=${COMFYUI_PORT_HOST:-8188}
        - COMFYUI_METRICS_PORT=${COMFYUI_METRICS_PORT:-28188}
        - JUPYTER_PORT_HOST=${JUPYTER_PORT_HOST:-8888}
        - JUPYTER_METRICS_PORT=${JUPYTER_METRICS_PORT:-28888}
        - SERVERLESS=${SERVERLESS:-false}

The Dockerfile in the build folder:

# For build automation - Allows building from any ai-dock base image
# Use a *cuda*base* image as default because pytorch brings the libs
ARG IMAGE_BASE="ghcr.io/ai-dock/jupyter-pytorch:2.2.0-py3.10-cpu-22.04"
FROM ${IMAGE_BASE}

LABEL org.opencontainers.image.source https://github.com/ai-dock/comfyui
LABEL org.opencontainers.image.description "ComfyUI Stable Diffusion backend and GUI"
LABEL maintainer="Rob Ballantyne <rob@dynamedia.uk>"

ENV IMAGE_SLUG="comfyui"
ENV OPT_SYNC=ComfyUI:serverless

# Copy early so we can use scripts in the build - Changes to these files will invalidate the cache and cause a rebuild.
COPY --chown=0:1111 ./COPY_ROOT/ /

# Use build scripts to ensure we can build all targets from one Dockerfile in a single layer.
# Don't put anything heavy in here - We can use multi-stage building above if necessary.

ARG IMAGE_BASE
RUN set -eo pipefail && /opt/ai-dock/bin/build/layer0/init.sh | tee /var/log/build.log

# Must be set after layer0
ENV MAMBA_DEFAULT_ENV=comfyui
ENV MAMBA_DEFAULT_RUN="micromamba run -n $MAMBA_DEFAULT_ENV"

# Copy overrides and models into later layers for fast rebuilds
COPY --chown=0:1111 ./COPY_ROOT_EXTRA/ /
RUN set -eo pipefail && /opt/ai-dock/bin/build/layer1/init.sh | tee -a /var/log/build.log

# Keep init.sh as-is and place additional logic in /opt/ai-dock/bin/preflight.sh
CMD ["init.sh"]

Other than that, I just cloned the GitHub repo yesterday. The error I am getting:

supervisor_1  | got prompt
supervisor_1  | model_type EPS
supervisor_1  | Using split attention in VAE
supervisor_1  | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
supervisor_1  | Using split attention in VAE
supervisor_1  | 2024-03-10 17:02:44,181 INFO exited: comfyui (exit status 137; not expected)
supervisor_1  | 2024-03-10 17:02:44,632 INFO spawned: 'comfyui' with pid 768
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/caddy.log <==
supervisor_1  | {"level":"error","ts":1710090163.5710387,"logger":"http.log.error","msg":"dial tcp: lookup localhost: i/o timeout","request":{"remote_ip":"192.168.2.101","remote_port":"53571","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"POST","host":"192.168.1.101:1111","uri":"/ajax/logs","headers":{"Dnt":["1"],"Accept-Encoding":["gzip, deflate"],"Hx-Target":["page"],"Referer":["http://192.168.1.101:1111/"],"Hx-Current-Url":["http://192.168.1.101:1111/"],"Content-Length":["0"],"Content-Type":["application/x-www-form-urlencoded"],"Cookie":[],"Sec-Gpc":["1"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Hx-Request":["true"],"Origin":["http://192.168.1.101:1111"],"Connection":["keep-alive"],"Accept":["*/*"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"]}},"duration":81.496884408,"status":502,"err_id":"y5ew3hics","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  | {"level":"error","ts":1710090163.9898791,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53602","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/queue","headers":{"Dnt":["1"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Referer":["http://192.168.1.101:8188/"],"Connection":["keep-alive"],"Cookie":[],"Sec-Gpc":["1"],"Accept":["*/*"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Accept-Encoding":["gzip, deflate"],"Comfy-User":["undefined"]}},"duration":0.024025992,"status":502,"err_id":"47kp5rvq8","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  | {"level":"error","ts":1710090163.991298,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53603","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/history?max_items=200","headers":{"Accept-Encoding":["gzip, deflate"],"Comfy-User":["undefined"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Referer":["http://192.168.1.101:8188/"],"Connection":["keep-alive"],"Cookie":[],"Dnt":["1"],"Sec-Gpc":["1"],"Accept":["*/*"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"]}},"duration":0.000925198,"status":502,"err_id":"pdq6nmh3q","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  | {"level":"error","ts":1710090164.604226,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53604","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/ws?clientId=5015f3e3d12b455f90d1aa312ca028ac","headers":{"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Sec-Websocket-Key":["dwWr+fR6Q71khkaZvhXXSg=="],"Cookie":[],"Cache-Control":["no-cache"],"Upgrade":["websocket"],"Accept":["*/*"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Pragma":["no-cache"],"Sec-Websocket-Version":["13"],"Origin":["http://192.168.1.101:8188"],"Sec-Websocket-Extensions":["permessage-deflate"],"Connection":["keep-alive, Upgrade"],"Accept-Encoding":["gzip, deflate"]}},"duration":0.00243078,"status":502,"err_id":"4ezxns4f4","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/comfyui.log <==
supervisor_1  | Starting ComfyUI...
supervisor_1  | Starting ComfyUI...
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/supervisor.log <==
supervisor_1  | 2024-03-10 17:02:44,181 INFO exited: comfyui (exit status 137; not expected)
supervisor_1  | 2024-03-10 17:02:44,632 INFO spawned: 'comfyui' with pid 768
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/caddy.log <==
supervisor_1  | {"level":"error","ts":1710090165.2220185,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53607","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/ws?clientId=b083dc43d61044c6ab65b32f1cdde643","headers":{"Pragma":["no-cache"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Accept":["*/*"],"Origin":["http://192.168.1.101:8188"],"Cookie":[],"Upgrade":["websocket"],"Sec-Websocket-Version":["13"],"Sec-Websocket-Extensions":["permessage-deflate"],"Accept-Encoding":["gzip, deflate"],"Cache-Control":["no-cache"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Sec-Websocket-Key":["kNwEmADTHsl/vJ/0d/P9Fg=="],"Connection":["keep-alive, Upgrade"]}},"duration":0.001117775,"status":502,"err_id":"sm5g3mwcy","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/comfyui.log <==
supervisor_1  | ** ComfyUI startup time: 2024-03-10 17:02:45.025688
supervisor_1  | ** Platform: Linux
supervisor_1  | ** Python version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
supervisor_1  | ** Python executable: /opt/micromamba/envs/comfyui/bin/python
supervisor_1  | ** Log path: /workspace/ComfyUI/comfyui.log
supervisor_1  |
supervisor_1  | Prestartup times for custom nodes:
supervisor_1  |    0.1 seconds: /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
supervisor_1  |
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/caddy.log <==
supervisor_1  | {"level":"error","ts":1710090166.064027,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53608","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/ws?clientId=5015f3e3d12b455f90d1aa312ca028ac","headers":{"Sec-Websocket-Version":["13"],"Sec-Websocket-Extensions":["permessage-deflate"],"Sec-Websocket-Key":["R2uKGn4klgocfuTu+fSF+A=="],"Connection":["keep-alive, Upgrade"],"Accept":["*/*"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Upgrade":["websocket"],"Cookie":[],"Pragma":["no-cache"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Accept-Encoding":["gzip, deflate"],"Cache-Control":["no-cache"],"Origin":["http://192.168.1.101:8188"]}},"duration":0.000765085,"status":502,"err_id":"fkc66rk5b","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  | {"level":"error","ts":1710090167.3226793,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53609","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/ws?clientId=b083dc43d61044c6ab65b32f1cdde643","headers":{"Sec-Websocket-Key":["/7oTZHXZkskiKv9SMGrrew=="],"Accept":["*/*"],"Accept-Encoding":["gzip, deflate"],"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Sec-Websocket-Version":["13"],"Pragma":["no-cache"],"Sec-Websocket-Extensions":["permessage-deflate"],"Connection":["keep-alive, Upgrade"],"Cache-Control":["no-cache"],"Upgrade":["websocket"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Origin":["http://192.168.1.101:8188"],"Cookie":[]}},"duration":0.001707567,"status":502,"err_id":"i6zh30sn6","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/comfyui.log <==
supervisor_1  | Total VRAM 7937 MB, total RAM 7937 MB
supervisor_1  | Set vram state to: DISABLED
supervisor_1  | Device: cpu
supervisor_1  | VAE dtype: torch.float32
supervisor_1  | Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
supervisor_1  | 2024-03-10 17:02:49,846 INFO success: comfyui entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/caddy.log <==
supervisor_1  | {"level":"error","ts":1710090169.2041223,"logger":"http.log.error","msg":"dial tcp 127.0.0.1:18188: connect: connection refused","request":{"remote_ip":"192.168.2.101","remote_port":"53610","client_ip":"192.168.2.101","proto":"HTTP/1.1","method":"GET","host":"192.168.1.101:8188","uri":"/ws?clientId=5015f3e3d12b455f90d1aa312ca028ac","headers":{"Accept-Language":["en-US,en;q=0.7,de-AT;q=0.3"],"Connection":["keep-alive, Upgrade"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"],"Accept-Encoding":["gzip, deflate"],"Sec-Websocket-Extensions":["permessage-deflate"],"Accept":["*/*"],"Pragma":["no-cache"],"Cache-Control":["no-cache"],"Upgrade":["websocket"],"Sec-Websocket-Version":["13"],"Origin":["http://192.168.1.101:8188"],"Sec-Websocket-Key":["k72s8QRipsT6yoKJQwv9MA=="],"Cookie":[]}},"duration":0.001642951,"status":502,"err_id":"tp6ne2h6p","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/comfyui.log <==
supervisor_1  | ### Loading: ComfyUI-Manager (V2.9)
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/supervisor.log <==
supervisor_1  | 2024-03-10 17:02:49,846 INFO success: comfyui entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
supervisor_1  |
supervisor_1  | ==> /var/log/supervisor/comfyui.log <==
supervisor_1  | ### ComfyUI Revision: 2057 [65397ce6] | Released on '2024-03-10'
supervisor_1  |
supervisor_1  | Import times for custom nodes:
supervisor_1  |    0.1 seconds: /workspace/ComfyUI/custom_nodes/ComfyUI-Manager
supervisor_1  |
supervisor_1  | Starting server
supervisor_1  |
supervisor_1  | To see the GUI go to: http://127.0.0.1:18188
supervisor_1  | [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
supervisor_1  | [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
supervisor_1  | [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
supervisor_1  | [ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json

I probably messed something up with the CPU-only config, but I don't know where. Could you please help me, and maybe provide better documentation for CPU-only mode?

robballantyne commented 3 months ago

Out of memory, please follow instructions in logfile
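
For readers wondering how "out of memory" follows from the log: the supervisor line reports `exit status 137`, and by the usual shell convention a status above 128 means the process was ended by signal number (status - 128). The sketch below decodes that value. The helper name `describe_exit_status` is hypothetical and only for illustration; it is not part of ai-dock or ComfyUI.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Translate a shell-style exit status into a readable description.

    Assumes the common convention that a status above 128 encodes
    "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with code {status}"


if __name__ == "__main__":
    # 137 is the status shown in the supervisor log above
    print(describe_exit_status(137))
```

On Linux, 137 decodes to signal 9 (SIGKILL). A SIGKILL that arrives while the model is loading on a machine reporting `total RAM 7937 MB` is consistent with the kernel's out-of-memory killer, although the log alone does not prove it. The ComfyUI log itself suggests `--use-split-cross-attention` for memory issues; passing it through the `COMFYUI_FLAGS` variable in the compose file is an assumption based on that variable's name, not something confirmed in this thread.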