AbdBarho / stable-diffusion-webui-docker

Easy Docker setup for Stable Diffusion with user-friendly UI

Comfy installed but Error occurred when executing CLIPTextEncode #726

Open fahadshery opened 3 weeks ago

fahadshery commented 3 weeks ago

I have installed comfy using:

The UI loads with no issues.

But when I run a model, it gives this error:

Error occurred when executing CLIPTextEncode:

CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

  File "/stable-diffusion/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/stable-diffusion/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/stable-diffusion/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/stable-diffusion/nodes.py", line 58, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
  File "/stable-diffusion/comfy/sd.py", line 135, in encode_from_tokens
    self.load_model()
  File "/stable-diffusion/comfy/sd.py", line 155, in load_model
    model_management.load_model_gpu(self.patcher)
  File "/stable-diffusion/comfy/model_management.py", line 467, in load_model_gpu
    return load_models_gpu([model])
  File "/stable-diffusion/comfy/model_management.py", line 461, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
  File "/stable-diffusion/comfy/model_management.py", line 305, in model_load
    raise e
  File "/stable-diffusion/comfy/model_management.py", line 301, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
  File "/stable-diffusion/comfy/model_patcher.py", line 271, in patch_model
    self.model.to(device_to)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 804, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
fahadshery commented 3 weeks ago

full logs:

Mounted .cache
mkdir: created directory '/output/comfy'
Mounted comfy
mkdir: created directory '/data/config/comfy/input'
Mounted input
Total VRAM 24576 MB, total RAM 48135 MB
pytorch version: 2.3.0
Set vram state to: NORMAL_VRAM
Device: cuda:0 GRID P40-24Q : cudaMallocAsync
VAE dtype: torch.float32
Using pytorch cross attention
****** User settings have been changed to be stored on the server instead of browser storage. ******
****** For multi-user setups add the --multi-user CLI argument to enable multiple user profiles. ******
Adding extra search path checkpoints /data/models/Stable-diffusion
Adding extra search path configs /data/models/Stable-diffusion
Adding extra search path vae /data/models/VAE
Adding extra search path loras /data/models/Lora
Adding extra search path upscale_models /data/models/RealESRGAN
Adding extra search path hypernetworks /data/models/hypernetworks
Adding extra search path controlnet /data/models/ControlNet
Adding extra search path gligen /data/models/GLIGEN
Adding extra search path clip /data/models/CLIPEncoder
Adding extra search path embeddings /data/embeddings
Adding extra search path custom_nodes /data/config/comfy/custom_nodes

Import times for custom nodes:
   0.0 seconds: /stable-diffusion/custom_nodes/websocket_image_save.py

Starting server

To see the GUI go to: http://0.0.0.0:7860
got prompt
model_type EPS
Using pytorch attention in VAE
Using pytorch attention in VAE
/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Requested to load SD1ClipModel
Loading 1 new model
!!! Exception during processing!!! CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Traceback (most recent call last):
  File "/stable-diffusion/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/stable-diffusion/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/stable-diffusion/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/stable-diffusion/nodes.py", line 58, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
  File "/stable-diffusion/comfy/sd.py", line 135, in encode_from_tokens
    self.load_model()
  File "/stable-diffusion/comfy/sd.py", line 155, in load_model
    model_management.load_model_gpu(self.patcher)
  File "/stable-diffusion/comfy/model_management.py", line 467, in load_model_gpu
    return load_models_gpu([model])
  File "/stable-diffusion/comfy/model_management.py", line 461, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
  File "/stable-diffusion/comfy/model_management.py", line 305, in model_load
    raise e
  File "/stable-diffusion/comfy/model_management.py", line 301, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
  File "/stable-diffusion/comfy/model_patcher.py", line 271, in patch_model
    self.model.to(device_to)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 804, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
RuntimeError: CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Prompt executed in 20.59 seconds
got prompt
Requested to load SD1ClipModel
Loading 1 new model
!!! Exception during processing!!! CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Traceback (most recent call last):
  File "/stable-diffusion/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/stable-diffusion/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/stable-diffusion/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/stable-diffusion/nodes.py", line 58, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
  File "/stable-diffusion/comfy/sd.py", line 135, in encode_from_tokens
    self.load_model()
  File "/stable-diffusion/comfy/sd.py", line 155, in load_model
    model_management.load_model_gpu(self.patcher)
  File "/stable-diffusion/comfy/model_management.py", line 467, in load_model_gpu
    return load_models_gpu([model])
  File "/stable-diffusion/comfy/model_management.py", line 461, in load_models_gpu
    cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
  File "/stable-diffusion/comfy/model_management.py", line 305, in model_load
    raise e
  File "/stable-diffusion/comfy/model_management.py", line 301, in model_load
    self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
  File "/stable-diffusion/comfy/model_patcher.py", line 271, in patch_model
    self.model.to(device_to)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 804, in _apply
    param_applied = fn(param)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
RuntimeError: CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Prompt executed in 0.08 seconds
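The failing step in both tracebacks is the first tensor move to `cuda:0`, and the startup log shows the allocator as `cudaMallocAsync` on a GRID P40-24Q vGPU. A minimal sketch to isolate this, run inside the container (assumption: the vGPU profile rejects the stream-ordered allocator, so forcing PyTorch's native caching allocator via `PYTORCH_CUDA_ALLOC_CONF` may change the outcome; the helper name `cuda_smoke_test` is hypothetical):

```python
import os

# Ask PyTorch for its native caching allocator instead of cudaMallocAsync.
# Hypothesis (not confirmed in this thread): GRID vGPU profiles reject
# cudaMallocAsync, which would explain "operation not supported" on .to("cuda").
# Must be set before torch is imported.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "backend:native")

def cuda_smoke_test() -> str:
    """Repeat the failing step from the traceback: move a tensor to cuda:0."""
    try:
        import torch  # imported after the env var so the allocator choice applies
    except ImportError:
        return "torch-not-installed"
    if not torch.cuda.is_available():
        return "no-cuda-device"
    try:
        torch.randn(4).to("cuda:0")
        torch.cuda.synchronize()
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}"

if __name__ == "__main__":
    print(cuda_smoke_test())
```

If this prints `ok` while the default allocator fails, the allocator backend is the culprit rather than the model or ComfyUI itself.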
fahadshery commented 3 weeks ago

my docker compose file:


  stable-diffusion-base-download:
    build: ./stable-diffusion-webui-docker/services/download/
    image: stable-diffusion-base
    container_name: stable-diffusion-base
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./stable-diffusion-webui-docker/data:/data

  automatic1111-stable-diffusion-webui:
    build: ./stable-diffusion-webui-docker/services/AUTOMATIC1111/
    image: automatic1111
    container_name: automatic1111
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      - CLI_ARGS=--allow-code --medvram --xformers --enable-insecure-extension-access --api
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./stable-diffusion-webui-docker/data:/data
      - ./stable-diffusion-webui-docker/output:/output
    stop_signal: SIGKILL
    tty: true
    deploy:
      resources:
        reservations:
          devices:
              - driver: nvidia
                device_ids: ['0']
                capabilities: [compute, utility]
    restart: unless-stopped
    networks:
      - traefik
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.stable-diffusion.rule=Host(`ai-images.local.example.com`)"
      - "traefik.http.routers.stable-diffusion.entrypoints=https"
      - "traefik.http.routers.stable-diffusion.tls=true"
      - "traefik.http.routers.stable-diffusion.tls.certresolver=cloudflare"
      - "traefik.http.services.stable-diffusion.loadbalancer.server.port=7860"
      - "traefik.http.routers.stable-diffusion.middlewares=default-headers@file"

  comfy-webui:
    build: ./stable-diffusion-webui-docker/services/comfy/
    image: comfy-webui
    container_name: comfy-webui
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      - CLI_ARGS=
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./stable-diffusion-webui-docker/data:/data
      - ./stable-diffusion-webui-docker/output:/output
    stop_signal: SIGKILL
    tty: true
    deploy:
      resources:
        reservations:
          devices:
              - driver: nvidia
                device_ids: ['0']
                capabilities: [compute, utility]
    restart: unless-stopped
    networks:
      - traefik
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.comfy.rule=Host(`comfy.local.example.com`)"
      - "traefik.http.routers.comfy.entrypoints=https"
      - "traefik.http.routers.comfy.tls=true"
      - "traefik.http.routers.comfy.tls.certresolver=cloudflare"
      - "traefik.http.services.comfy.loadbalancer.server.port=7860"
      - "traefik.http.routers.comfy.middlewares=default-headers@file"
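Given the `Device: cuda:0 GRID P40-24Q : cudaMallocAsync` line in the startup log, one hedged workaround to try is ComfyUI's `--disable-cuda-malloc` CLI flag, which makes it avoid the cudaMallocAsync allocator that GRID/vGPU profiles may not support (assumption: that allocator is the cause of "operation not supported" here; the rest of the service is unchanged):

```yaml
  comfy-webui:
    # ...rest of the service definition unchanged...
    environment:
      - PUID=${PUID:-1000}
      - PGID=${PGID:-1000}
      # Avoid the cudaMallocAsync allocator (assumption: unsupported on
      # this GRID vGPU profile). Alternatively, setting the env var
      # PYTORCH_CUDA_ALLOC_CONF=backend:native targets the same allocator choice.
      - CLI_ARGS=--disable-cuda-malloc
```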