comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Stable Cascade produces blank results #2907

Open TurningTide opened 6 months ago

TurningTide commented 6 months ago

Right from the example:

image

No errors in the console, just a single warning: clip missing: ['clip_g.logit_scale'].

Latest ComfyUI version. Windows 11, Torch 2.1.2, CUDA 12.1, RTX 4090 GPU.

Guillaume-Fgt commented 6 months ago

I don't see any obvious problem in your node settings at first glance. It should work. Your stage_c result is not normal.

One thing that comes to mind: which Python version are you running? I ask because I had problems using Python 3.12 with PyTorch some time ago:

Currently, PyTorch on Windows only supports Python 3.8-3.11; Python 2.x is not supported.
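
Not from this thread, just a generic sketch: running the following with the same python.exe that launches ComfyUI confirms which interpreter and torch build are actually in use.

```python
# Generic environment check -- run it with the same python.exe that starts ComfyUI.
import sys
import torch

print("python :", sys.version)
print("torch  :", torch.__version__)
print("cuda   :", torch.version.cuda, "| available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device :", torch.cuda.get_device_name(0))
```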

Please copy-paste your entire ComfyUI terminal output from a fresh start, so we can see all the messages.

comfyanonymous commented 6 months ago

Try downloading a fresh standalone package and try it with that.

TurningTide commented 6 months ago

> One thing that comes to mind: which Python version are you running?

3.11.8. SD1.5 and SDXL work fine.

> Please copy-paste your entire ComfyUI terminal output from a fresh start, so we can see all the messages.


E:\_AI\ComfyUI>python.exe -s main.py
Total VRAM 24563 MB, total RAM 49084 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
VAE dtype: torch.bfloat16
Using pytorch cross attention
Starting server

To see the GUI go to: http://127.0.0.1:8188 got prompt model_type STABLE_CASCADE adm 0 Missing VAE keys ['encoder.mean', 'encoder.std'] clip missing: ['clip_g.logit_scale'] left over keys: dict_keys(['clip_l_vision.vision_model.embeddings.class_embedding', 'clip_l_vision.vision_model.embeddings.patch_embedding.weight', 'clip_l_vision.vision_model.embeddings.position_embedding.weight', 'clip_l_vision.vision_model.embeddings.position_ids', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.q_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm1.weight', 
'clip_l_vision.vision_model.encoder.layers.14.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc1.weight', 
'clip_l_vision.vision_model.encoder.layers.17.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.k_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.q_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm1.weight', 
'clip_l_vision.vision_model.encoder.layers.5.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc2.bias', 
'clip_l_vision.vision_model.encoder.layers.8.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'clip_l_vision.vision_model.post_layernorm.bias', 'clip_l_vision.vision_model.post_layernorm.weight', 'clip_l_vision.vision_model.pre_layrnorm.bias', 'clip_l_vision.vision_model.pre_layrnorm.weight', 'clip_l_vision.visual_projection.weight']) Requested to load StableCascadeClipModel Loading 1 new model Requested to load StableCascade_C Loading 1 new model 100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:02<00:00, 7.52it/s] Requested to load StageC_coder Loading 1 new model model_type STABLE_CASCADE adm 0 clip missing: ['clip_g.logit_scale'] Requested to load StableCascade_B Loading 1 new model 100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:02<00:00, 3.76it/s] Requested to load StageA Loading 1 new model Prompt executed in 29.48 seconds


> Try downloading a fresh standalone package and try it with that.

Same results. Here's the console output:

E:\_AI\ComfyUI_standalone>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
Total VRAM 24563 MB, total RAM 49084 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
VAE dtype: torch.bfloat16
Using pytorch cross attention
** User settings have been changed to be stored on the server instead of browser storage. **
** For multi-user setups add the --multi-user CLI argument to enable multiple user profiles. **
Starting server

To see the GUI go to: http://127.0.0.1:8188 got prompt model_type STABLE_CASCADE adm 0 Missing VAE keys ['encoder.mean', 'encoder.std'] clip missing: ['clip_g.logit_scale'] left over keys: dict_keys(['clip_l_vision.vision_model.embeddings.class_embedding', 'clip_l_vision.vision_model.embeddings.patch_embedding.weight', 'clip_l_vision.vision_model.embeddings.position_embedding.weight', 'clip_l_vision.vision_model.embeddings.position_ids', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.0.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.0.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.1.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.1.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.10.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.10.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.q_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.11.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.11.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.12.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.12.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.13.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.13.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm1.weight', 
'clip_l_vision.vision_model.encoder.layers.14.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.14.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.14.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.15.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.15.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.16.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.16.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.17.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc1.weight', 
'clip_l_vision.vision_model.encoder.layers.17.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.17.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.18.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.18.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.19.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.19.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.2.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.2.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.k_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.20.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.20.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.21.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.21.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.22.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.22.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.q_proj.weight', 
'clip_l_vision.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.23.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.23.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.3.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.3.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.4.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.4.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm1.weight', 
'clip_l_vision.vision_model.encoder.layers.5.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.5.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.5.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.6.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.6.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.7.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.7.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.8.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.8.mlp.fc2.bias', 
'clip_l_vision.vision_model.encoder.layers.8.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm1.bias', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm1.weight', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm2.bias', 'clip_l_vision.vision_model.encoder.layers.9.layer_norm2.weight', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc1.bias', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc1.weight', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc2.bias', 'clip_l_vision.vision_model.encoder.layers.9.mlp.fc2.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'clip_l_vision.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'clip_l_vision.vision_model.post_layernorm.bias', 'clip_l_vision.vision_model.post_layernorm.weight', 'clip_l_vision.vision_model.pre_layrnorm.bias', 'clip_l_vision.vision_model.pre_layrnorm.weight', 'clip_l_vision.visual_projection.weight']) Requested to load StableCascadeClipModel Loading 1 new model E:_AI\ComfyUI_standalone\ComfyUI\comfy\ldm\modules\attention.py:344: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.) out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False) Requested to load StableCascade_C Loading 1 new model 100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:02<00:00, 8.23it/s] Requested to load StageC_coder Loading 1 new model model_type STABLE_CASCADE adm 0 clip missing: ['clip_g.logit_scale'] Requested to load StableCascade_B Loading 1 new model 100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:02<00:00, 4.35it/s] Requested to load StageA Loading 1 new model Prompt executed in 28.90 seconds



![image](https://github.com/comfyanonymous/ComfyUI/assets/9952059/91dfa6ca-3980-4201-930d-602cf9c86482)

By the way, `Stability-AI/StableCascade` works pretty well, though slowly.
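
For comparison, a minimal sketch of running Cascade through the diffusers pipelines rather than the original StableCascade repo; the model IDs, the `variant="bf16"` flag, and the step counts come from the public Stability AI / diffusers releases, not from this thread, so treat them as assumptions:

```python
# Minimal Stable Cascade sketch via diffusers (assumes a diffusers build with Cascade support).
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

prompt = "a photograph of a lighthouse at sunset"

prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16
).to("cuda")
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16
).to("cuda")

# Stage C (prior) produces image embeddings; stages B/A (decoder) turn them into pixels.
prior_output = prior(prompt=prompt, guidance_scale=4.0, num_inference_steps=20)
image = decoder(
    image_embeddings=prior_output.image_embeddings.to(torch.float16),
    prompt=prompt,
    guidance_scale=0.0,
    num_inference_steps=10,
).images[0]
image.save("cascade.png")
```
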
comfyanonymous commented 6 months ago

Does anyone else have this issue? If not it might be a hardware/driver issue.

TurningTide commented 6 months ago

> Does anyone else have this issue? If not it might be a hardware/driver issue.

Interestingly, yesterday I installed a fresh copy of Windows. Previously I had ComfyUI installed with the experimental Cascade nodes and everything was working fine; after reinstalling the OS and ComfyUI, it's all broken. The only hardware change is a new SATA SSD. Nvidia driver version: 551.61 Game Ready.

Any other SD model works without problems:

image

ashllay commented 6 months ago

Did you try the lite models?

TurningTide commented 6 months ago

> Did you try the lite models?

Yep, every Cascade model from the official repository that is outside the comfyui_checkpoints directory throws an error, like this:

got prompt
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "E:\_AI\ComfyUI\execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\nodes.py", line 540, in load_checkpoint
    out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\comfy\sd.py", line 506, in load_checkpoint_guess_config
    model_config = model_detection.model_config_from_unet(sd, "model.diffusion_model.")
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\comfy\model_detection.py", line 191, in model_config_from_unet
    unet_config = detect_unet_config(state_dict, unet_key_prefix)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\_AI\ComfyUI\comfy\model_detection.py", line 77, in detect_unet_config
    model_channels = state_dict['{}input_blocks.0.0.weight'.format(key_prefix)].shape[0]
                     ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'model.diffusion_model.input_blocks.0.0.weight'
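
For anyone hitting the same KeyError: the traceback shows the checkpoint loader looking for weights under the model.diffusion_model. prefix and not finding any, which is what happens when a bare stage/unet file is fed to the regular checkpoint node instead of the repackaged comfyui_checkpoints version. A minimal sketch (the filename is hypothetical) to inspect what a .safetensors file actually contains before loading it:

```python
# Inspect the top-level key prefixes of a .safetensors file. An all-in-one
# checkpoint exposes several prefixes (diffusion model, VAE, text encoder);
# a bare stage file only exposes its own module names, which is why
# load_checkpoint_guess_config cannot find model.diffusion_model.* keys.
from collections import Counter
from safetensors import safe_open

path = "stable_cascade_stage_b.safetensors"  # hypothetical path

with safe_open(path, framework="pt", device="cpu") as f:
    prefixes = Counter(key.split(".")[0] for key in f.keys())

for prefix, count in prefixes.most_common():
    print(f"{prefix}: {count} tensors")
```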

TurningTide commented 6 months ago

Webui Forge + https://github.com/benjamin-bertram/sdweb-easy-stablecascade-diffusers works:

image

So Stability-AI/StableCascade, lllyasviel/stable-diffusion-webui-forge, and ComfyUI were all launched without a venv, using the global Python instance (3.11.8, Torch 2.1.2, CUDA 12.1). However, only ComfyUI refuses to yield the expected results.

TurningTide commented 6 months ago

I managed to run Cascade by loading bf16 models through separate UNET, CLIP, and VAE nodes.

image

This way there are no warnings in the console:

got prompt
model_type STABLE_CASCADE
adm 0
Requested to load StableCascadeClipModel
Loading 1 new model
model_type STABLE_CASCADE
adm 0
Requested to load StableCascade_C
Loading 1 new model
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:02<00:00,  9.66it/s]
Requested to load StableCascade_B
Loading 1 new model
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:02<00:00,  4.48it/s]
Requested to load StageA
Loading 1 new model
Prompt executed in 17.33 seconds

Is it possible that something is wrong with the special comfyui checkpoint models?
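
Unrelated to the root cause, but for anyone who wants to re-run this workaround from a script instead of the browser: the workflow can be exported from the UI in API format and queued against the local server. A minimal sketch, assuming the export was saved as workflow_api.json (hypothetical name) and the server runs on the default 127.0.0.1:8188:

```python
# Queue an API-format workflow against a local ComfyUI server.
import json
import urllib.request

with open("workflow_api.json", "r", encoding="utf-8") as f:  # exported via the UI's API-format save
    workflow = json.load(f)

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # on success the response includes a prompt_id
```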

frankchieng commented 6 months ago

> Right from the example:
>
> image
>
> No errors in the console, just a single warning: clip missing: ['clip_g.logit_scale'].
>
> Latest ComfyUI version. Windows 11, Torch 2.1.2, CUDA 12.1, RTX 4090 GPU.

Delete your negative prompt and give it a shot again.

TurningTide commented 6 months ago

> Delete your negative prompt and give it a shot again.

Nope. As you can see in my previous comment, there is a negative prompt and the result is still good.

Arctomachine commented 6 months ago

I also ran into the model.diffusion_model.input_blocks.0.0.weight error on revision b7b55931 with the lite bf16 models. It happens when loading model B. Python 3.11.8 (from the bat console output), Nvidia GTX 960/2, driver 536.23. Any other info needed?

JPGranizo commented 5 months ago

Same error here!

ttulttul commented 5 months ago

Same error here.

huanmengmie commented 5 months ago

Same error here

winjvlee commented 5 months ago

Same error here.

h3clikejava commented 4 months ago

Same error here! Error occurred when executing unCLIPCheckpointLoader: model.diffusion_model.input_blocks.0.0.weight

I am using the Cascade image prompt workflow: https://github.com/ZHO-ZHO-ZHO/ComfyUI-Workflows-ZHO/blob/main/Stable%20Cascade%20ImagePrompt%20Standard%E3%80%90Zho%E3%80%91.json

yangquanbiubiu commented 4 weeks ago

Did you fix it? I have the same error with both Cascade and FLUX.