AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: loading SDXL1.0 vae error #12548

Closed BXuan694 closed 1 year ago

BXuan694 commented 1 year ago

Is there an existing issue for this?

What happened?

Similar to #11803, but the answers there did not help.

Loading the SDXL 1.0 VAE weights fails with an error reporting that many keys are missing from the safetensors file.

However, I inspected the model structure and found that the layers reported as missing, such as 'conditioner.embedders.0.wrapped.transformer.text_model.embeddings.token_embedding.wrapped.weight', do exist in the checkpoint.

The base checkpoint itself works fine in txt2img.

Steps to reproduce the problem

  1. Download the base and VAE files from the official Hugging Face page to the correct paths.
  2. Select the SD checkpoint 'sd_xl_base_1.0.safetensors [31e35c80fc]'.
  3. Select the SD VAE 'sd_xl_base_1.0_0.9vae.safetensors'; the error is then reported.

What should have happened?

The SDXL 1.0 VAE loads normally.

Version or Commit where the problem happens

v1.5.1

What Python version are you running on ?

None

What platforms do you use to access the UI ?

Linux

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

Cross attention optimization

Automatic

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

bash webui.sh --listen --enable-insecure-extension-access --xformers --enable-console-prompts --no-half-vae

List of extensions

LDSR Lora ScuNET SwinIR canvas-zoom-and-pan extra-options-section mobile prompt-bracket-checker

Console logs

changing setting sd_vae to _sd_xl_base_1.0_0.9vae.safetensors: RuntimeError
Traceback (most recent call last):
  File "/data1/sd/sdxl/stable-diffusion-webui/modules/shared.py", line 633, in set
    self.data_labels[key].onchange()
  File "/data1/sd/sdxl/stable-diffusion-webui/modules/call_queue.py", line 14, in f
    res = func(*args, **kwargs)
  File "/data1/sd/sdxl/stable-diffusion-webui/webui.py", line 239, in <lambda>
    shared.opts.onchange("sd_vae", wrap_queued_call(lambda: modules.sd_vae.reload_vae_weights()), call=False)
  File "/data1/sd/sdxl/stable-diffusion-webui/modules/sd_vae.py", line 217, in reload_vae_weights
    load_vae(sd_model, vae_file, vae_source)
  File "/data1/sd/sdxl/stable-diffusion-webui/modules/sd_vae.py", line 143, in load_vae
    _load_vae_dict(model, vae_dict_1)
  File "/data1/sd/sdxl/stable-diffusion-webui/modules/sd_vae.py", line 180, in _load_vae_dict
    model.load_state_dict(vae_dict_1)
  File "/data1/sd/sdxl/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DiffusionEngine:
    Missing key(s) in state_dict: "denoiser.sigmas", "conditioner.embedders.0.wrapped.transformer.text_model.embeddings.position_ids", "conditioner.embedders.0.wrapped.transformer.text_model.embeddings.token_embedding.wrapped.weight", "conditioner.embedders.0.wrapped.transformer.text_model.embeddings.position_embedding.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.k_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.k_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.v_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.v_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.q_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.q_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.out_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.self_attn.out_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.layer_norm1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.layer_norm1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.mlp.fc1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.mlp.fc1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.mlp.fc2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.mlp.fc2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.layer_norm2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.0.layer_norm2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.k_proj.weight", 
"conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.k_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.v_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.v_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.q_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.q_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.out_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.self_attn.out_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.layer_norm1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.layer_norm1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.mlp.fc1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.mlp.fc1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.mlp.fc2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.mlp.fc2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.layer_norm2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.1.layer_norm2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.k_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.k_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.v_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.v_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.q_proj.weight", 
"conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.q_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.out_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.self_attn.out_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.layer_norm1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.layer_norm1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.mlp.fc1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.mlp.fc1.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.mlp.fc2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.mlp.fc2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.layer_norm2.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.2.layer_norm2.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.k_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.k_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.v_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.v_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.q_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.q_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.out_proj.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.self_attn.out_proj.bias", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.layer_norm1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.layer_norm1.bias", 
"conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.mlp.fc1.weight", "conditioner.embedders.0.wrapped.transformer.text_model.encoder.layers.3.mlp.fc1.bias", 
......
"conditioner.embedders.1.model.transformer.resblocks.9.mlp.c_proj.weight".
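The RuntimeError in the log comes from PyTorch's strict state-dict loading: when the supplied dict lacks keys the target module expects, `load_state_dict` raises instead of loading partially. A minimal sketch of that failure mode, using hypothetical toy modules rather than the webui's actual `DiffusionEngine`:

```python
import torch.nn as nn

# A "model" expecting keys "0.weight", "0.bias", "1.weight", "1.bias".
full_model = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))

# A state dict from a smaller module: only "weight" and "bias".
partial_state = nn.Linear(2, 2).state_dict()

try:
    # strict=True is the default: any mismatched keys raise RuntimeError
    full_model.load_state_dict(partial_state)
except RuntimeError as err:
    # Same "Missing key(s) in state_dict" message format as in the log above
    print("Missing key(s)" in str(err))
```

This is why handing the loader a state dict whose keys do not line up with the target module fails outright rather than loading the overlapping subset.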

Additional information

No response

catboxanon commented 1 year ago

sd_xl_base_1.0_0.9vae.safetensors is also a full checkpoint, not a VAE. When using an SDXL model, the VAE setting should be set to Automatic.

In fact, that file is the checkpoint you should prefer over sd_xl_base_1.0.safetensors. See https://github.com/huggingface/diffusers/issues/4310 for some background.
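One way to catch this mix-up before loading is to look at the tensor key names inside the file: a full SDXL checkpoint contains keys like `conditioner.*` and `denoiser.*` (as seen in the log above), while a VAE-only file has only encoder/decoder keys. A rough sketch, where the prefix lists are heuristics inferred from the key names in this issue, not an official classification:

```python
# Heuristic: decide whether a set of tensor keys looks like a full
# checkpoint or a VAE-only file. Keys of a real .safetensors file could
# be read with, e.g.:
#   from safetensors import safe_open
#   with safe_open(path, framework="pt") as f:
#       keys = list(f.keys())

# Prefixes that only appear in a full SDXL checkpoint (illustrative,
# taken from the missing-key list in the log above).
CHECKPOINT_PREFIXES = ("conditioner.", "denoiser.", "model.diffusion_model.")

def classify_keys(keys):
    """Return 'checkpoint' if any key belongs to the full model, else 'vae'."""
    if any(k.startswith(CHECKPOINT_PREFIXES) for k in keys):
        return "checkpoint"
    return "vae"

print(classify_keys(["conditioner.embedders.0.wrapped.transformer"
                     ".text_model.embeddings.position_ids"]))  # checkpoint
print(classify_keys(["encoder.conv_in.weight",
                     "decoder.conv_out.bias"]))                # vae
```

A file selected in the SD VAE dropdown that classifies as 'checkpoint' under this heuristic is the situation described in this issue.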