AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: it's not possible to switch the model to SD 2.1-768 if xformers is enabled (NotImplementedError), but it is still possible to load it on startup #13763

Open light-and-ray opened 11 months ago

light-and-ray commented 11 months ago

Is there an existing issue for this?

What happened?

I start the webui with a non-SD 2.1 model, then try to load an SD 2.1 checkpoint, and I get a NotImplementedError.

In the example below I loaded the webui with the previously used realisticVisionV51_v51VAE.safetensors (SD 1.5), and after that I switched the model to sd_v2-1_768-ema-pruned.safetensors

Unsuccessful switching


################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on user user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments: --listen --enable-insecure-extension-access --disable-safe-unpickle --medvram-sdxl --api --loglevel WARNING --xformers --disable-all-extensions
*** "--disable-all-extensions" arg was used, will not load any extensions ***
Loading weights [15012c538f] from /home/user/ai-apps/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV51_v51VAE.safetensors
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 5.9s (prepare environment: 1.2s, import torch: 2.2s, import gradio: 0.5s, setup paths: 0.8s, other imports: 0.3s, scripts list_optimizers: 0.2s, create ui: 0.4s).
Creating model from config: /home/user/ai-apps/stable-diffusion-webui/configs/v1-inference.yaml
Applying attention optimization: xformers... done.
Model loaded in 1.5s (load weights from disk: 0.7s, create model: 0.3s, apply weights to model: 0.4s).
Loading model sd_v2-1_768-ema-pruned.safetensors [dcd690123c] (2 out of 5)
Loading weights [dcd690123c] from /home/user/ai-apps/stable-diffusion-webui/models/Stable-diffusion/sd_v2-1_768-ema-pruned.safetensors
changing setting sd_model_checkpoint to sd_v2-1_768-ema-pruned.safetensors [dcd690123c]: NotImplementedError
Traceback (most recent call last):
  File "/home/user/ai-apps/stable-diffusion-webui/modules/options.py", line 140, in set
    option.onchange()
  File "/home/user/ai-apps/stable-diffusion-webui/modules/call_queue.py", line 13, in f
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/initialize_util.py", line 170, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models.py", line 732, in reload_model_weights
    sd_model = reuse_model_from_already_loaded(sd_model, checkpoint_info, timer)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models.py", line 701, in reuse_model_from_already_loaded
    load_model(checkpoint_info)
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models.py", line 586, in load_model
    checkpoint_config = sd_models_config.find_checkpoint_config(state_dict, checkpoint_info)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models_config.py", line 112, in find_checkpoint_config
    return guess_model_config_from_state_dict(state_dict, info.filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models_config.py", line 87, in guess_model_config_from_state_dict
    elif is_using_v_parameterization_for_sd2(sd):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_models_config.py", line 63, in is_using_v_parameterization_for_sd2
    out = (unet(x_test, torch.asarray([999], device=device), context=test_cond) - x_test).mean().item()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_unet.py", line 91, in UNetModel_forward
    return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
        ^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
    x = block(x, context=context[i])
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 496, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
    return _memory_efficient_attention(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py", line 306, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
         ^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py", line 94, in _dispatch_fw
    return _run_priority_list(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/user/ai-apps/stable-diffusion-webui/venv/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py", line 69, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 64, 5, 64) (torch.float32)
     key         : shape=(1, 64, 5, 64) (torch.float32)
     value       : shape=(1, 64, 5, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 64

It is possible to load the model without --xformers (but in that case SD 2.1 requires the --no-half argument, and that is a disaster), or with --xformers if 2.1 was the last used model.

Or I can just provide the --ckpt models/Stable-diffusion/sd_v2-1_768-ema-pruned.safetensors argument:

Successful load on startup


################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on user user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments: --listen --enable-insecure-extension-access --disable-safe-unpickle --medvram-sdxl --api --loglevel WARNING --xformers --disable-all-extensions --ckpt models/Stable-diffusion/sd_v2-1_768-ema-pruned.safetensors
*** "--disable-all-extensions" arg was used, will not load any extensions ***
Loading weights [dcd690123c] from /home/user/ai-apps/stable-diffusion-webui/models/Stable-diffusion/sd_v2-1_768-ema-pruned.safetensors
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 5.8s (prepare environment: 1.1s, import torch: 2.2s, import gradio: 0.5s, setup paths: 0.8s, other imports: 0.3s, scripts list_optimizers: 0.2s, create ui: 0.4s).
Creating model from config: /home/user/ai-apps/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/configs/stable-diffusion/v2-inference-v.yaml
Applying attention optimization: xformers... done.
Model loaded in 2.9s (load weights from disk: 0.7s, find config: 0.7s, apply weights to model: 1.3s).

And the model works, and I can switch between models with no problems.
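For context, the traceback above ends inside xformers' operator dispatch: each backend declares which devices and dtypes it supports, and the v-parameterization probe runs the UNet on CPU in float32, so every backend is rejected. A stdlib-only sketch of that dispatch logic (the operator table mirrors the error message above; this is an illustration, not the real xformers internals):

```python
# Simplified sketch of why xformers finds no attention operator on CPU/float32.
# Operator names and constraints are taken from the error message above; this
# is NOT the actual xformers dispatch code, just an illustration of its logic.

OPERATORS = [
    # (name, supported devices, supported dtypes, max embed per head or None)
    ("flshattF",        {"cuda"},        {"float16", "bfloat16"},            None),
    ("tritonflashattF", {"cuda"},        {"float16", "bfloat16"},            None),
    ("cutlassF",        {"cuda"},        {"float16", "bfloat16", "float32"}, None),
    ("smallkF",         {"cuda", "cpu"}, {"float32"},                        32),
]

def dispatch(device: str, dtype: str, embed_per_head: int) -> str:
    """Return the first operator whose constraints match the inputs,
    analogous to _run_priority_list in xformers/ops/fmha/dispatch.py."""
    reasons = []
    for name, devices, dtypes, max_embed in OPERATORS:
        if device not in devices:
            reasons.append(f"`{name}`: device={device} (supported: {devices})")
        elif dtype not in dtypes:
            reasons.append(f"`{name}`: dtype={dtype} (supported: {dtypes})")
        elif max_embed is not None and embed_per_head > max_embed:
            reasons.append(f"`{name}`: unsupported embed per head: {embed_per_head}")
        else:
            return name
    raise NotImplementedError(
        "No operator found for `memory_efficient_attention_forward`:\n"
        + "\n".join(reasons)
    )

# The config-guess probe runs on CPU in float32 with head dim 64, so every
# operator is rejected:
# dispatch("cpu", "float32", 64)   -> raises NotImplementedError
# On GPU in half precision the same call succeeds:
# dispatch("cuda", "float16", 64)  -> "flshattF"
```

This also explains why startup with --ckpt works: with the config already resolved from the filename, the probe that forces a CPU/float32 forward pass is never reached.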

Steps to reproduce the problem

  1. start webui with sd 1.5 checkpoint and --xformers
  2. select sd 2.1 checkpoint in drop-down menu

What should have happened?

The model should load without an error.

Sysinfo

sysinfo-2023-10-25-19-14.txt

For some reason sysinfo ignored --disable-all-extensions (`"disable_all_extensions": "none",`)

What browsers do you use to access the UI ?

Mozilla Firefox

Console logs

-

Additional information

No response

light-and-ray commented 11 months ago

P.S. I ran --reinstall-xformers before writing this issue

light-and-ray commented 3 months ago

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 64, 5, 64) (torch.float32)
     key         : shape=(1, 64, 5, 64) (torch.float32)
     value       : shape=(1, 64, 5, 64) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`decoderF` is not supported because:
    device=cpu (supported: {'cuda'})
    attn_bias type is <class 'NoneType'>

Also sd_hijack_optimizations.py:497:

out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))

light-and-ray commented 3 months ago

But there are other `is not supported because` reasons:

`decoderF` is not supported because:
    device=cpu (supported: {'cuda'})
    attn_bias type is <class 'NoneType'>
`flshattF@v2.3.6` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    Only work on pre-MLIR triton for now
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    device=cpu (supported: {'cuda'})
    unsupported embed per head: 64
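One plausible mitigation (an assumption on my side, not a confirmed webui fix) would be to catch NotImplementedError in the hijacked attention forward and retry with the default attention path, so the CPU/float32 probe in is_using_v_parameterization_for_sd2 can complete. The pattern can be sketched generically with stdlib Python; all names here are illustrative stand-ins, not actual webui code:

```python
# Generic fallback pattern: try the optimized path, and when it reports
# NotImplementedError (as xformers does for unsupported device/dtype
# combinations), transparently retry with a baseline implementation.

from functools import wraps

def with_fallback(baseline):
    """Decorator: if the optimized function raises NotImplementedError,
    call the baseline implementation with the same arguments instead."""
    def decorate(optimized):
        @wraps(optimized)
        def wrapper(*args, **kwargs):
            try:
                return optimized(*args, **kwargs)
            except NotImplementedError:
                return baseline(*args, **kwargs)
        return wrapper
    return decorate

def plain_attention(q, k, v):
    # Stand-in for the default (non-xformers) attention computation.
    return "plain"

@with_fallback(plain_attention)
def xformers_attention(q, k, v):
    # Stand-in for xformers_attention_forward: simulate the CPU/float32 case,
    # where xformers has no supported operator.
    raise NotImplementedError("No operator found")

# With this guard, the v-parameterization probe would succeed on CPU:
# xformers_attention(None, None, None)  -> "plain"
```

The same effect could probably also be achieved by temporarily disabling the attention optimization around the probe; either way, the probe result (the guessed v2 config) does not depend on which attention implementation computes it.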