AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs: #7973

Open zhan-cn opened 1 year ago

zhan-cn commented 1 year ago

Is there an existing issue for this?

What happened?

When I run `.\webui.bat --xformers` or `.\webui.bat --xformers --no-half --medvram`, I hit this bug: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:

Steps to reproduce the problem

1. Run .\webui.bat --xformers --no-half --medvram
2. Log in at http://127.0.0.1:7860/
3. Choose a JPG, then generate

What should have happened?

Generate the JPEG.

Commit where the problem happens

076d624a297532d6e4abebe5807fd7c7504d7a73 (from the console log)

What platforms do you use to access the UI ?

Windows

What browsers do you use to access the UI ?

Microsoft Edge

Command Line Arguments

no

List of extensions

no

Console logs

(venv) PS C:\stable-diffusion-webui> .\webui.bat --xformers
venv "C:\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 076d624a297532d6e4abebe5807fd7c7504d7a73
Installing requirements for Web UI
Launching Web UI with arguments: --xformers
Loading weights [d635794c1f] from C:\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.ckpt
Creating model from config: C:\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 865.91 M params.
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(0):
Model loaded in 8.6s (load weights from disk: 3.3s, create model: 0.3s, apply weights to model: 1.0s, apply half(): 1.1s, move model to device: 0.9s, load textual inversion embeddings: 1.8s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Error completing request
Arguments: ('task(08ryrh9mj92xxtq)', 0, '', '', [], <PIL.Image.Image image mode=RGBA size=1099x1099 at 0x1F472FD9750>, None, None, None, None, None, None, 20, 0, 4, 0, 1, False, False, 1, 1, 7, 1.5, 0.75, -1.0, -1.0, 0, 0, 0, False, 512, 512, 0, 0, 32, 0, '', '', '', [], 0, '<ul>\n<li><code>CFG Scale</code> should be 2 or lower.</li>\n</ul>\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 1, 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "C:\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "C:\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "C:\stable-diffusion-webui\modules\img2img.py", line 171, in img2img
    processed = process_images(p)
  File "C:\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "C:\stable-diffusion-webui\modules\processing.py", line 577, in process_images_inner
    p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  File "C:\stable-diffusion-webui\modules\processing.py", line 1017, in init
    self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
  File "C:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "C:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 536, in forward
    h = self.mid.attn_1(h)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 258, in forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 196, in memory_efficient_attention
    return _memory_efficient_attention(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 292, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 308, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\dispatch.py", line 98, in _dispatch_fw
    return _run_priority_list(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\dispatch.py", line 73, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float16)
     key         : shape=(1, 4096, 1, 512) (torch.float16)
     value       : shape=(1, 4096, 1, 512) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    triton is not available
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 512

Additional information

I have already rebuilt xformers. I think it may be because I use a GTX 1650 4 GB.
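Every kernel in the list above is rejected primarily because the wheel reports "xFormers wasn't build with CUDA support". Before rebuilding, it can help to confirm whether the environment's torch can see a CUDA device at all; here is a minimal stdlib-only sketch (the function name is mine, not part of the webui):

```python
# Hedged diagnostic sketch: if torch cannot see a CUDA device, every
# CUDA-only xformers kernel will be rejected exactly as in the log above.
import importlib.util


def cuda_visible() -> bool:
    """Best-effort check; False when torch is absent or CPU-only."""
    if importlib.util.find_spec("torch") is None:
        return False  # torch not installed in this environment
    import torch
    return torch.cuda.is_available()


print(cuda_visible())
```

xformers' own self-report, `python -m xformers.info`, prints the same per-operator availability table that appears in the traceback.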

Setmaster commented 1 year ago

Having the same issue, RTX 3090.

Neefay commented 1 year ago

Same here, RTX 3080.

PandaBearz commented 1 year ago

same

SeasonalFerret commented 1 year ago

Same issue, 1660 Ti. I've tried pip install, I've tried manually building, I've tried the xformers Windows installation from the wiki, and I've tried the xformers re-install argument; nothing's working.

SeasonalFerret commented 1 year ago

Problem appears to have been resolved by following the steps in https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/6871#issuecomment-1416400288. Just delete venv folder and run webui with --xformers.

I've gone from getting 2.1s/it to 1.75s/it thanks to Xformers.

CreamyLong commented 1 year ago

Having the same issue: Win10, RTX 3060, CUDA 11.1.

Natotela commented 1 year ago

appears to have been resolved by following the steps in #6871 (comment). Just delete venv folder and run webui with --xformers

I deleted the xformers site-packages from the venv and pip-installed xformers==0.0.16, but when I ran the webui it just installed the venv xformers back (xformers-0.0.16rc425.dist-info).
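One way to see which xformers the venv will actually import (rather than the one you think you installed) is to query the installed-package metadata; a small sketch, with the helper name being my own:

```python
# Sketch: report the xformers version the current interpreter resolves,
# e.g. to confirm whether 0.0.16 or 0.0.16rc425 actually won.
import importlib.util
from importlib import metadata


def installed_xformers_version():
    """Return the installed xformers version string, or None if absent."""
    if importlib.util.find_spec("xformers") is None:
        return None
    try:
        return metadata.version("xformers")
    except metadata.PackageNotFoundError:
        return None


print(installed_xformers_version())
```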

Build cuda_11.8.r11.8/compiler.31833905_0 RTX 3070

zz412000428 commented 1 year ago

Same here, ubuntu18.04, RTX 3080 Ti, cuda12.1

zoezhu commented 1 year ago

I hit this issue too. In my case, I pulled the new code and launched the webui; the new launch.py installed xformers again (version xformers==0.0.16rc425), which was not compiled for my CUDA version. I just uninstalled it and installed xformers from source again, and everything works now. I also commented out the line `run_pip(f"install -r \"{requirements_file}\"", "requirements for Web UI")` in launch.py.

Natotela commented 1 year ago

Another piece of good advice: check that everything runs on one Python version. I had a mismatch between the pip path, which pointed at 3.9, and the python path, which was 3.10.
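The mismatch described here can be checked from inside Python itself; a minimal sketch (running pip as `python -m pip` instead of a bare `pip` sidesteps the PATH problem entirely):

```python
# Sketch: print which interpreter is running and its version, so you can
# compare it against the prefix that `pip --version` reports.
import sys


def interpreter_summary():
    """Return (executable path, 'major.minor') for this interpreter."""
    return sys.executable, f"{sys.version_info.major}.{sys.version_info.minor}"


path, version = interpreter_summary()
print(path, version)
```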

zz412000428 commented 1 year ago

@zoezhu God bless, it's successful

panta5 commented 1 year ago

I solved it by temporarily removing the --xformers flag. I'm penalized in speed, but so what.

Emekaborisama commented 1 year ago

I solved it by temporarily removing the --xformers flag. I'm penalized in speed, but so what.

Sorry, what did you remove? Can you elaborate?

Natotela commented 1 year ago
The command-line arguments. When you run webui.bat you can pass flags such as --no-half or, in many cases, --xformers, which instructs the UI to use the Python library xformers. He launched it without that flag, so that library isn't used.

jjangga0214 commented 1 year ago

This also happens with Apple Silicon (M1 Max, Ventura 13.3.1 (22E261)).

olim-ibragimov commented 1 year ago

In my case, I pinned the xformers version to 0.0.16rc425 in launch.py (line 228), and it seems to work.

nbollman commented 1 year ago

I'm running the vladmandic/automatic fork. I had to adjust some mismatched requirements (python/torch/torchvision/xformers intercompatibility) just to get the program to run, and I get a similar error. Ubuntu 22.04, Ryzen 5800X, RTX 3090.

xformers installed: ubuntu-22.04-py3.10-torch2.0.0+cu118

Launching launch.py... 14:32:14-702100 INFO Starting SD.Next
14:32:14-704433 INFO Python 3.10.10 on Linux
14:32:14-724191 INFO Version: 99dc75c0 Fri May 5 09:28:44 2023 -0400
14:32:15-339630 INFO Latest published version: f6898c9aec9c8b40b55de52e1bf1b4b83028897d 2023-05-05T17:40:53Z
14:32:15-341186 INFO Setting environment tuning
14:32:15-342024 INFO nVidia CUDA toolkit detected
14:32:16-086818 INFO Torch 2.0.1+cu118
14:32:16-096932 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
14:32:16-107578 INFO Torch detected GPU: NVIDIA GeForce RTX 3090 VRAM 24257 Arch (8, 6) Cores 82

...Blah blah, xformers loads, generation starts (the image shows in the preview), and then this errors out...

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float16)
     key         : shape=(1, 4096, 1, 512) (torch.float16)
     value       : shape=(1, 4096, 1, 512) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 512

Guess I should have bought the A100? I'm going to try to build it, but I'm not sure how to activate the correct Python venv for the project...

YoGalen commented 1 year ago

Problem appears to have been resolved by following the steps in #6871 (comment). Just delete venv folder and run webui with --xformers.

I've gone from getting 2.1s/it to 1.75s/it thanks to Xformers.

hello, where is venv directory?

Natotela commented 1 year ago

hello, where is venv directory

look in the stable-diffusion-webui dir for "venv"

wangchaofan2018 commented 1 year ago

I tried `pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers`. It works!

al-swaiti commented 1 year ago

Hi everyone. I fixed the issue for Linux users by editing the file "stable-diffusion-webui/modules/launch_utils.py": change the "xformers_package" line to the latest xformers package, as below:

> xformers_package = os.environ.get('XFORMERS_PACKAGE', 'xformers==0.0.21.dev543')

then relaunch with "./webui.sh --xformers". Windows users can try this too; try it before deleting the venv folder. If it doesn't work, rename or delete venv and relaunch the webui with the --xformers flag.
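For reference, the quoted line is a plain environment-variable override, so the pin can also be changed without editing the file by exporting XFORMERS_PACKAGE before launch. A minimal sketch of the mechanism (the version strings here are only illustrative):

```python
# Sketch of the os.environ.get override pattern used by launch_utils.py:
# the environment variable, when set, wins over the hard-coded default.
import os


def resolve_xformers_package(default: str = "xformers==0.0.16rc425") -> str:
    """Mimic os.environ.get('XFORMERS_PACKAGE', default)."""
    return os.environ.get("XFORMERS_PACKAGE", default)


os.environ["XFORMERS_PACKAGE"] = "xformers==0.0.21.dev543"
print(resolve_xformers_package())  # the override wins
del os.environ["XFORMERS_PACKAGE"]
print(resolve_xformers_package())  # falls back to the default
```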

mikikokato commented 1 year ago

How do I resolve this error? I uninstalled it and installed it again, but it doesn't solve the problem.

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    requires A100 GPU
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40

wasa4587 commented 1 year ago

Same here, RTX 3090 windows 10

theprogrammerknownasat commented 1 year ago

I have the same problem, 7900xtx, Fedora Linux

axel578 commented 11 months ago

Same problem, RTX 3090; it has occurred for me ever since SDXL came out.

yohyama0216 commented 11 months ago

I used Google Colab and tried !pip install --pre -U xformers. it works!

elBlacksmith commented 11 months ago

I used Google Colab and tried !pip install --pre -U xformers. it works!

Thanks, that worked!

sibidibidi commented 10 months ago

Windows 11, gtx 1660s, r5 5600, 16gb, same problem

inchinet commented 10 months ago

Win11, local PC (not Colab) SD install, same error. What is the solution? Delete the venv folder and run webui with --xformers?

XYZ-916 commented 9 months ago

I uninstalled xformers and then reinstalled it, which solved the problem.

Hemanthkumar2112 commented 7 months ago

Worked on the Kaggle platform: !pip install --pre -U xformers

Ashutoshgaruda commented 5 months ago

Try this: pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu118

feedsbrain commented 5 months ago

If anyone still has issues with xformers on macOS, here is what I did:

  1. Add --xformers to COMMANDLINE_ARGS in webui-user.sh
  2. Delete venv with rm -rf venv
  3. Run webui.sh with this command (using llvm from brew):
     brew install llvm
     CC=/usr/local/opt/llvm/bin/clang CXX=/usr/local/opt/llvm/bin/clang++ ./webui.sh
  4. xformers will be installed on the webui.sh launch

However, I think this will not work without CUDA. I'm looking at whether there are any alternatives to make it work with MPS.
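Since the CUDA-only kernels can never exist on Apple Silicon, it may save time to check which accelerator the installed torch actually offers before chasing xformers builds; a hedged, stdlib-plus-torch sketch (the function name is mine):

```python
# Sketch: report the best accelerator the installed torch exposes;
# on M1/M2 Macs the answer is 'mps' at best, never 'cuda'.
import importlib.util


def available_accelerator():
    """Return 'cuda', 'mps', 'cpu', or None when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(available_accelerator())
```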