AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs: #7973

Open zhan-cn opened 1 year ago

zhan-cn commented 1 year ago

Is there an existing issue for this?

What happened?

When I run `.\webui.bat --xformers` or `.\webui.bat --xformers --no-half --medvram`, I hit this bug: `NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:`

Steps to reproduce the problem

1. Run `.\webui.bat --xformers --no-half --medvram`
2. Open http://127.0.0.1:7860/ in the browser
3. Choose a JPG in img2img, then click Generate

What should have happened?

A JPEG image should be generated.

Commit where the problem happens

076d624a297532d6e4abebe5807fd7c7504d7a73

What platforms do you use to access the UI ?

Windows

What browsers do you use to access the UI ?

Microsoft Edge

Command Line Arguments

no

List of extensions

no

Console logs

(venv) PS C:\stable-diffusion-webui> .\webui.bat --xformers
venv "C:\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 076d624a297532d6e4abebe5807fd7c7504d7a73
Installing requirements for Web UI
Launching Web UI with arguments: --xformers
Loading weights [d635794c1f] from C:\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.ckpt
Creating model from config: C:\stable-diffusion-webui\models\Stable-diffusion\512-base-ema.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 865.91 M params.
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(0):
Model loaded in 8.6s (load weights from disk: 3.3s, create model: 0.3s, apply weights to model: 1.0s, apply half(): 1.1s, move model to device: 0.9s, load textual inversion embeddings: 1.8s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Error completing request
Arguments: ('task(08ryrh9mj92xxtq)', 0, '', '', [], <PIL.Image.Image image mode=RGBA size=1099x1099 at 0x1F472FD9750>, None, None, None, None, None, None, 20, 0, 4, 0, 1, False, False, 1, 1, 7, 1.5, 0.75, -1.0, -1.0, 0, 0, 0, False, 512, 512, 0, 0, 32, 0, '', '', '', [], 0, '<ul>\n<li><code>CFG Scale</code> should be 2 or lower.</li>\n</ul>\n', True, True, '', '', True, 50, True, 1, 0, False, 4, 1, 'None', '<p style="margin-bottom:0.75em">Recommended settings: Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0.8</p>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up', 'down'], False, False, 'positive', 'comma', 0, False, False, '', '<p style="margin-bottom:0.75em">Will upscale the image by the selected scale factor; use width and height sliders to set tile size</p>', 64, 0, 2, 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
  File "C:\stable-diffusion-webui\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "C:\stable-diffusion-webui\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "C:\stable-diffusion-webui\modules\img2img.py", line 171, in img2img
    processed = process_images(p)
  File "C:\stable-diffusion-webui\modules\processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "C:\stable-diffusion-webui\modules\processing.py", line 577, in process_images_inner
    p.init(p.all_prompts, p.all_seeds, p.all_subseeds)
  File "C:\stable-diffusion-webui\modules\processing.py", line 1017, in init
    self.init_latent = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(image))
  File "C:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "C:\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 830, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\autoencoder.py", line 83, in encode
    h = self.encoder(x)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 536, in forward
    h = self.mid.attn_1(h)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\model.py", line 258, in forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 196, in memory_efficient_attention
    return _memory_efficient_attention(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 292, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 308, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\dispatch.py", line 98, in _dispatch_fw
    return _run_priority_list(
  File "C:\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\dispatch.py", line 73, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float16)
     key         : shape=(1, 4096, 1, 512) (torch.float16)
     value       : shape=(1, 4096, 1, 512) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    triton is not available
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 512
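For intuition, the dispatcher's rejection logic above can be modeled in plain Python. This is a simplified sketch, not the actual xformers dispatch code; the checks and thresholds are taken from the `smallkF` lines of the error message:

```python
# Simplified model of why xformers rejects the smallkF attention kernel above.
# Not the real dispatch code - just the checks visible in the error message.
def smallk_rejections(embed_per_head: int, dtype: str, cuda_built: bool) -> list:
    reasons = []
    if not cuda_built:
        reasons.append("xFormers wasn't built with CUDA support")
    if dtype != "float32":
        reasons.append(f"dtype={dtype} (supported: float32)")
    if embed_per_head > 32:
        reasons.append(f"unsupported embed per head: {embed_per_head}")
    return reasons

# The failing call: embed per head 512, fp16 tensors, a CPU-only xformers wheel.
print(smallk_rejections(512, "float16", cuda_built=False))
```

The `NotImplementedError` is raised only when every candidate kernel (`cutlassF`, `flshattF`, `tritonflashattF`, `smallkF`) has at least one rejection reason, which is why a CPU-only wheel fails no matter what the tensor shapes are.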

Additional information

I have already rebuilt xformers. I think the cause may be my GPU, a GTX 1650 4 GB.

Setmaster commented 1 year ago

Having the same issue, RTX 3090.

Neefay commented 1 year ago

Same here, RTX 3080.

PandaBearz commented 1 year ago

same

SylveonBottle commented 1 year ago

Same issue, 1660 Ti. I've tried pip install, manual building, the xformers Windows installation from the wiki, and the xformers reinstall argument; nothing's working.

SylveonBottle commented 1 year ago

Problem appears to have been resolved by following the steps in https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/6871#issuecomment-1416400288. Just delete venv folder and run webui with --xformers.

I've gone from getting 2.1s/it to 1.75s/it thanks to Xformers.
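The "delete venv" step above can be sketched as a small helper, assuming the default layout where the `venv` folder sits inside the webui checkout (the path handling here is illustrative):

```python
import shutil
from pathlib import Path

def reset_venv(webui_dir: str) -> bool:
    """Delete the webui's venv so the launcher rebuilds it (and a matching
    xformers wheel) on the next start. Returns True if a venv was removed."""
    venv = Path(webui_dir) / "venv"
    if venv.is_dir():
        shutil.rmtree(venv)
        return True
    return False

# Afterwards, relaunch with the flag so xformers is reinstalled:
#   .\webui.bat --xformers      (Windows)
#   ./webui.sh --xformers       (Linux/macOS)
```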

CreamyLong commented 1 year ago

Having the same issue, Win10, RTX 3060, CUDA 11.1.

Natotela commented 1 year ago

appears to have been resolved by following the steps in #6871 (comment). Just delete venv folder and run webui with --xformers

I deleted the xformers site-packages from the venv and pip-installed xformers==0.0.16, but when I ran the webui, it just installed the venv xformers back (xformers-0.0.16rc425.dist-info).

Build cuda_11.8.r11.8/compiler.31833905_0, RTX 3070

zz412000428 commented 1 year ago

Same here, Ubuntu 18.04, RTX 3080 Ti, CUDA 12.1.

zoezhu commented 1 year ago

I hit this issue too. In my case, I pulled new code and launched the webui; the new launch.py installed xformers again, as version xformers==0.0.16rc425, which was not compiled for my CUDA version. I uninstalled it and installed xformers from source again, and everything works now. I also commented out the line run_pip(f"install -r \"{requirements_file}\"", "requirements for Web UI") in launch.py so it doesn't get reinstalled.

Natotela commented 1 year ago

Also, it's good advice to check that everything runs on one Python version. I had a mismatch between the pip path, which pointed to Python 3.9, and the Python path, which was 3.10.
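One quick way to check for that kind of mismatch (a sketch; the printed paths depend on your setup) is to invoke pip through the same interpreter you launch the webui with:

```python
import subprocess
import sys

# sys.executable is the interpreter actually running this script, and
# "python -m pip" is guaranteed to be the pip belonging to that interpreter,
# so the two printed versions cannot silently diverge.
print("python:", sys.executable, sys.version.split()[0])
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True, text=True,
)
print("pip   :", result.stdout.strip())
```

If the pip on your PATH reports a different Python than the one above, packages like xformers end up installed for the wrong interpreter.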

zz412000428 commented 1 year ago

@zoezhu God bless, it's successful

panta5 commented 1 year ago

I solved it by temporarily removing the --xformers flag. I'm penalized in speed, but so what.

Emekaborisama commented 1 year ago

I solved it by temporarily removing the --xformers flag. I'm penalized in speed, but so what.

Sorry, what did you remove? Can you elaborate?

Natotela commented 1 year ago
Args. When you run webui.bat you can pass flags (command-line arguments) such as --no-half or, in many cases, --xformers, which tells the UI to use the Python library xformers. He launched it without that flag, so the library isn't used.

jjangga0214 commented 1 year ago

This also happens with Apple Silicon (M1 Max, Ventura 13.3.1 (22E261)).

olimjon-ibragimov commented 1 year ago

In my case, I set the version of xformers to 0.0.16rc425 in launch.py (line 228). And it seems to work.

nbollman commented 1 year ago

I'm running the vladmandic/automatic fork. I had to adjust some mismatched requirements just to get the program to run (python/torch/torchvision/xformers intercompatibility), and I'm getting a similar error. Ubuntu 22.04, Ryzen 5800X, RTX 3090.

xformers installed: ubuntu-22.04-py3.10-torch2.0.0+cu118

Launching launch.py... 14:32:14-702100 INFO Starting SD.Next
14:32:14-704433 INFO Python 3.10.10 on Linux
14:32:14-724191 INFO Version: 99dc75c0 Fri May 5 09:28:44 2023 -0400
14:32:15-339630 INFO Latest published version: f6898c9aec9c8b40b55de52e1bf1b4b83028897d 2023-05-05T17:40:53Z
14:32:15-341186 INFO Setting environment tuning
14:32:15-342024 INFO nVidia CUDA toolkit detected
14:32:16-086818 INFO Torch 2.0.1+cu118
14:32:16-096932 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
14:32:16-107578 INFO Torch detected GPU: NVIDIA GeForce RTX 3090 VRAM 24257 Arch (8, 6) Cores 82

...Blah blah, xformers loaded, generation starts (images show in the preview), and then this errors out...

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float16)
     key         : shape=(1, 4096, 1, 512) (torch.float16)
     value       : shape=(1, 4096, 1, 512) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 512

Guess I shoulda bought the A100? I'm gonna try to build it, but I'm not sure how to activate the correct Python venv for the project...

YoGalen commented 1 year ago

Problem appears to have been resolved by following the steps in #6871 (comment). Just delete venv folder and run webui with --xformers.

I've gone from getting 2.1s/it to 1.75s/it thanks to Xformers.

hello, where is venv directory?

Natotela commented 1 year ago

hello, where is venv directory

look in the stable-diffusion-webui dir for "venv"

wangchaofan2018 commented 1 year ago

I tried pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers. It works!

al-swaiti commented 1 year ago

Hi everyone. I fixed the issue for Linux users by editing "stable-diffusion-webui/modules/launch_utils.py": change the "xformers_package" line to the latest xformers package, as below, and relaunch with "./webui.sh --xformers":

xformers_package = os.environ.get('XFORMERS_PACKAGE', 'xformers==0.0.21.dev543')

Windows users can try this too. Try it before deleting the venv folder; if it doesn't work, rename or delete venv and relaunch the webui with the xformers flag.
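Since that line reads the pin through os.environ.get, setting the XFORMERS_PACKAGE environment variable before launching should achieve the same thing without editing the file. A sketch of the lookup (the default pin shown is the 0.0.16rc425 version mentioned earlier in this thread, used here as an assumed placeholder):

```python
import os

# Same pattern as the launch_utils.py line quoted above: the env var,
# if set, overrides the hard-coded default pin.
def resolve_xformers_package(default: str = "xformers==0.0.16rc425") -> str:
    return os.environ.get("XFORMERS_PACKAGE", default)

# e.g.  XFORMERS_PACKAGE='xformers==0.0.21.dev543' ./webui.sh --xformers
os.environ["XFORMERS_PACKAGE"] = "xformers==0.0.21.dev543"
print(resolve_xformers_package())
```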

mikikokato commented 9 months ago

How do I resolve this error? I uninstalled it and installed it again, but it doesn't solve the problem.

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    requires A100 GPU
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40

wasa4587 commented 8 months ago

Same here, RTX 3090 windows 10

theprogrammerknownasat commented 8 months ago

I have the same problem, 7900xtx, Fedora Linux

axel578 commented 7 months ago

Same problem, RTX 3090; for me it has occurred since the beginning of SDXL.

yohyama0216 commented 7 months ago

I used Google Colab and tried !pip install --pre -U xformers. it works!

elBlacksmith commented 7 months ago

I used Google Colab and tried !pip install --pre -U xformers. it works!

Thanks, that worked!

sibidibidi commented 6 months ago

Windows 11, gtx 1660s, r5 5600, 16gb, same problem

inchinet commented 6 months ago

Win11, local PC (not Colab) SD install, same error. What is the solution? Delete the venv folder and run webui with --xformers?

XYZ-916 commented 5 months ago

I uninstalled xformers and then reinstalled it, which solved the problem.

Hemanthkumar2112 commented 3 months ago

Worked on the Kaggle platform: !pip install --pre -U xformers

Ashutoshgaruda commented 1 month ago

Try this: pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu118

feedsbrain commented 1 month ago

If anyone still has issues with xformers on macOS, here is what I did:

  1. Add --xformers to COMMANDLINE_ARGS in webui-user.sh
  2. Delete the venv with rm -rf venv
  3. Install llvm from brew and run webui.sh with it:
     brew install llvm
     CC=/usr/local/opt/llvm/bin/clang CXX=/usr/local/opt/llvm/bin/clang++ ./webui.sh
  4. xformers will be installed on the webui.sh launch

However, I think this will not work without CUDA. I'm looking for an alternative to make it work with MPS.