AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

Xformers fails to install on linux with a GTX 1650 #2303

Closed. drax-xard closed this issue 1 year ago.

drax-xard commented 1 year ago

Describe the bug

Bash output:

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on user user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Clone or update stable-diffusion-webui
################################################################
remote: Enumerating objects: 17, done.
remote: Counting objects: 100% (17/17), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 17 (delta 8), reused 7 (delta 5), pack-reused 0
Unpacking objects: 100% (17/17), 378.38 KiB | 2.03 MiB/s, done.
From https://github.com/AUTOMATIC1111/stable-diffusion-webui
   2f6ea2f..6be32b3  master                     -> origin/master
   5f33173..66ec505  embed-embeddings-in-images -> origin/embed-embeddings-in-images
Updating 2f6ea2f..6be32b3
Fast-forward
 modules/hypernetworks/ui.py     | 2 +-
 modules/textual_inversion/ui.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0]
Commit hash: 6be32b31d181e42c639dad3451229aa7b9cfd1cf
Installing xformers
Traceback (most recent call last):
  File "/home/user/stable-diffusion-webui/launch.py", line 168, in <module>
    prepare_enviroment()
  File "/home/user/stable-diffusion-webui/launch.py", line 133, in prepare_enviroment
    run_pip("install xformers", "xformers")
  File "/home/user/stable-diffusion-webui/launch.py", line 60, in run_pip
    return run(f'"{python}" -m pip {args} --prefer-binary', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}")
  File "/home/user/stable-diffusion-webui/launch.py", line 32, in run
    raise RuntimeError(message)
RuntimeError: Couldn't install xformers.
Command: "/home/user/stable-diffusion-webui/venv/bin/python3" -m pip install xformers --prefer-binary
Error code: 1
stdout: Collecting xformers
  Using cached xformers-0.0.13.tar.gz (292 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'error'

stderr: error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "/tmp/pip-install-5tiej0yr/xformers_85e3b7463416425facb77f69ff25b085/setup.py", line 239, in <module>
        ext_modules=get_extensions(),
      File "/tmp/pip-install-5tiej0yr/xformers_85e3b7463416425facb77f69ff25b085/setup.py", line 157, in get_extensions
        raise RuntimeError(
    RuntimeError: CUTLASS submodule not found. Did you forget to run `git submodule update --init --recursive` ?
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
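
For context: pip fell back to the xformers 0.0.13 source tarball, which does not bundle the CUTLASS submodule, so its setup.py aborts. Building from a recursive clone inside the webui venv should sidestep that; an untested sketch, run from the stable-diffusion-webui directory:

# build xformers from a recursive clone so CUTLASS is present
source venv/bin/activate
git clone --recursive https://github.com/facebookresearch/xformers.git
pip install ./xformers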


Additional context

Tried to install manually using a wheel (had to rename it for Python 3.10). It sort of installed, but when I tried to generate, it failed with an error about "Triton" (?)

drax-xard commented 1 year ago

When I install using the renamed binary wheel, this is the output at launch, with the error:

Launching Web UI with arguments: --opt-split-attention --medvram --xformers
WARNING:root:A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
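
Note: the Triton warning by itself is harmless; xformers prints it when the optional triton package is missing and simply skips those kernels. A quick sanity check of what actually got installed, as a sketch assuming the default venv layout:

# print the installed xformers and torch/CUDA versions from inside the venv
./venv/bin/python -c "import xformers; print(xformers.__version__)"
./venv/bin/python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.get_device_name(0))"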

And for the failed generation:

0%|          | 0/20 [00:01<?, ?it/s]
Error completing request
Arguments: ('sad', '', 'None', 'None', 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, False, 0.7, 0, False, False, None, '', '', False, [], [], [], [], 1, '', 0, '', True, False) {}
Traceback (most recent call last):
  File "/home/user/stable-diffusion-webui/modules/ui.py", line 182, in f
    res = list(func(*args, **kwargs))
  File "/home/user/stable-diffusion-webui/webui.py", line 69, in f
    res = func(*args, **kwargs)
  File "/home/user/stable-diffusion-webui/modules/txt2img.py", line 43, in txt2img
    processed = process_images(p)
  File "/home/user/stable-diffusion-webui/modules/processing.py", line 404, in process_images
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength)
  File "/home/user/stable-diffusion-webui/modules/processing.py", line 532, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning)
  File "/home/user/stable-diffusion-webui/modules/sd_samplers.py", line 409, in sample
    samples = self.func(self.model_wrap_cfg, x, extra_args={'cond': conditioning, 'uncond': unconditional_conditioning, 'cond_scale': p.cfg_scale}, disable=False, callback=self.callback_state, **extra_params_kwargs)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 80, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/modules/sd_samplers.py", line 245, in forward
    x_out[a:b] = self.inner_model(x_in[a:b], sigma_in[a:b], cond=cond_in[a:b])
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 987, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1148, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion/ddpm.py", line 1410, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 732, in forward
    h = module(h, emb, context)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/openaimodel.py", line 85, in forward
    x = layer(x, context)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 258, in forward
    x = block(x, context=context)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 209, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 127, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/home/user/stable-diffusion-webui/repositories/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 227, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/xformers/ops.py", line 574, in memory_efficient_attention
    return op.forward_no_grad(
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/xformers/ops.py", line 189, in forward_no_grad
    return cls.FORWARD_OPERATOR(
  File "/home/user/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_ops.py", line 143, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: Expected query.dim() == 3 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)

dogarrowtype commented 1 year ago

Had this exact issue with a 2070 Super on Arch Linux. Spent an hour or so troubleshooting. Figured out it was the version of xformers: it needs to be compiled from the current dev build to work, and that can be done with pip using:

sudo pip3 install git+https://github.com/facebookresearch/xformers.git#egg=xformers
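
One caveat with that command: sudo pip3 installs into the system Python, while the webui runs from its own venv. Targeting the venv's pip instead would look like this (same idea, untested here):

# build and install into the webui venv rather than system site-packages
./venv/bin/pip install git+https://github.com/facebookresearch/xformers.git#egg=xformers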

drax-xard commented 1 year ago

Ok, I uninstalled xformers, then did the build with your command. It installed OK and launched without errors, but when I went to generate, it failed with this output:

NotImplementedError: Could not run 'xformers::efficient_attention_forward_cutlass' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'xformers::efficient_attention_forward_cutlass' is only available for these backends: [UNKNOWN_TENSOR_TYPE_ID, QuantizedXPU, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseCPU, SparseCUDA, SparseHIP, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseVE, UNKNOWN_TENSOR_TYPE_ID, NestedTensorCUDA, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID].

BackendSelect: fallthrough registered at ../aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:133 [backend fallback]
Named: registered at ../aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at ../aten/src/ATen/ConjugateFallback.cpp:18 [backend fallback]
Negative: registered at ../aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at ../aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:64 [backend fallback]
AutogradOther: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
AutogradCPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
AutogradCUDA: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
AutogradXLA: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:51 [backend fallback]
AutogradMPS: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:59 [backend fallback]
AutogradXPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
AutogradHPU: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:68 [backend fallback]
AutogradLazy: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:55 [backend fallback]
Tracer: registered at ../torch/csrc/autograd/TraceTypeManual.cpp:295 [backend fallback]
AutocastCPU: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:481 [backend fallback]
Autocast: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:324 [backend fallback]
Batched: registered at ../aten/src/ATen/BatchingRegistrations.cpp:1064 [backend fallback]
VmapMode: fallthrough registered at ../aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
Functionalize: registered at ../aten/src/ATen/FunctionalizeFallbackKernel.cpp:89 [backend fallback]
PythonTLSSnapshot: registered at ../aten/src/ATen/core/PythonFallbackKernel.cpp:137 [backend fallback]
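
If I'm reading that right, the build registered no CUDA kernel for the op at all, i.e. the extension compiled without kernels for this GPU. Pinning the target architecture at build time is a common fix; an untested sketch (7.5 is the compute capability of both the GTX 1650 and the 2070 Super):

# rebuild with the GPU architecture pinned, inside the webui venv
source venv/bin/activate
pip uninstall -y xformers
TORCH_CUDA_ARCH_LIST="7.5" pip install -v git+https://github.com/facebookresearch/xformers.git#egg=xformers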

Thanks for the tip though, will keep troubleshooting.

drax-xard commented 1 year ago

Solved it: I downloaded the wheel directly from C43H66N12O12S2's repo (https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/linux/xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl), installed it, and it launched and generated OK. Not great speed gains (3.12 s/it versus 3.6 s/it), but at least it works.
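
For anyone hitting the same thing, the whole fix amounts to this (a sketch, assuming the default install layout):

# install the prebuilt Turing-compatible wheel into the venv, then relaunch
./venv/bin/pip install https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/linux/xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl
./venv/bin/python launch.py --opt-split-attention --medvram --xformers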