AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Not working with xformers #6846

Open nicolas-dufour opened 1 year ago

nicolas-dufour commented 1 year ago

Is there an existing issue for this?

What happened?

When running the UI with python launch.py --xformers, I get the following error when generating an image:

RuntimeError: shape '[8192, 1, 5]' is invalid for input of size 2621440

The error does not occur without the --xformers flag.
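
For context, the numbers in the error are consistent: the target shape 8192 × 1 × 5 holds only 40,960 values, while the input actually contains 2,621,440 = 8192 × 320 values, and 320 is 5 heads × 64 channels per head, the attention layout of the first SD 2.x UNet block. A minimal PyTorch sketch of that arithmetic (the [8192, 1, 320] query shape is assumed for illustration, not taken from the report) raises the same message:

import torch

# Assumed query layout: 8192 flattened tokens with 320 channels (5 heads x 64 dims each).
q = torch.randn(8192, 1, 320)
print(q.numel())         # 2621440
q.reshape([8192, 1, 5])  # RuntimeError: shape '[8192, 1, 5]' is invalid for input of size 2621440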

Steps to reproduce the problem

  1. launch with python launch.py --xformers
  2. Use model 2.1 v 7689
  3. Generate an image

What should have happened?

The model should have run smoothly with xformers.

Commit where the problem happens

38b7186e6e3a4dffc93225308b822f0dae43a47d

What platforms do you use to access the UI?

Linux

What browsers do you use to access the UI?

Google Chrome

Command Line Arguments

python launch.py --xformers

Additional information, context and logs

No response

ecce commented 1 year ago

Same here. However, I'm actually happy to have that problem after a marathon of trying to get xformers that far.

Ubuntu 22.04, CUDA 11.7, Nvidia driver 525

TORCH_COMMAND="pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117"
FORCE_CUDA=1 pip install git+https://github.com/facebookresearch/xformers.git@v0.0.13#egg=xformers
git @6faae2323963f9b0e0086a85b9d0472a24fbaa73

Installing collected packages: xformers
  Attempting uninstall: xformers
    Found existing installation: xformers 0.0.12
    Uninstalling xformers-0.0.12:
      Successfully uninstalled xformers-0.0.12
Successfully installed xformers-0.0.14.dev0

Without xformers everything is fine. Then when running with --xformers and prompt "a ball" with sd-v2.1-512:

  0%|                                                                                                                                                                                 | 0/20 [00:00<?, ?it/s]
Error completing request
Arguments: ('task(vxo6rs49iu6032j)', 'a ball', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 0, False, False, False, False, '', 1, '', 0, '', True, False, False) {}
Traceback (most recent call last):
  File "/home/mo/ai/sd/webui.test/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/home/mo/ai/sd/webui.test/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/home/mo/ai/sd/webui.test/modules/txt2img.py", line 52, in txt2img
    processed = process_images(p)
  File "/home/mo/ai/sd/webui.test/modules/processing.py", line 480, in process_images
    res = process_images_inner(p)
  File "/home/mo/ai/sd/webui.test/modules/processing.py", line 609, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/home/mo/ai/sd/webui.test/modules/processing.py", line 801, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/home/mo/ai/sd/webui.test/modules/sd_samplers.py", line 544, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/mo/ai/sd/webui.test/modules/sd_samplers.py", line 447, in launch_sampling
    return func()
  File "/home/mo/ai/sd/webui.test/modules/sd_samplers.py", line 544, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/modules/sd_samplers.py", line 337, in forward
    x_out = self.inner_model(x_in, sigma_in, cond={"c_crossattn": [cond_in], "c_concat": [image_cond_in]})
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
    h = module(h, emb, context)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
    x = block(x, context=context[i])
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/modules/sd_hijack_checkpoint.py", line 4, in BasicTransformerBlock_forward
    return checkpoint(self._forward, x, context)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/home/mo/ai/sd/webui.test/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 262, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mo/ai/sd/webui.test/modules/sd_hijack_optimizations.py", line 293, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None)
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/xformers/ops.py", line 574, in memory_efficient_attention
    return op.forward_no_grad(
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/xformers/ops.py", line 315, in forward_no_grad
    return cls.forward(
  File "/home/mo/ai/sd/webui/venv/lib/python3.10/site-packages/xformers/ops.py", line 353, in forward
    query = query.reshape([batch * seqlen_q, 1, head_dim_q])
RuntimeError: shape '[8192, 1, 5]' is invalid for input of size 2621440

atensity commented 1 year ago

You could try installing xformers via pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers. The current head/main version is at 0.0.16, and it seems you're pulling 0.0.13 or similar. You could also check the output of python -m xformers.info.
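
Spelled out, with both commands exactly as suggested (run inside the webui's Python environment):

pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
python -m xformers.info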

ecce commented 1 year ago

Indeed, installing the latest xformers solved my problem. FORCE_CUDA=1 is key, by the way, and CUDA_HOME also needs to be set if CUDA is installed outside the default path. Thanks!

$ python3 -m xformers.info
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
xFormers 0.0.16+bc08bbc.d20230124
memory_efficient_attention.cutlassF:               available
memory_efficient_attention.cutlassB:               available
memory_efficient_attention.flshattF:               available
memory_efficient_attention.flshattB:               available
memory_efficient_attention.smallkF:                available
memory_efficient_attention.smallkB:                available
memory_efficient_attention.tritonflashattF:        unavailable
memory_efficient_attention.tritonflashattB:        unavailable
swiglu.fused.p.cpp:                                available
is_triton_available:                               False
is_functorch_available:                            False
pytorch.version:                                   1.13.1+cu117
pytorch.cuda:                                      available
gpu.compute_capability:                            8.6
gpu.name:                                          NVIDIA GeForce RTX 3080 Laptop GPU
build.info:                                        available
build.cuda_version:                                1107
build.python_version:                              3.10.6
build.torch_version:                               1.13.1+cu117
build.env.TORCH_CUDA_ARCH_LIST:                    None
build.env.XFORMERS_BUILD_TYPE:                     None
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS:        None
build.env.NVCC_FLAGS:                              None
build.env.XFORMERS_PACKAGE_FROM:                   None
source.privacy:                                    open source
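
Put together, the recipe that ended up working here looks roughly like this (the CUDA path is only an example, and CUDA_HOME is only needed when CUDA is installed outside the default location):

export FORCE_CUDA=1
export CUDA_HOME=/usr/local/cuda-11.7   # example path, adjust to the local CUDA 11.7 install
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers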

AshtakaOOf commented 1 year ago

Is anyone still having these issues? The latest PyPI version of xformers no longer crashes generation for me, so if anyone here is still having issues with xformers, try installing the latest release.
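
For example, from inside the webui's virtual environment (assuming the venv layout shown in the tracebacks above):

source venv/bin/activate
pip install -U xformers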