Cyberes / xformers-compiled

xformers compiled for specific graphics cards.
MIT License

A4000 No such operator xformers_flash::flash_fwd #8

Open · pbuyle opened this issue 1 year ago

pbuyle commented 1 year ago

When using https://raw.githubusercontent.com/Cyberes/xformers-compiled/main/a4000/xformers-0.0.16%2B6f3c20f.d20230127-cp39-cp39-linux_x86_64.whl in a Gradient Notebook running the Stable Diffusion webui, I get the following error whenever I increase the batch size. The same batch size works fine with the older 0.0.14 release of the wheel.

```
Traceback (most recent call last):
  File "/notebooks/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/notebooks/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/notebooks/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/notebooks/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/notebooks/stable-diffusion-webui/modules/processing.py", line 628, in process_images_inner
    samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
  File "/notebooks/stable-diffusion-webui/modules/processing.py", line 828, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/notebooks/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 323, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/notebooks/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 221, in launch_sampling
    return func()
  File "/notebooks/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 323, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 116, in forward
    x_out = self.inner_model(x_in, sigma_in, cond={"c_crossattn": [cond_in], "c_concat": [image_cond_in]})
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/notebooks/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/notebooks/stable-diffusion-webui/modules/sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1329, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 776, in forward
    h = module(h, emb, context)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 324, in forward
    x = block(x, context=context[i])
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 259, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 114, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 129, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/notebooks/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 262, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/notebooks/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 342, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 197, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 293, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 313, in _memory_efficient_attention_forward
    out, *_ = op.apply(inp, needs_gradient=False)
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/flash.py", line 240, in apply
    out, softmax_lse = cls.OPERATOR(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/common.py", line 13, in no_such_operator
    raise RuntimeError(
RuntimeError: No such operator xformers_flash::flash_fwd - did you forget to build xformers with `python setup.py develop`?
```
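
A quick way to confirm whether an installed xformers build actually ships the flash-attention kernels is to request the flash op explicitly instead of letting the dispatcher pick a fallback. A minimal sketch, assuming a CUDA device and the xformers 0.0.16 API (`MemoryEfficientAttentionFlashAttentionOp` is the same op tuple the webui's `get_xformers_flash_attention_op` resolves to):

```python
import torch
import xformers.ops as xops

# Small fp16 CUDA tensors in the (batch, seq_len, num_heads, head_dim)
# layout that memory_efficient_attention expects; the flash kernel wants
# fp16/bf16 inputs and a head_dim that is a multiple of 8.
q = k = v = torch.randn(1, 16, 8, 40, device="cuda", dtype=torch.float16)

try:
    # Force the flash-attention implementation rather than letting the
    # dispatcher fall back to another kernel. On a minified build this
    # raises the same "No such operator" RuntimeError as above.
    xops.memory_efficient_attention(
        q, k, v, op=xops.MemoryEfficientAttentionFlashAttentionOp
    )
    print("flash_fwd is available in this build")
except RuntimeError as err:
    print(f"flash_fwd is missing: {err}")
```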
Cyberes commented 1 year ago

Yeah, I built that one using the minified version so it doesn't have that feature. Just use https://raw.githubusercontent.com/Cyberes/xformers-compiled/main/a4000/full/xformers-0.0.16%2B6f3c20f.d20230130-cp39-cp39-linux_x86_64.whl until I get around to rebuilding them.
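
To swap the wheel from inside the notebook, something along these lines should work (a sketch, not the repo's official instructions; `--no-deps` just keeps pip from touching the pinned torch install):

```python
import subprocess
import sys

# Full A4000 wheel (includes the flash_fwd operator); the minified wheel omits it.
WHEEL = (
    "https://raw.githubusercontent.com/Cyberes/xformers-compiled/main/a4000/full/"
    "xformers-0.0.16%2B6f3c20f.d20230130-cp39-cp39-linux_x86_64.whl"
)

# --force-reinstall replaces the already-installed minified 0.0.16 build;
# --no-deps avoids reinstalling torch and friends alongside it.
subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "--force-reinstall", "--no-deps", WHEEL]
)
```

Restart the kernel (or relaunch the webui) afterwards so the replacement build is actually imported.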