sagiodev / stablediffusion_webui


No operator found on Colab? #5

Closed quintendewilde closed 1 year ago

quintendewilde commented 1 year ago

What's going wrong? I'm running with an A100 on Colab Pro.

Error: 'No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 8, 40) (torch.float16)
     key         : shape=(1, 4096, 8, 40) (torch.float16)
     value       : shape=(1, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    requires A100 GPU
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40'.
Check your schedules/ init values please. Also make sure you don't have a backwards slash in any of your PATHs - use / instead of \. Full error message is in your terminal/ cli.
Time taken: 1m 0.47s
Torch active/reserved: 5885/5918 MiB, Sys VRAM: 7407/16151 MiB (45.86%)

andrewssdd commented 1 year ago

I cannot reproduce this error. Are you using the latest notebook? (Updated 7/12). If so, please list the steps to reproduce this error with a fresh AI_PICS directory.

quintendewilde commented 1 year ago

Should these be uncommented?

def installxformers():
  !pip install -q https://github.com/camenduru/stable-diffusion-webui-colab/releases/download/0.0.16/xformers-0.0.16+814314d.d20230118-cp38-cp38-linux_x86_64.whl
  %pip install --no-deps -q https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl
  %pip install -q xformers
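
A side note on the pinned wheels above: both filenames carry a `cp38` tag, which marks a build for CPython 3.8, while the traceback paths later in this thread show `python3.10`. Below is a minimal sketch of that check, assuming the standard wheel naming scheme (`...-{python tag}-{abi tag}-{platform tag}.whl`). The helper names are hypothetical, and only CPython-style tags such as `cp38` are handled.

```python
import sys

# Wheel URL copied from the notebook cell quoted above.
WHEEL_URL = (
    "https://github.com/camenduru/stable-diffusion-webui-colab/releases/"
    "download/0.0.16/xformers-0.0.16+814314d.d20230118-cp38-cp38-linux_x86_64.whl"
)


def wheel_python_tag(url_or_name):
    """Return the Python tag (e.g. 'cp38') from a wheel filename or URL."""
    name = url_or_name.rsplit("/", 1)[-1]
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    # The last three dash-separated fields are python tag, abi tag, platform tag.
    parts = name[: -len(".whl")].rsplit("-", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected wheel filename: {name}")
    return parts[1]


def wheel_matches_interpreter(url_or_name, version_info=sys.version_info):
    """True if the wheel's CPython tag equals the given (major, minor) version."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(url_or_name) == expected


if __name__ == "__main__":
    print("wheel tag:", wheel_python_tag(WHEEL_URL))
    print("matches this interpreter:", wheel_matches_interpreter(WHEEL_URL))
```

If the tag does not match the running interpreter, pip will normally refuse the wheel, so uncommenting those lines on a Python 3.10 runtime is unlikely to help.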

quintendewilde commented 1 year ago

@ctawong I am using the latest.

> I cannot reproduce this error. Are you using the latest notebook? (Updated 7/12). If so, please list the steps to reproduce this error with a fresh AI_PICS directory.

I reinstalled it in a new folder and I see this error message:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.13.1+cu117)
    Python  3.10.11 (you have 3.10.12)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
ngrok authtoken detected, trying to connect...
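
The warning above already names the likely cause: the installed xFormers binary was built for PyTorch 2.0.1+cu118, but the runtime has PyTorch 1.13.1+cu117. Below is a minimal sketch of pulling those pairs out of the warning text, assuming the line layout shown above. The regular expression and function names are illustrative and not part of xFormers.

```python
import re

# Excerpt of the warning pasted above.
WARNING = """\
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.13.1+cu117)
    Python  3.10.11 (you have 3.10.12)
"""

# Matches lines shaped like "<name> <built-for> ... (you have <installed>)".
_MISMATCH = re.compile(
    r"^\s*(PyTorch|Python)\s+(\S+).*?\(you have (\S+)\)", re.MULTILINE
)


def parse_mismatches(text):
    """Map component name -> (built_for, installed) from the warning text."""
    return {m.group(1): (m.group(2), m.group(3)) for m in _MISMATCH.finditer(text)}


def major_minor(version):
    """'2.0.1+cu118' -> (2, 0); drops the patch level and local suffix."""
    release = version.split("+", 1)[0]
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


if __name__ == "__main__":
    for name, (built, installed) in parse_mismatches(WARNING).items():
        same = major_minor(built) == major_minor(installed)
        print(f"{name}: built for {built}, installed {installed}, "
              f"same major.minor: {same}")
```

Here the Python versions differ only at the patch level, while PyTorch differs at the major version, which is the mismatch worth fixing first.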

I get the same issue after a clean install.

Full traceback:

*START OF TRACEBACK*
Traceback (most recent call last):
  File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/run_deforum.py", line 78, in run_deforum
    render_animation(args, anim_args, video_args, parseq_args, loop_args, controlnet_args, root)
  File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/render.py", line 548, in render_animation
    image = generate(args, keys, anim_args, loop_args, controlnet_args, root, parseq_adapter, frame_idx, sampler_name=scheduled_sampler_name)
  File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/generate.py", line 56, in generate
    return generate_inner(args, keys, anim_args, loop_args, controlnet_args, root, parseq_adapter, frame, sampler_name)
  File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/generate.py", line 210, in generate_inner
    processed = processing.process_images(p_txt)
  File "/content/stable-diffusion-webui/modules/processing.py", line 620, in process_images
    res = process_images_inner(p)
  File "/content/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/batch_hijack.py", line 42, in processing_process_images_hijack
    return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
  File "/content/stable-diffusion-webui/modules/processing.py", line 739, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "/content/stable-diffusion-webui/modules/processing.py", line 992, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 439, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 278, in launch_sampling
    return func()
  File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 439, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 158, in forward
    x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict([cond_in], image_cond_in))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/modules/sd_unet.py", line 91, in UNetModel_forward
    return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
    x = block(x, context=context[i])
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 461, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 306, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py", line 94, in _dispatch_fw
    return _run_priority_list(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py", line 69, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    requires A100 GPU
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40
*END OF TRACEBACK*

User friendly error message:
Error: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    requires A100 GPU
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info

quintendewilde commented 1 year ago

I've changed to an A100 GPU and get this error in the console:

Error: 'No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40'.
Check your schedules/ init values please. Also make sure you don't have a backwards slash in any of your PATHs - use / instead of \. Full error message is in your terminal/ cli.
Time taken: 0.68s
Torch active/reserved: 2117/2130 MiB, Sys VRAM: 3947/40514 MiB (9.74%)

More info: python -m xformers.info

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.13.1+cu117)
    Python  3.10.11 (you have 3.10.6)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
xFormers 0.0.20
memory_efficient_attention.cutlassF: unavailable
memory_efficient_attention.cutlassB: unavailable
memory_efficient_attention.flshattF: unavailable
memory_efficient_attention.flshattB: unavailable
memory_efficient_attention.smallkF: unavailable
memory_efficient_attention.smallkB: unavailable
memory_efficient_attention.tritonflashattF: available
memory_efficient_attention.tritonflashattB: available
indexing.scaled_index_addF: unavailable
indexing.scaled_index_addB: unavailable
indexing.index_select: unavailable
swiglu.dual_gemm_silu: unavailable
swiglu.gemm_fused_operand_sum: unavailable
swiglu.fused.p.cpp: not built
is_triton_available: True
is_functorch_available: False
pytorch.version: 1.13.1+cu117
pytorch.cuda: available
gpu.compute_capability: 8.0
gpu.name: NVIDIA A100-SXM4-40GB
build.info: available
build.cuda_version: 1108
build.python_version: 3.10.11
build.torch_version: 2.0.1+cu118
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.20
source.privacy: open source
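
The `xformers.info` report above is essentially a list of `key: value` pairs, so the two facts that matter here (which operators are unavailable, and whether `pytorch.version` equals `build.torch_version`) can be read off mechanically. Below is a minimal sketch under that assumption. The sample is an excerpt of the report above, and the function names are hypothetical.

```python
# Excerpt of the `python -m xformers.info` report pasted above.
SAMPLE = """\
xFormers 0.0.20
memory_efficient_attention.cutlassF: unavailable
memory_efficient_attention.flshattF: unavailable
memory_efficient_attention.tritonflashattF: available
pytorch.version: 1.13.1+cu117
gpu.name: NVIDIA A100-SXM4-40GB
build.torch_version: 2.0.1+cu118
"""


def parse_info(text):
    """Parse 'key: value' lines into a dict; lines without a colon are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keys in the report contain no spaces; this filters out prose lines.
        if sep and key and " " not in key:
            info[key] = value.strip()
    return info


def unavailable_ops(info):
    """Return the keys whose reported status is exactly 'unavailable'."""
    return [key for key, value in info.items() if value == "unavailable"]


def torch_mismatch(info):
    """True if the runtime PyTorch differs from the one xFormers was built for."""
    return info.get("pytorch.version") != info.get("build.torch_version")


if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print("unavailable:", unavailable_ops(info))
    print("torch mismatch:", torch_mismatch(info))
```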

quintendewilde commented 1 year ago

Upgrading to the latest torch fixed it.
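
For anyone checking a runtime after an upgrade like this, here is a minimal sketch that reports the installed `torch` and `xformers` versions through the standard library's `importlib.metadata`, without importing either package. The function name is hypothetical, and a package that is not installed is reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "xformers")):
    """Return {package name: installed version string, or None if absent}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions().items():
        print(f"{name}: {found or 'not installed'}")
```

The reported torch version can then be compared by eye with the "built for" version in the xFormers warning.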