Closed: quintendewilde closed this issue 1 year ago
I cannot reproduce this error. Are you using the latest notebook? (Updated 7/12). If so, please list the steps to reproduce this error with a fresh AI_PICS directory.
Should these be uncommented?

```python
def installxformers():
    %pip install -q xformers
```
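If they were uncommented, that helper just pip-installs the prebuilt xformers wheel from PyPI - which, as the warning further down shows, is built against a specific torch version. A minimal sketch of how the cell would presumably be used (the explicit call below is my assumption, not something from the notebook):

```python
# Sketch for a Colab/IPython cell; %pip magics work inside function bodies there.
def installxformers():
    %pip install -q xformers

installxformers()  # hypothetical call site; the notebook may gate this behind a setting
```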
@ctawong I am using the latest.
> I cannot reproduce this error. Are you using the latest notebook? (Updated 7/12). If so, please list the steps to reproduce this error with a fresh AI_PICS directory.
I reinstalled it in a new folder. I see this error message:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.13.1+cu117)
Python 3.10.11 (you have 3.10.12)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
ngrok authtoken detected, trying to connect...
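Note that this warning is the actual failure, not just noise: the prebuilt xformers 0.0.20 wheel was compiled against PyTorch 2.0.1+cu118, while the Colab runtime here has 1.13.1+cu117, so none of the C++/CUDA kernels load and every memory-efficient-attention backend ends up unavailable. A quick way to confirm the mismatch, as a sketch assuming a stock Colab runtime:

```python
# Compare the running torch with what the installed xformers wheel targets.
import torch
import xformers

print("runtime torch:", torch.__version__)  # e.g. 1.13.1+cu117 on the broken setup
print("xformers:", xformers.__version__)    # 0.0.20, built for torch 2.0.1+cu118
```

(`python -m xformers.info`, quoted further down the thread, gives the full per-operator breakdown.)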
I get the same issue after a clean install.
Full issue:
*START OF TRACEBACK*
Traceback (most recent call last):
File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/run_deforum.py", line 78, in run_deforum
render_animation(args, anim_args, video_args, parseq_args, loop_args, controlnet_args, root)
File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/render.py", line 548, in render_animation
image = generate(args, keys, anim_args, loop_args, controlnet_args, root, parseq_adapter, frame_idx, sampler_name=scheduled_sampler_name)
File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/generate.py", line 56, in generate
return generate_inner(args, keys, anim_args, loop_args, controlnet_args, root, parseq_adapter, frame, sampler_name)
File "/content/stable-diffusion-webui/extensions/deforum/scripts/deforum_helpers/generate.py", line 210, in generate_inner
processed = processing.process_images(p_txt)
File "/content/stable-diffusion-webui/modules/processing.py", line 620, in process_images
res = process_images_inner(p)
File "/content/stable-diffusion-webui/extensions/sd-webui-controlnet/scripts/batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "/content/stable-diffusion-webui/modules/processing.py", line 739, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "/content/stable-diffusion-webui/modules/processing.py", line 992, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 439, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 278, in launch_sampling
return func()
File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 439, in <lambda>
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "/usr/local/lib/python3.10/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/sampling.py", line 145, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/modules/sd_samplers_kdiffusion.py", line 158, in forward
x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict([cond_in], image_cond_in))
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
File "/content/stable-diffusion-webui/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
return self.inner_model.apply_model(*args, **kwargs)
File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in <lambda>
setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
File "/content/stable-diffusion-webui/modules/sd_hijack_utils.py", line 28, in __call__
return self.__orig_func(*args, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
x_recon = self.model(x_noisy, t, **cond)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
out = self.diffusion_model(x, t, context=cc)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/modules/sd_unet.py", line 91, in UNetModel_forward
return ldm.modules.diffusionmodules.openaimodel.copy_of_UNetModel_forward_for_webui(self, x, timesteps, context, *args, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
h = module(h, emb, context)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
x = layer(x, context)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 334, in forward
x = block(x, context=context[i])
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 121, in checkpoint
return CheckpointFunction.apply(func, len(inputs), *args)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 136, in forward
output_tensors = ctx.run_function(*ctx.input_tensors)
File "/content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/stable-diffusion-webui/modules/sd_hijack_optimizations.py", line 461, in xformers_attention_forward
out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
return _memory_efficient_attention(
File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
return _memory_efficient_attention_forward(
File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 306, in _memory_efficient_attention_forward
op = _dispatch_fw(inp)
File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py", line 94, in _dispatch_fw
return _run_priority_list(
File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py", line 69, in _run_priority_list
raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(2, 4096, 8, 40) (torch.float16)
key : shape=(2, 4096, 8, 40) (torch.float16)
value : shape=(2, 4096, 8, 40) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
requires A100 GPU
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
Operator wasn't built - see `python -m xformers.info` for more info
unsupported embed per head: 40
*END OF TRACEBACK*
User friendly error message:
Error: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(2, 4096, 8, 40) (torch.float16)
key : shape=(2, 4096, 8, 40) (torch.float16)
value : shape=(2, 4096, 8, 40) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
requires A100 GPU
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
Operator wasn't built - see `python -m xformers.info` for more info
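One workaround while the wheel is mismatched is to stop routing attention through xFormers at all: the webui only takes the `xformers_attention_forward` path shown in the traceback when it is launched with the `--xformers` flag, so dropping that flag falls back to the built-in cross-attention optimization (slower, but it runs). A sketch, assuming the notebook passes launch arguments the way the stock script does (where exactly this notebook sets the flag is an assumption):

```python
# Hypothetical launch line without xformers - adapt to wherever the
# notebook builds its COMMANDLINE_ARGS / launch command.
!python launch.py --share  # note: no --xformers
```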
I've changed to an A100 GPU and get this error in the console:
Error: 'No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(2, 4096, 8, 40) (torch.float16)
key : shape=(2, 4096, 8, 40) (torch.float16)
value : shape=(2, 4096, 8, 40) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
Operator wasn't built - see `python -m xformers.info` for more info
unsupported embed per head: 40'.
Check your schedules/init values please. Also make sure you don't have a backwards slash in any of your PATHs - use / instead of \. Full error message is in your terminal/cli.
Time taken: 0.68s. Torch active/reserved: 2117/2130 MiB, Sys VRAM: 3947/40514 MiB (9.74%)
More info:
python -m xformers.info
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.13.1+cu117)
Python 3.10.11 (you have 3.10.6)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
xFormers 0.0.20
memory_efficient_attention.cutlassF: unavailable
memory_efficient_attention.cutlassB: unavailable
memory_efficient_attention.flshattF: unavailable
memory_efficient_attention.flshattB: unavailable
memory_efficient_attention.smallkF: unavailable
memory_efficient_attention.smallkB: unavailable
memory_efficient_attention.tritonflashattF: available
memory_efficient_attention.tritonflashattB: available
indexing.scaled_index_addF: unavailable
indexing.scaled_index_addB: unavailable
indexing.index_select: unavailable
swiglu.dual_gemm_silu: unavailable
swiglu.gemm_fused_operand_sum: unavailable
swiglu.fused.p.cpp: not built
is_triton_available: True
is_functorch_available: False
pytorch.version: 1.13.1+cu117
pytorch.cuda: available
gpu.compute_capability: 8.0
gpu.name: NVIDIA A100-SXM4-40GB
build.info: available
build.cuda_version: 1108
build.python_version: 3.10.11
build.torch_version: 2.0.1+cu118
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.20
source.privacy: open source
Upgrading to the latest torch fixed it.
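For anyone hitting this later: "latest torch" here means a torch that matches what the installed xformers wheel was built against (2.0.1+cu118, per the warning above). A minimal sketch of that upgrade in a Colab cell - the exact pins are assumptions, match them to whatever `python -m xformers.info` reports on your runtime:

```python
# Sketch, not the notebook's official fix: align torch with the xformers wheel,
# then reinstall xformers so its C++/CUDA extensions can load.
%pip install -q torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
%pip install -q xformers==0.0.20
# Verify: cutlassF / flshattF should now report "available".
!python -m xformers.info
```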
What's going wrong? I'm running with an A100 on Colab Pro.
Error: 'No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 4096, 8, 40) (torch.float16)
key : shape=(1, 4096, 8, 40) (torch.float16)
value : shape=(1, 4096, 8, 40) (torch.float16)
attn_bias : <class 'NoneType'>
p : 0.0
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
requires A100 GPU
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float16 (supported: {torch.float32})
max(query.shape[-1] != value.shape[-1]) > 32
Operator wasn't built - see `python -m xformers.info` for more info
unsupported embed per head: 40'.
Check your schedules/init values please. Also make sure you don't have a backwards slash in any of your PATHs - use / instead of \. Full error message is in your terminal/cli.
Time taken: 1m 0.47s. Torch active/reserved: 5885/5918 MiB, Sys VRAM: 7407/16151 MiB (45.86%)