Engineer-of-Stuff / stable-diffusion-paperspace

Jupyter notebooks for Paperspace.
The Unlicense

xFormers can't load C++/CUDA extensions #85

Closed: MeterPreter57 closed this issue 1 year ago

MeterPreter57 commented 1 year ago

Hi, I have an issue with xFormers. I built xFormers using the cell in the Tools section of the notebook. Here is the end of the build output:

Created wheel for xformers: filename=xformers-0.0.18+da27862.d20230331-cp39-cp39-linux_x86_64.whl size=138469800 sha256=38fbe8e437293c6db2bb02a7e7652437fffb6b8fd10afdd8e131e023fecacbae
  Stored in directory: /tmp/pip-ephem-wheel-cache-il4thzwk/wheels/3c/80/98/9d4dcd809f9ada257446b260db163d7ff8e1d92d500baae741
Successfully built xformers
Finished!
Moving .whl to /notebooks/
Here is your wheel file:
/notebooks/xformers-0.0.18+da27862.d20230331-cp39-cp39-linux_x86_64.whl
Installing your new Xformers wheel...
Processing /tmp/tmp.euqA6lL653/xformers-0.0.18+da27862.d20230331-cp39-cp39-linux_x86_64.whl
Collecting pyre-extensions==0.0.23
  Downloading pyre_extensions-0.0.23-py3-none-any.whl (11 kB)
Requirement already satisfied: torch>=1.12 in /usr/local/lib/python3.9/dist-packages (from xformers==0.0.18+da27862.d20230331) (1.12.0+cu116)
Requirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (from xformers==0.0.18+da27862.d20230331) (1.23.1)
Collecting typing-inspect
  Downloading typing_inspect-0.8.0-py3-none-any.whl (8.7 kB)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.9/dist-packages (from pyre-extensions==0.0.23->xformers==0.0.18+da27862.d20230331) (4.3.0)
Collecting mypy-extensions>=0.3.0
  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Installing collected packages: mypy-extensions, typing-inspect, pyre-extensions, xformers
Successfully installed mypy-extensions-1.0.0 pyre-extensions-0.0.23 typing-inspect-0.8.0 xformers-0.0.18+da27862.d20230331

Then I reran the "Install requirements and download repositories" cell, launched the WebUI, and got this warning:

/storage/stable-diffusion/stable-diffusion-webui
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 1.12.0+cu116 with CUDA 1102 (you have 1.13.1+cu117)
    Python  3.9.13 (you have 3.9.13)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details

Now, when I try to generate an image, I get this error:

Error completing request
Arguments: ('task(jgl25uwqwavjzon)', 'test prompt', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, True, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 0, 1, False, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 0, 1, False, False, 'none', 'None', 1, None, False, 'Scale to Fit (Inner Fit)', False, False, 64, 64, 64, 0, 1, False, False, '1:1,1:2,1:2', '0:0,0:0,0:1', '0.2,0.8,0.8', 150, 0.2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, None, None, None, 50) {}
Traceback (most recent call last):
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/processing.py", line 486, in process_images
    res = process_images_inner(p)
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/processing.py", line 638, in process_images_inner
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/processing.py", line 638, in <listcomp>
    x_samples_ddim = [decode_first_stage(p.sd_model, samples_ddim[i:i+1].to(dtype=devices.dtype_vae))[0].cpu() for i in range(samples_ddim.size(0))]
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/processing.py", line 423, in decode_first_stage
    x = model.decode_first_stage(x)
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/sd_hijack_utils.py", line 17, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/storage/stable-diffusion/stable-diffusion-webui/modules/sd_hijack_utils.py", line 28, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/storage/stable-diffusion/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 826, in decode_first_stage
    return self.first_stage_model.decode(z)
  File "/storage/stable-diffusion/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/autoencoder.py", line 90, in decode
    dec = self.decoder(z)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/storage/stable-diffusion/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/model.py", line 631, in forward
    h = self.mid.attn_1(h)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/storage/stable-diffusion/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/model.py", line 258, in forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 196, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 294, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/__init__.py", line 310, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/dispatch.py", line 98, in _dispatch_fw
    return _run_priority_list(
  File "/usr/local/lib/python3.9/dist-packages/xformers/ops/fmha/dispatch.py", line 73, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float16)
     key         : shape=(1, 4096, 1, 512) (torch.float16)
     value       : shape=(1, 4096, 1, 512) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
    requires a GPU with compute capability > 7.5
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 512

python: 3.9.13 torch: 1.13.1+cu117 xformers: 0.0.18+da27862.d20230331 gradio: 3.16.2 commit: a9fed7c3 checkpoint: 7f16bbcd80

Cyberes commented 1 year ago

Can you make sure your notebook is up to date? I know the xformers builder used to build the minimized version without the forward attention operators.

Otherwise, something funky might be happening with torch versions: PyTorch 1.12.0+cu116 with CUDA 1102 (you have 1.13.1+cu117)
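
To check the torch-version theory quickly, the warning itself already contains both versions. Here is a minimal sketch that pulls them out of the warning text and compares them. The function names are made up for illustration, and the regular expression is only matched against the wording of the warning quoted in this thread, so other xFormers releases may phrase it differently.

```python
import re

# Matches the "PyTorch <built> with CUDA <n> (you have <installed>)" part of the
# warning quoted in this thread. The wording is assumed from that one message.
_WARNING_RE = re.compile(
    r"PyTorch\s+(?P<built>\S+)\s+with CUDA\s+\d+\s+\(you have\s+(?P<have>[^)]+)\)"
)


def parse_xformers_warning(text):
    """Return (built_for, installed) torch versions from the warning, or None."""
    match = _WARNING_RE.search(text)
    if match is None:
        return None
    return match.group("built"), match.group("have").strip()


def torch_versions_match(text):
    """True only when the build-time and installed torch versions are identical."""
    parsed = parse_xformers_warning(text)
    return parsed is not None and parsed[0] == parsed[1]


warning = (
    "xFormers was built for:\n"
    "    PyTorch 1.12.0+cu116 with CUDA 1102 (you have 1.13.1+cu117)\n"
)
print(parse_xformers_warning(warning))
print(torch_versions_match(warning))
```

If the two versions differ, the wheel was compiled against a different torch than the one the WebUI is running, which is consistent with the "can't load C++/CUDA extensions" message.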

personxyz commented 1 year ago

I'm having the same problem although my error says:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.0.0+cu118 with CUDA 1108 (you have 1.13.1+cu117)

I tried updating the launch.py file to:

torch_command = os.environ.get('TORCH_COMMAND', "pip install torch==2.0.0+cu118 torchvision==0.14.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118")

I also set torch==2.0.0 in requirements_versions.txt.

That didn't work. It just produced a bunch of errors when I ran the notebook, so I changed it back. I also tried reinstalling the stable-diffusion-webui folder, to no avail.

My notebook is up to date. Not a huge deal, but hopefully someone can get to the bottom of this.

MeterPreter57 commented 1 year ago

I solved the issue. The problem was that I had not run the "Install requirements and download repositories" cell before building xFormers. I updated the notebook, reset storage, and built xFormers again. The old errors are gone, but I got a new one:

TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

I downgraded the protobuf package to 3.20.3. Now everything works without any errors, but I don't think xFormers is actually doing anything, because generating an image takes the same amount of time as it did before I installed it.
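
For anyone hitting the same protobuf error, here is a minimal sketch that checks whether the installed protobuf is newer than the 3.20.x line named in the error text. The helper names are hypothetical, and the version parsing assumes a plain "major.minor.patch" style string.

```python
from importlib import metadata


def protobuf_needs_downgrade(version):
    """True when `version` is newer than the 3.20.x line named in the error text."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) > (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is missing."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


version = installed_protobuf_version()
if version is None:
    print("protobuf is not installed")
elif protobuf_needs_downgrade(version):
    # 3.20.3 is the version that worked in this thread.
    print(f"protobuf {version}: consider `pip install protobuf==3.20.3`")
else:
    print(f"protobuf {version}: already 3.20.x or lower")
```

The other workaround from the error text, setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, avoids the downgrade, but the message itself warns that it is much slower.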

Cyberes commented 1 year ago

I had that error, but I don't remember how I fixed it. I might have re-built xformers...

personxyz commented 1 year ago

I basically went through the exact steps you described here, and now everything works, so thanks. Weird that xFormers isn't giving you any speed benefit, though. It seems to be working fine for me.

MeterPreter57 commented 1 year ago

@personxyz what gpu are you using?

personxyz commented 1 year ago

A6000.

MeterPreter57 commented 1 year ago

I rebuilt xFormers and got the same result. It seems the Quadro M4000 is too old to benefit from xFormers.
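
One way to sanity-check this is to compare the GPU's compute capability with the requirement quoted in the traceback above ("requires a GPU with compute capability > 7.5" for `flshattF`). The sketch below is only an illustration: the function names are hypothetical, treating the 7.5 threshold as inclusive is an assumption, and the capability values for the two cards (5.2 for the Quadro M4000, 8.6 for the A6000) are from memory rather than from this thread.

```python
def flash_attention_eligible(capability, minimum=(7, 5)):
    """Compare a (major, minor) compute capability with an assumed threshold.

    The default threshold comes from the traceback in this thread; treating
    it as inclusive is an assumption.
    """
    return tuple(capability) >= tuple(minimum)


def current_capability():
    """Return the first CUDA device's capability, or None if unavailable."""
    try:
        import torch  # optional: only needed to query the local GPU
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


# Assumed capabilities for the two cards mentioned in this thread.
print("Quadro M4000:", flash_attention_eligible((5, 2)))
print("A6000:", flash_attention_eligible((8, 6)))

local = current_capability()
if local is not None:
    print("This machine:", local, flash_attention_eligible(local))
```

This only mirrors the one requirement quoted in the traceback. It does not show that every xFormers operator is unavailable or slow on an older card.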

ilovefree999 commented 1 year ago

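WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for: PyTorch 2.0.0+cu118 with CUDA 1108 (you have 1.13.1+cu117)

This is how I solved the problem. The "(you have 1.13.1+cu117)" part is the version installed on your system. To reinstall xFormers, follow these steps:

First, open a command-line window: go to x:\xxx\xxx\python (the python folder inside your SD directory) and type CMD in the address bar. Then uninstall the currently installed xFormers (if it is installed) by entering this command in the console:

pip uninstall xformers

When that finishes, open the launcher (I use 秋叶's 绘世 launcher). Go into Advanced Options. Next to the one-click launch button there is an Environment Maintenance section. Open "Configure PyTorch", choose the version that matches your "(you have 1.13.1+cu117)" entry, and click Install. That resolves the problem.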