huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Can't finetune stable diffusion with --enable_xformers_memory_efficient_attention #2234

Closed LucasSloan closed 1 year ago

LucasSloan commented 1 year ago

Describe the bug

I'm trying to finetune Stable Diffusion, and I want to reduce the memory footprint so I can train with a larger batch size (and thus fewer gradient accumulation steps, and thus faster training).

Setting --enable_xformers_memory_efficient_attention results in numeric instability of some kind, I think? The safety_checker tripped (training on the Pokemon dataset, validation prompt "Yoda"). If I disable the safety_checker, I get black images anyway, along with this warning:

/home/lucas/.local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py:813: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")
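(For context, this is roughly how the safety checker gets disabled when loading a diffusers pipeline; a minimal sketch, not the exact change in train_text_to_image.py, and the model id is only an example SD 1.x checkpoint.)

import torch
from diffusers import StableDiffusionPipeline

# Sketch: load a pipeline without the safety checker; passing safety_checker=None
# skips the NSFW filter entirely. The "invalid value encountered in cast" warning
# above suggests the black images come from NaNs in the decoded latents, not the filter.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe = pipe.to("cuda")
image = pipe("Yoda").images[0]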

If I instead set --enable_xformers_memory_efficient_attention but disable --gradient_checkpointing, everything hums along nicely, but the model doesn't actually fine-tune.

I attempted to force xformers to use Flash Attention (using the snippet in https://github.com/huggingface/diffusers/pull/2049), because https://github.com/huggingface/diffusers/issues/1997 suggested there were issues with the other xformers attention kernels, but I get this error:

ValueError: Operator `memory_efficient_attention` does not support inputs:
     query       : shape=(8, 256, 1, 160) (torch.float16)
     key         : shape=(8, 256, 1, 160) (torch.float16)
     value       : shape=(8, 256, 1, 160) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`flshattF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 128
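(For reference, forcing the Flash Attention kernel along the lines of #2049 looks roughly like the sketch below; in the training script it would be applied to the unet rather than a full pipeline, and the model id is only an example. The head dim of 160 in the layers above is over flshattF's 128 limit, which is why it refuses to run.)

import xformers.ops
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# Restrict xformers to the Flash Attention op instead of letting it dispatch freely.
pipe.enable_xformers_memory_efficient_attention(
    attention_op=xformers.ops.MemoryEfficientAttentionFlashAttentionOp
)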

Reproduction

Here's the command I ran with --enable_xformers_memory_efficient_attention, but not with --gradient_checkpointing:

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=8 \
  --mixed_precision="fp16" \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="sd-pokemon-model" \
  --validation_prompt=Yoda --num_validation_images=8 \
  --validation_steps=1000 \
  --enable_xformers_memory_efficient_attention

I'm running with https://github.com/huggingface/diffusers/pull/2157 applied, because it gives me validation images to see how training is progressing (which is how I noticed it wasn't finetuning), but I've observed the same behavior at HEAD.

Logs

No response

System Info

EandrewJones commented 1 year ago

@LucasSloan There's a good chance your issues are related to a problem in xformers v0.0.16 where the Stable Diffusion attention head dims are too large on certain GPU architectures (sm86/89):

https://github.com/facebookresearch/xformers/issues/631

Try updating to a newer xformers dev release that includes the patch from that issue:

pip install xformers==0.0.17.dev435
pip install xformers==0.0.17.dev441
pip install xformers==0.0.17.dev442

If that doesn't work, would you mind sharing the output from python -m xformers.info?
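(If it helps, a quick sanity check for the affected combination, xformers 0.0.16 on an sm86/sm89 card, could look like the sketch below; just an illustration, not an official diagnostic.)

import torch
import xformers

major, minor = torch.cuda.get_device_capability(0)
print(f"xformers {xformers.__version__}, GPU compute capability sm{major}{minor}")
# The bug tracked in facebookresearch/xformers#631 hits 0.0.16 on sm86/sm89 GPUs.
if xformers.__version__.startswith("0.0.16") and (major, minor) in [(8, 6), (8, 9)]:
    print("Affected combination: upgrade to a 0.0.17 dev release that includes the patch.")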

LucasSloan commented 1 year ago

That fixed it, thanks!

Dragonswords102 commented 1 year ago

Hey, I stumbled upon your response while trying to fix my own issues with xformers 0.0.16, however all of the dev versions you suggested produced errors:

(base) C:\Users\orins\OneDrive\Documents\SDlocal>pip install xformers==0.0.17.dev441
ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev447, 0.0.17.dev448, 0.0.17.dev449, 0.0.17.dev451, 0.0.17.dev461)
ERROR: No matching distribution found for xformers==0.0.17.dev441

Since I assume it will be helpful, I will also provide the output of python -m xformers.info:

(base) C:\Users\orins\OneDrive\Documents\SDlocal>python -m xformers.info
Traceback (most recent call last):
  File "C:\Users\orins\miniconda3\lib\runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Users\orins\miniconda3\lib\runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\__init__.py", line 10, in <module>
    from . import _cpp_lib
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\_cpp_lib.py", line 127, in <module>
    _build_metadata = _register_extensions()
  File "C:\Users\orins\miniconda3\lib\site-packages\xformers\_cpp_lib.py", line 117, in _register_extensions
    torch.ops.load_library(ext_specs.origin)
AttributeError: module 'torch' has no attribute 'ops'

Hope this is enough info, thanks

EandrewJones commented 1 year ago

Hi,

TL;DR: If you read the error from the attempted install, you'll see xformers version 0.0.17.dev441 is no longer available on PyPI. Instead, try installing one of the newer dev releases, which should include the fix:

0.0.17.dev447, 0.0.17.dev448, 0.0.17.dev449, 0.0.17.dev451, 0.0.17.dev461

See PyPI for all releases: https://pypi.org/project/xformers/#history

You may wonder why the version I posted no longer exists. All libraries have limited space available to them on PyPI to host different versions. They keep stable releases pinned, but as new development releases of the upcoming version are published, they have to drop older dev releases to stay within their quota.
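(If you want to see which versions are currently installable without provoking a pip error, something like the following works against the public PyPI JSON API; a small sketch, nothing xformers-specific.)

import json
import urllib.request

# Query PyPI's JSON API for the xformers project and print the pre-release versions.
with urllib.request.urlopen("https://pypi.org/pypi/xformers/json") as resp:
    data = json.load(resp)

versions = sorted(data["releases"].keys())
print("\n".join(v for v in versions if "dev" in v or "rc" in v))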

Best

Evan Jones Website: www.ea-jones.com


Dragonswords102 commented 1 year ago

Hi, that seemed to fix my issue, thank you. While we are here, I have another issue that maybe you have knowledge of. I created a hypernetwork and followed a GitHub guide on training settings, but when I press "train hypernetwork" the command prompt tells me CUDA is out of memory, which does not make much sense to me since I have plenty of space: 16 GB of RAM and about 128 MB of VRAM (screenshot: https://user-images.githubusercontent.com/125940602/221134518-52d4df32-0033-4ff9-9e67-62bb62354f9d.png).

EandrewJones commented 1 year ago

It appears you actually only have 6 GB of VRAM on your GPU, which is probably too limited for training most image models unless you run an extremely optimized algorithm.
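(A quick way to check what the GPU actually has, independent of what Task Manager reports as shared memory; a minimal sketch using standard torch APIs.)

import torch

props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB of VRAM")
print(f"currently allocated: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")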

Anyway, if you have questions about the Auto1111 webUI, I would take them over there.

Best

Evan Jones Website: www.ea-jones.com


technologiespro commented 1 year ago

issue

ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev466, 0.0.17.dev473, 0.0.17.dev474, 0.0.17.dev476, 0.0.17.dev480, 0.0.17.dev481)

pip install xformers==0.0.17.dev481

patrickvonplaten commented 1 year ago

Hey @technologiespro ,

This looks like a problem with xformers: https://github.com/facebookresearch/xformers - could you please post the issue there?

kime541200 commented 1 year ago

Hi everyone, I encountered the issue while training my Dreambooth model and found a solution that may be helpful to you. In the setup.bat file, modify the packages to be installed to:

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install pyre-extensions==0.0.23
pip install --use-pep517 --upgrade -r requirements.txt
pip install xformers==0.0.16

After making these changes, you should be able to start training your Dreambooth model.

For your information, I am using Windows 10 as my operating system and a 3060 GPU.

Additionally, I came across some information at https://huggingface.co/docs/diffusers/optimization/xformers suggesting that xFormers v0.0.16 may not be suitable for training (fine-tuning or Dreambooth) on certain GPUs. If you encounter any issues, please refer to the comment on that page and install the recommended development version to see whether it resolves the problem for you.

liuchenbaidu commented 1 year ago

I encountered the issue while running txt2img_pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp) with the Stable Diffusion 1.6 model.

python -m xformers.info:

xFormers 0.0.16
memory_efficient_attention.cutlassF:          available
memory_efficient_attention.cutlassB:          available
memory_efficient_attention.flshattF:          available
memory_efficient_attention.flshattB:          available
memory_efficient_attention.smallkF:           available
memory_efficient_attention.smallkB:           available
memory_efficient_attention.tritonflashattF:   available
memory_efficient_attention.tritonflashattB:   available
swiglu.fused.p.cpp:                           available
is_triton_available:                          True
is_functorch_available:                       False
pytorch.version:                              1.13.1+cu117
pytorch.cuda:                                 available
gpu.compute_capability:                       8.6
gpu.name:                                     NVIDIA GeForce RTX 3090
build.info:                                   available
build.cuda_version:                           1107
build.python_version:                         3.8.16
build.torch_version:                          1.13.1+cu117
build.env.TORCH_CUDA_ARCH_LIST:               5.0+PTX 6.0 6.1 7.0 7.5 8.0 8.6
build.env.XFORMERS_BUILD_TYPE:                Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS:   None
build.env.NVCC_FLAGS:                         None
build.env.XFORMERS_PACKAGE_FROM:              wheel-v0.0.16
source.privacy:                               open source

liuxz-cs commented 1 year ago

That fixed it, thanks!

How did you solve this problem, and which version of xformers did you install? I tried 0.0.17, 0.0.17rc481, and 0.0.17rc482, but none of them solved it.

liuxz-cs commented 1 year ago

issue

ERROR: Could not find a version that satisfies the requirement xformers==0.0.17.dev441 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11, 0.0.12, 0.0.13, 0.0.16rc424, 0.0.16rc425, 0.0.16, 0.0.17.dev466, 0.0.17.dev473, 0.0.17.dev474, 0.0.17.dev476, 0.0.17.dev480, 0.0.17.dev481)

pip install xformers==0.0.17.dev481

Did you manage to solve this problem in the end?

qingcong1224 commented 4 months ago

Hi, I have encountered the following problem with torch==2.3.0+cu118 and xformers==0.0.26.post1+cu118:

    return self._call_impl(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\sgm\modules\diffusionmodules\model.py", line 263, in forward
    h = self.attention(h)
  File "C:\Users\Admin\anaconda3\envs\svd\lib\site-packages\sgm\modules\diffusionmodules\model.py", line 249, in attention
    out = xformers.ops.memory_efficient_attention(
AttributeError: module 'xformers' has no attribute 'ops'
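(One thing worth checking: in some xformers releases the ops submodule is not pulled in by a bare import xformers, so code that only does import xformers can hit exactly this AttributeError; another common cause is a torch/xformers version mismatch that makes xformers.ops fail to import. A minimal sketch to confirm whether xformers.ops imports and runs at all in your environment, assuming a CUDA GPU:)

import torch
import xformers.ops  # explicit import; a bare `import xformers` may not expose `.ops`

# Tiny smoke test of the memory-efficient attention entry point.
q = torch.randn(1, 16, 8, 64, device="cuda", dtype=torch.float16)  # (batch, seq_len, heads, head_dim)
k = torch.randn(1, 16, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 16, 8, 64, device="cuda", dtype=torch.float16)
out = xformers.ops.memory_efficient_attention(q, k, v)  # attn_bias=None, p=0.0 by default
print(out.shape)  # expected: torch.Size([1, 16, 8, 64])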