guivr opened this issue 7 months ago
Are you working with PyTorch 2.1? This might be linked to an issue you can find here: https://github.com/huggingface/diffusers/issues/5484
Also, if it fails at the end, it's likely just a validation issue. Training is probably done (check your LoRA save path); if so, you can ignore it, or just remove the validation_~ arguments.
Yep, PyTorch 2.1. What version should it be?
Without the enable_xformers_memory_efficient_attention flag, training works fine.
Yes, if your GPU has >16 GB memory. On Colab I was trying with a V100 and it failed at the end; an A100 was unavailable at that time. A few hours later an A100 became available, and then it worked.
If you're on PyTorch 2.1, using xformers with it can be a problem (see: https://github.com/huggingface/diffusers/issues/5484). In that case, we default to SDPA (scaled dot-product attention), which should run on the Google Colab free tier.
If xformers usage is a must, I would recommend using it with Torch 1.13.1.
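For reference, SDPA ships with PyTorch 2.x itself, so no extra package is needed. A minimal sketch of the built-in kernel (the tensor shapes here are chosen purely for illustration):

```python
import torch
import torch.nn.functional as F

# Toy attention inputs: (batch, heads, sequence length, head dim).
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# Memory-efficient attention via PyTorch's built-in SDPA kernel,
# which is what diffusers falls back to when xformers is disabled.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```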
When I was running train_dreambooth_ziplora_sdxl.py on a 4090 (24 GB), I also ran into "CUDA out of memory".
See #8. You can free VRAM instantly. I successfully ran it on a 4090.
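The exact change in PR #8 isn't quoted in this thread, but the usual pattern for freeing VRAM between stages is to drop your Python references and flush PyTorch's allocator cache. A hedged sketch of that general technique:

```python
import gc
import torch

def free_vram():
    """Release cached GPU memory (a no-op on CPU-only machines)."""
    gc.collect()                   # collect unreachable Python objects first
    if torch.cuda.is_available():
        torch.cuda.empty_cache()   # hand cached blocks back to the driver

# Drop your own references first (e.g. `del pipeline`) before calling this;
# otherwise the tensors stay reachable and nothing is actually freed.
free_vram()
```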
Without the enable_xformers_memory_efficient_attention flag, and following https://github.com/mkshing/ziplora-pytorch/pull/8, I still run into "CUDA out of memory" on a 3090 (24 GB), sorry, not a 4090. My accelerate config:
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: 1,2
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
I think xformers and torch were compiled against different CUDA versions, which caused this problem. Try recompiling xformers from source.
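One way to check the mismatch suspected here before rebuilding anything: compare the CUDA toolkit version the installed PyTorch wheel was built against with what xformers itself reports. This sketch assumes both packages are importable in the training environment:

```python
import torch

# CUDA toolkit version this PyTorch wheel was compiled against;
# None means a CPU-only build.
print(torch.version.cuda)

# Compare with xformers' own report, run from the command line:
#   python -m xformers.info
# If the two CUDA versions differ, rebuilding xformers from source
# against the locally installed toolkit can fix the crash.
```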
Hi! Thank you so much for this.
I'm trying to run this on Google Colab but I'm always running into the "CUDA out of memory" error.
I've tried adding:
but it's breaking with:
error:
Once I remove this argument it works, but it always fails at the last step (1000/1000) because it runs out of memory (16 GB limit, V100).