AeroDEmi closed this issue 10 months ago.
Can you share your Google Colab?
Sure! I only have three cells of code:
!pip install accelerate -q
!pip install torchvision -q
!pip install transformers -q
!pip install datasets -q
!pip install ftfy -q
!pip install tensorboard -q
!pip install Jinja2 -q
!pip install diffusers -q
!pip install peft -q
from accelerate.utils import write_basic_config
write_basic_config()
!accelerate launch /content/drive/MyDrive/Colab/SD_LoRA/train_text_to_image_sdxl.py \
--pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
--pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
--dataset_name="lambdalabs/pokemon-blip-captions" \
--resolution=1024 --random_flip \
--train_batch_size=1 \
--num_train_epochs=5 --checkpointing_steps=635 \
--learning_rate=1e-05 --lr_scheduler="constant" --lr_warmup_steps=100 \
--mixed_precision="fp16" \
--seed=42 \
--output_dir="/content/pokemon"
It seems to hang at this line in the script train_text_to_image_sdxl.py:
train_dataset = train_dataset.map(compute_embeddings_fn, batched=True, new_fingerprint=new_fingerprint)
Can you make sure you're using the latest diffusers script, cloned from main?
Let me pull again from main
Still having the error
It works with batched=False:
train_dataset = train_dataset.map(compute_embeddings_fn, batched=False, new_fingerprint=new_fingerprint)
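For context, a minimal sketch of why that flag matters, using a toy dataset and a hypothetical fake_embed function standing in for the script's compute_embeddings_fn. With batched=True, map() hands the function up to batch_size rows at once (1000 by default), so peak RAM scales with the batch; with batched=False it processes one example per call:

from datasets import Dataset

ds = Dataset.from_dict({"text": [f"caption {i}" for i in range(1000)]})

def fake_embed_batched(batch):
    # Receives a dict of columns -> lists; len(batch["text"]) can be up
    # to batch_size, so peak memory grows with it.
    return {"embedding": [[float(len(t))] * 4 for t in batch["text"]]}

def fake_embed_single(example):
    # Receives one example per call: the smallest possible working set.
    return {"embedding": [float(len(example["text"]))] * 4}

ds_batched = ds.map(fake_embed_batched, batched=True, batch_size=64)
ds_single = ds.map(fake_embed_single, batched=False)

An intermediate option is to keep batched=True but pass a small batch_size, which retains some vectorization without the default 1000-row batches.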
Now, it hangs around the 500th iteration
500th iteration of training?
In the caching of the dataset during the first mapping. I was using a 24 GB GPU, though; let me try it with 40 GB.
It seems it was the caching of the tokenizer and the VAE. I ran out of RAM; after increasing it, the problem went away. I wonder if there's a way to use storage instead of RAM.
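If the mapped results piling up in memory are the problem, datasets can flush them to the on-disk Arrow cache more aggressively. A sketch of that same line with extra arguments; keep_in_memory and writer_batch_size are real datasets.Dataset.map parameters, but the numbers here are illustrative guesses, not tuned values:

train_dataset = train_dataset.map(
    compute_embeddings_fn,
    batched=True,
    batch_size=16,         # rows handed to compute_embeddings_fn per call
    writer_batch_size=16,  # rows buffered in RAM before flushing to the cache file (default 1000)
    keep_in_memory=False,  # write results to the on-disk cache rather than RAM
    new_fingerprint=new_fingerprint,
)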
@sayakpaul is it possible to not load everything into RAM if my dataset is really big?
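For reference, the datasets library also has a streaming mode that never materializes the full dataset in RAM; the stock script expects a regular Dataset, so this is only a sketch of the loading side, with a toy per-example transform:

from datasets import load_dataset

# streaming=True returns an IterableDataset: examples are fetched lazily,
# so nothing is pinned in RAM (or cached on disk) up front.
stream = load_dataset(
    "lambdalabs/pokemon-blip-captions", split="train", streaming=True
)

# map() on an IterableDataset is lazy too: the function runs per example
# while you iterate, instead of in one big caching pass.
def caption_length(example):
    return {"caption_length": len(example["text"])}

stream = stream.map(caption_length)

for example in stream.take(2):
    print(example["caption_length"])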
Describe the bug
When using train_text_to_image_sdxl.py, I'm running out of memory when mapping the dataset.
Reproduction
Logs
No response
System Info
I'm using Google Colab with an A100 (40 GB).
Who can help?
@sayakpaul @patrickvonplaten