RuntimeError: CUDA out of memory on RTX 4090

Describe the bug

I get the error below when training on an RTX 4090 under Ubuntu 22.04.1.

RuntimeError: CUDA out of memory. Tried to allocate 3.20 GiB (GPU 0; 22.20 GiB total capacity; 18.38 GiB already allocated; 2.83 GiB free; 18.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
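
As a rough sanity check on the figures in that message, the sketch below parses the reported values and compares them. It only does arithmetic on the text of the error (the `gib` helper and variable names are illustrative, not part of PyTorch or diffusers), and it assumes the message format shown above.

```python
import re

# Error text copied from the message above.
message = (
    "CUDA out of memory. Tried to allocate 3.20 GiB (GPU 0; 22.20 GiB total capacity; "
    "18.38 GiB already allocated; 2.83 GiB free; 18.46 GiB reserved in total by PyTorch)"
)


def gib(pattern: str) -> float:
    """Return the GiB figure captured by the pattern's single capture group."""
    match = re.search(pattern, message)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


requested = gib(r"allocate ([\d.]+) GiB")
allocated = gib(r"([\d.]+) GiB already allocated")
free = gib(r"([\d.]+) GiB free")
reserved = gib(r"([\d.]+) GiB reserved")

# How much more memory the failed allocation would have needed.
shortfall = requested - free
# Memory held by PyTorch's caching allocator but not currently in use.
cached_unused = reserved - allocated

print(f"shortfall: {shortfall:.2f} GiB")
print(f"reserved but unallocated: {cached_unused:.2f} GiB")
```

By these numbers the request exceeds free memory by about 0.37 GiB, while reserved memory is only about 0.08 GiB above allocated memory. Since the message suggests max_split_size_mb only when reserved is much larger than allocated, fragmentation does not look like the main cause here, though that is an inference from the message alone.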

Command:

accelerate launch train_dreambooth.py --not_cache_latents \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="beautiful girl in xxlichxx style" \
  --class_prompt="artstyle" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=1 \
  --max_train_steps=4500

Reproduction

No response

Logs

No response

System Info

diffusers version: 0.9.0
Platform: Linux-5.15.0-56-generic-x86_64-with-glibc2.35
Python version: 3.10.6
PyTorch version (GPU?): 1.12.1+cu116 (True)
Huggingface_hub version: 0.11.1
Transformers version: 4.25.1
Using GPU in script?: RTX4090
Using distributed or parallel set-up in script?: no

ShivamShrirao / diffusers