TheLastBen / fast-stable-diffusion

fast-stable-diffusion + DreamBooth
MIT License

CUDA out of memory #2042

Open Zphyr00 opened 1 year ago

Zphyr00 commented 1 year ago

Two and a half days ago, a CUDA out of memory error started appearing simultaneously on two accounts running two different training sessions with different pictures. And yes, I checked the resolution of every picture. The problem occurs only during training; Automatic1111 handles even complex tasks without problems.
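(For anyone debugging the same thing: a quick sanity check before training is to query how much VRAM is actually free on the assigned Colab GPU. This is a minimal sketch, not something the notebook runs itself.)

# Sketch: report free/total VRAM on the current GPU (not part of the notebook).
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info()  # (free, total) in bytes on the current device
print(f"free:  {free_bytes / 1024**3:.2f} GiB")
print(f"total: {total_bytes / 1024**3:.2f} GiB")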

TheLastBen commented 1 year ago

What resolution are you training at?

Zphyr00 commented 1 year ago

768 on one and 1024 on the other, both on SD 2.1 768.
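(Side note on why resolution matters here: Stable Diffusion's VAE downsamples images by a factor of 8, so the UNet's activation memory grows roughly with the square of the training resolution. The back-of-envelope sketch below is an assumption-laden estimate that ignores attention, batch size, precision, and gradient checkpointing.)

# Rough sketch: relative UNet activation footprint vs. training resolution.
# Assumes an 8x VAE downsample, so a 512px image becomes a 64x64 latent grid.
def relative_cost(resolution: int, base: int = 512) -> float:
    latent = resolution // 8
    base_latent = base // 8
    return (latent * latent) / (base_latent * base_latent)

for res in (512, 640, 768, 1024):
    print(f"{res}px: ~{relative_cost(res):.2f}x the activations of training at 512px")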

iqddd commented 1 year ago

Same problem. Using free Colab, training at 640x640 on SD 2.1-512px. The problem appears at the text_encoder training stage:

Progress:|                         |  0% 1/1915 [00:08<4:42:14,  8.85s/it, loss=0.0194, lr=6e-7]
Traceback (most recent call last):
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 803, in <module>
    main()
  File "/content/diffusers/examples/dreambooth/train_dreambooth.py", line 690, in main
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/utils/operations.py", line 507, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/usr/local/lib/python3.9/dist-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/unet_2d_condition.py", line 632, in forward
    sample = upsample_block(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/unet_2d_blocks.py", line 1805, in forward
    hidden_states = torch.utils.checkpoint.checkpoint(
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/unet_2d_blocks.py", line 1798, in custom_forward
    return module(*inputs, return_dict=return_dict)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/transformer_2d.py", line 265, in forward
    hidden_states = block(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/attention.py", line 324, in forward
    ff_output = self.ff(norm_hidden_states)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/attention.py", line 382, in forward
    hidden_states = module(hidden_states)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/diffusers/models/attention.py", line 429, in forward
    return hidden_states * self.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 14.75 GiB total capacity; 13.33 GiB already allocated; 6.81 MiB free; 13.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Progress:|                         |  0% 1/1915 [00:09<5:06:52,  9.62s/it, loss=0.0194, lr=6e-7]
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/content/diffusers/examples/dreambooth/train_dreambooth.py', '--train_only_text_encoder', '--image_captions_filename', '--train_text_encoder', '--dump_only_text_encoder', '--pretrained_model_name_or_path=/content/stable-diffusion-v2-512', '--instance_data_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/DemiRose_640_v21_fast/instance_images', '--output_dir=/content/models/DemiRose_640_v21_fast', '--captions_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/DemiRose_640_v21_fast/captions', '--instance_prompt=', '--seed=229081', '--resolution=640', '--mixed_precision=fp16', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--use_8bit_adam', '--learning_rate=6e-07', '--lr_scheduler=linear', '--lr_warmup_steps=0', '--max_train_steps=1915']' returned non-zero exit status 1.
Something went wrong
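(The OOM message above itself suggests setting max_split_size_mb when reserved memory is much larger than allocated memory. One way to apply that in a Colab cell before launching training is sketched below; this is a generic PyTorch allocator hint, not part of the fast-stable-diffusion notebook, and it only reduces fragmentation rather than freeing VRAM. The 128 MiB value is an example, not a recommendation from this repo.)

# Sketch: set the allocator hint recommended by the error message before the
# training subprocess creates its CUDA context.
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
# Any subsequent !accelerate launch ... in the notebook inherits this variable.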
TheLastBen commented 1 year ago

Fixed now, use the latest notebook.