d8ahazard / sd_dreambooth_extension

1.86k stars · 282 forks

[Bug]: Completely unable to train any LORA with CUDA out of memory error #1457

Closed daszzzpg closed 6 months ago

daszzzpg commented 8 months ago

Is there an existing issue for this?
  • I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

What happened?

I was trying to use the A1111 Dreambooth extension to train an SDXL model but failed (4070 Ti, 12 GB). Originally it was super slow, so I searched the internet and disabled NVIDIA's system memory fallback option. Then it shows a CUDA out of memory error like the one below.

However, when I switched to an SD1.5 model, it still gives me this error!

Steps to reproduce the problem

  1. Pick either a SD1.5 or SDXL model
  2. Create
  3. Train
  4. Error

Commit and libraries

Starting at Initializing Dreambooth and ending several lines below at [+] bitsandbytes version 0.35.4 installed.

Command Line Arguments

set COMMANDLINE_ARGS=--no-gradio-queue --no-half-vae --xformers --medvram

Console logs

OM Detected, reducing batch/grad size to 0/2.█████████████▊         | 4/5 [00:00<00:00,  5.44it/s]
Traceback (most recent call last):
  File "G:\AI\SDNEW\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\memory.py", line 126, in decorator
    return function(batch_size, grad_size, prof, *args, **kwargs)
  File "G:\AI\SDNEW\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 477, in inner_loop
    unet.to(accelerator.device, dtype=weight_dtype)
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
    param_applied = fn(param)
  File "G:\AI\SDNEW\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.99 GiB total capacity; 10.95 GiB already allocated; 0 bytes free; 11.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Loading unet...:  80%|████████████████████████████████████▊         | 4/5 [00:02<00:00,  1.82it/s]
Traceback (most recent call last):
  File "G:\AI\SDNEW\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\ui_functions.py", line 735, in start_training
    result = main(class_gen_method=class_gen_method)
  File "G:\AI\SDNEW\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1976, in main
    return inner_loop()
  File "G:\AI\SDNEW\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\memory.py", line 124, in decorator
    raise RuntimeError("No executable batch size found, reached zero.")
RuntimeError: No executable batch size found, reached zero.
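The allocator hint at the end of the OOM message suggests fragmentation mitigations. A hedged sketch of the usual workarounds for this class of error (untested against this setup; the `max_split_size_mb` value and the switch from `--medvram` to `--lowvram` are suggestions, not a confirmed fix — on Windows these would go in `webui-user.bat` using `set` instead of `export`):

```shell
# Suggestion only, not a confirmed fix: tell PyTorch's CUDA caching allocator
# to cap block splits, which the error message recommends when reserved >> allocated.
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:128"

# --lowvram offloads more aggressively than --medvram, trading speed for VRAM headroom.
export COMMANDLINE_ARGS="--no-gradio-queue --no-half-vae --xformers --lowvram"

echo "$PYTORCH_CUDA_ALLOC_CONF"
```

Both variables must be set before the webui process starts, since `PYTORCH_CUDA_ALLOC_CONF` is read when PyTorch initializes CUDA.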

Additional information

No response

d8ahazard commented 8 months ago

Yeah, there's really not a lot I can do about running out of VRAM.


github-actions[bot] commented 7 months ago

This issue is stale because it has been open for 14 days with no activity. Remove the stale label or comment, or this will be closed in 30 days.