d8ahazard / sd_dreambooth_extension


Adam 8-bit showing error and then running out of memory #237

Closed ProgrammingDinosaur closed 1 year ago

ProgrammingDinosaur commented 1 year ago

```
venv "D:\Programs\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 98947d173e3f1667eba29c904f681047dea9de90
Installing requirements for Web UI
Installing requirements for Dreambooth
Checking Dreambooth requirements.
Checking/upgrading existing torch/torchvision installation
Checking torch and torchvision versions
Dreambooth revision is f0e4061ab102a56fd5891e393967a0889b494936
Diffusers version is 0.7.2.
Torch version is 1.12.1+cu116.
Torch vision version is 0.13.1+cu116.

Launching Web UI with arguments: --xformers
Dreambooth API layer loaded
LatentInpaintDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.54 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Loading weights [3e16efc8] from D:\Programs\stable-diffusion-webui\models\Stable-diffusion\sd-v1-5-inpainting.ckpt
Global Step: 440000
Applying xformers cross attention optimization.
Model loaded.
Loaded a total of 0 textual inversion embeddings.
Embeddings:
Running on local URL: http://127.0.0.1:7860
```

Have you read the Readme? Yes

Have you completely restarted the stable-diffusion-webUI, not just reloaded the UI? Yes

Have you updated Dreambooth to the latest revision? Yes

Have you updated the Stable-Diffusion-WebUI to the latest version? Yes

No, really. Please save us both some trouble and update the SD-WebUI and Extension and restart before posting this. Reply 'OK' Below to acknowledge that you did this. OK

Describe the bug

When training with Adam 8-bit, an error message pops up saying there was an error loading a DLL from bitsandbytes; training then continues but runs out of GPU memory. I am using xformers and the recommended settings for low VRAM usage.
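For context, my understanding is that the optimizer selection works roughly like the sketch below (a simplified illustration, not the extension's actual train_dreambooth.py code; the function name and arguments are assumptions): when the bitsandbytes import fails, training silently falls back to full-precision AdamW, whose optimizer state is what pushes the 16GB card out of memory.

```python
import torch

def build_optimizer(params, lr, use_8bit_adam=True):
    """Illustrative sketch of 8-bit Adam selection with a fallback path."""
    if use_8bit_adam:
        try:
            import bitsandbytes as bnb
            # 8-bit Adam stores quantized moment buffers, saving several GB of VRAM.
            return bnb.optim.AdamW8bit(params, lr=lr)
        except Exception as e:
            # This is the branch hit when the bitsandbytes DLL fails to load
            # ([WinError 193] in the logs below); training keeps going regardless.
            print(f"Exception importing 8bit adam: {e}")
    # Fallback: torch's AdamW keeps full-precision exp_avg / exp_avg_sq buffers
    # for every parameter, which is what triggers the CUDA OOM later on.
    return torch.optim.AdamW(params, lr=lr)
```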

Provide logs

```
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary D:\Programs\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
Exception importing 8bit adam: [WinError 193] %1 is not a valid application
Scheduler Loaded
Allocated: 0.2GB Reserved: 0.2GB

Running training
 Num examples = 1500
 Num batches each epoch = 1500
 Num Epochs = 1
 Instantaneous batch size per device = 1
 Total train batch size (w. parallel, distributed & accumulation) = 1
 Gradient Accumulation steps = 1
 Total optimization steps = 1200
Steps: 0%| | 0/1200 [00:00<?, ?it/s]
Exception while training: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 16.00 GiB total capacity; 14.96 GiB already allocated; 0 bytes free; 15.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Allocated: 14.9GB Reserved: 15.2GB

Traceback (most recent call last):
  File "D:\Programs\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 986, in main
    optimizer.step()
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\accelerate\optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 338, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 285, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\optim\lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\optim\optimizer.py", line 113, in wrapper
    return func(*args, **kwargs)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\optim\adamw.py", line 161, in step
    adamw(params_with_grad,
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\optim\adamw.py", line 218, in adamw
    func(params,
  File "D:\Programs\stable-diffusion-webui\venv\lib\site-packages\torch\optim\adamw.py", line 309, in _single_tensor_adamw
    denom = (exp_avg_sq.sqrt() / bias_correction2_sqrt).add_(eps)
RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 16.00 GiB total capacity; 14.96 GiB already allocated; 0 bytes free; 15.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
CLEANUP:
Allocated: 14.9GB Reserved: 15.2GB

Cleanup Complete.
Allocated: 14.9GB Reserved: 15.2GB

Steps: 0%| | 0/1200 [00:28<?, ?it/s]
Training completed, reloading SD Model.
Allocated: 0.0GB Reserved: 7.6GB

Memory output: {'VRAM cleared.': '0.0/0.0GB', 'Loaded model.': '0.0/0.0GB', 'Scheduler Loaded': '0.2/0.2GB', 'Exception while training: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 16.00 GiB total capacity; 14.96 GiB already allocated; 0 bytes free; 15.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF': '14.9/15.2GB', 'CLEANUP: ': '14.9/15.2GB', 'Cleanup Complete.': '14.9/15.2GB', 'Training completed, reloading SD Model.': '0.0/7.6GB'}
Re-applying optimizations...
Returning result: Training finished. Total lifetime steps: 0
```

Environment

What OS? Windows

If Windows - WSL or native? Native

What GPU are you using? 3080 Mobile, 16GB VRAM

Screenshots/Config If the issue is specific to an error while training, please provide a screenshot of training parameters or the db_config.json file from /models/dreambooth/MODELNAME/db_config.json

[screenshot attached: model]

saruzaru commented 1 year ago

Same here, I run it on a 3080 Ti 12GB though.

ProgrammingDinosaur commented 1 year ago

> Same here, I run it on a 3080 Ti 12GB though.

I checked these two similar issues, but they didn't help me; maybe they'll be useful to you.

https://github.com/d8ahazard/sd_dreambooth_extension/issues/7 https://github.com/d8ahazard/sd_dreambooth_extension/issues/3

EDIT: I followed this comment on issue #7 and bypassed the error:

> "I fixed this by changing the line in cextension.py:
>
> `binary_name = evaluate_cuda_setup()`
>
> to hardcode the cuda DLL path to `libbitsandbytes_cuda116.dll`"
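In case it helps, this is a minimal sketch of what that edit looks like inside bitsandbytes' cextension.py (illustrative only; the exact surrounding code differs between bitsandbytes versions):

```python
# cextension.py (inside the installed bitsandbytes package) -- illustrative sketch.

# Original auto-detection, which on this Windows setup resolves to the
# CPU-only libbitsandbytes_cpu.so and then fails to load:
# binary_name = evaluate_cuda_setup()

# Hardcoded workaround quoted above, forcing the CUDA 11.6 Windows DLL instead:
binary_name = "libbitsandbytes_cuda116.dll"
```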

I'll leave the issue open, as it's still an error that needs to be addressed, I guess.

poondoggle commented 1 year ago

I am also experiencing the same issue on my new 12GB 3060. Since this was fixed 9 days ago, I'm reluctant to hardcode paths or use custom .dlls as per #7.

d8ahazard commented 1 year ago

Yep, I suck at life. Tried to make the install part smarter, but broke it worse. Fixed it with e66b34b.

YakuzaSuske commented 1 year ago

> Yep, I suck at life. Tried to make the install part smarter, but broke it worse. Fixed it with e66b34b.

Dude.... I feel you. Don't worry, we ain't mad at you or anything; everybody makes mistakes. Though..... sometimes your mistakes are funny, and I laugh when I encounter new issues whenever I update 😂. You have to be absolutely tired of saying "There, done. Now to slee-.... oh, more issues". It's just too funny for me, but again, this is still a WIP, and I respect you. For one, I can't complain because I'm able to run Dreambooth thanks to you, and for two, you really are dedicated to this and I respect that.

ProgrammingDinosaur commented 1 year ago

> Yep, I suck at life. Tried to make the install part smarter, but broke it worse. Fixed it with e66b34b.

No worries! What you're doing is awesome work. Thanks for taking the time to fix the issue.