IrisRainbowNeko / DreamArtist-stable-diffusion

stable diffusion webui with contrastive prompt tuning
873 stars · 53 forks

can't train cause of GPU memory issue #11

Open NeoNeetPro opened 2 years ago

NeoNeetPro commented 2 years ago

Hi. I tried dreamartist, but I get a GPU out of memory error. The error code is below. Can you please advise me?

Python 3.10.7 (tags/v3.10.7:6cc6b13, Sep 5 2022, 14:08:36) [MSC v.1933 64 bit (AMD64)]

```
Training at rate of 0.005 until step 3000
Preparing dataset...
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.99s/it]
  0%|                                                                                        | 0/3000 [00:03<?, ?it/s]
Applying cross attention optimization (Doggettx).
Error completing request
Arguments: ('test1', '0.005', 1, 'D:\Program Files\stable-diffusion-webui-dream\extensions\DreamArtist\imgs\train', 'textual_inversion', 512, 512, 3000, 500, 500, 'D:\Program Files\stable-diffusion-webui-dream\textual_inversion_templates\style_filewords.txt', True, False, '', '', 20, 0, 7, -1.0, 512, 512, 5.0, '', True, False, 1, 1) {}
Traceback (most recent call last):
  File "D:\Program Files\stable-diffusion-webui-dream\modules\ui.py", line 185, in f
    res = list(func(*args, **kwargs))
  File "D:\Program Files\stable-diffusion-webui-dream\webui.py", line 54, in f
    res = func(*args, **kwargs)
  File "D:\Program Files\stable-diffusion-webui-dream\extensions\DreamArtist\scripts\dream_artist\ui.py", line 30, in train_embedding
    embedding, filename = dream_artist.cptuning.train_embedding(*args)
  File "D:\Program Files\stable-diffusion-webui-dream\extensions\DreamArtist\scripts\dream_artist\cptuning.py", line 430, in train_embedding
    loss.backward()
  File "D:\Program Files\stable-diffusion-webui-dream\venv\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "D:\Program Files\stable-diffusion-webui-dream\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "D:\Program Files\stable-diffusion-webui-dream\venv\lib\site-packages\torch\autograd\function.py", line 253, in apply
    return user_fn(self, *args)
  File "D:\Program Files\stable-diffusion-webui-dream\repositories\stable-diffusion\ldm\modules\diffusionmodules\util.py", line 139, in backward
    input_grads = torch.autograd.grad(
  File "D:\Program Files\stable-diffusion-webui-dream\venv\lib\site-packages\torch\autograd\__init__.py", line 276, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 5.31 GiB already allocated; 0 bytes free; 6.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
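To make the figures in that message concrete: the useful comparison is between the failed request and the *slack* PyTorch has reserved but not handed out, since a large slack is what the `max_split_size_mb` fragmentation hint is about. A small, hypothetical helper (not part of the webui or PyTorch) doing that arithmetic:

```python
def oom_summary(request_mib, allocated_gib, reserved_gib):
    """Interpret the figures in a PyTorch 'CUDA out of memory' message.

    Returns (slack_gib, fragmentation_hint): slack is memory PyTorch has
    reserved from the driver but not handed out to tensors; when it
    exceeds the failed request, the error's max_split_size_mb advice is
    plausibly relevant (the cache holds enough bytes, just not in one
    contiguous block).
    """
    request_gib = request_mib / 1024
    slack_gib = reserved_gib - allocated_gib
    return round(slack_gib, 2), slack_gib >= request_gib

# Figures from the traceback above:
# 1024 MiB requested; 5.31 GiB allocated; 6.71 GiB reserved.
print(oom_summary(1024, 5.31, 6.71))  # (1.4, True)
```

Here the 1.4 GiB slack exceeds the 1 GiB request, which is exactly the `reserved >> allocated` case the error text mentions.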

mekrod commented 2 years ago

As the log says, you need 1024 MB more of free VRAM to run it. If that VRAM is being used by other software, you can free it and then run; otherwise you can't, and you would need a card with 10+ GB of VRAM, at least with those settings.

jhamilton0 commented 2 years ago

It's possible to get it running on an 8 GB video card by using a very small input image: under 300x300 pixels when not using reconstruction, and under 200x200 when that's on.

Unfortunately, after several tests, I haven't gotten any good results from doing that. Even after 10+ hours of training, the embeddings seem to produce mostly just random shapes and textures. I'm not sure whether that's an issue with the small resolution of the input images, the low VRAM, the content of the input images, or whether I've just set something up incorrectly.
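For rough intuition about why shrinking the input helps so much: self-attention activations grow with the square of the token count, i.e. roughly the fourth power of the image side length. A back-of-the-envelope sketch (purely illustrative, assuming Stable Diffusion's 8x latent downscale):

```python
def attention_cost_ratio(side_a, side_b, downscale=8):
    """Relative size of the largest self-attention map for two square
    image sizes: tokens = (side // downscale)**2, and a full attention
    map scales as tokens**2."""
    tokens = lambda side: (side // downscale) ** 2
    return tokens(side_a) ** 2 / tokens(side_b) ** 2

# Training at 300x300 instead of 512x512:
print(round(attention_cost_ratio(512, 300), 1))  # ~9x smaller attention maps
```

This ignores the fixed model weights and optimizer state, so total VRAM does not shrink 9x, but it shows why the attention-heavy activations are the first thing that stops fitting at 512x512.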

NeoNeetPro commented 2 years ago

Thanks for the advice! I have a GTX 1070 8GB, so it seems I lacked GPU memory. I was able to start training at a resolution of 384x384. I trained for about 3000 steps but did not get good results....

IrisRainbowNeko commented 2 years ago

> It's possible to get it running on an 8GB video card by using a very small input image- under 300x300 pixels when not using reconstruction, and under 200x200 when that's on.
>
> Unfortunately, after several tests, I haven't gotten any good results from doing that- even after 10+ hours of training, the embeddings seem to produce mostly just random shapes and textures. I'm not sure if that's an issue with the small resolution of the input images, the low VRAM, the content of the input images, or if I've just set something up incorrectly, however.

The original instructions were quite rough, so you may not have been using DreamArtist correctly. Please try again following the new instructions.

IrisRainbowNeko commented 2 years ago

> Thanks for the advice! I have a GTX1070 8GB, so it seemed to lack GPU memory. I was able to start the train with a resolution of 384*384. I trained about 3000steps but did not get good results....

The original instructions were quite rough, so you may not have been using DreamArtist correctly. Please try again following the new instructions.

RevyaM commented 1 year ago

Got the same result running on an RTX 3060 Ti with xformers, following the new instructions. I tried different models (with less and less VRAM usage, going from 6 GB to 4 GB to 2 GB models). The result is pretty much the same: DreamArtist fills the unused VRAM (up to 7.2-7.3 GB of total usage), then throws the error mentioned in the post. Also, the VRAM is not released unless you close the webui.
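On the VRAM not being released: PyTorch's caching allocator deliberately keeps freed blocks reserved for reuse, so memory often looks "stuck" in `nvidia-smi` even after a failed run. A sketch of the usual manual release (hedged: whether the webui exposes a place to call this is a separate question):

```python
import gc

def free_cached_vram():
    """Drop unreachable Python objects, then ask PyTorch to return
    cached CUDA blocks to the driver. Returns True only if a cache
    flush was actually performed (torch installed and a GPU present)."""
    gc.collect()
    try:
        import torch
    except ImportError:  # torch not installed: nothing to flush
        return False
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False
```

Note this only returns blocks the allocator has *cached*; tensors still referenced by the training loop stay allocated, which is why closing the webui is the only guaranteed full reset.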

launch args:

```
set COMMANDLINE_ARGS=--opt-split-attention --xformers --autolaunch
set ACCELERATE=
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128
```
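For reference, `PYTORCH_CUDA_ALLOC_CONF` is a comma-separated list of `option:value` pairs. A tiny, hypothetical parser (not something PyTorch ships) just to illustrate the syntax being set above:

```python
def parse_alloc_conf(conf):
    """Split a PYTORCH_CUDA_ALLOC_CONF string ('opt:val,opt:val')
    into a dict, converting numeric values along the way."""
    out = {}
    for pair in conf.split(","):
        key, _, val = pair.partition(":")
        out[key.strip()] = float(val) if "." in val else int(val)
    return out

print(parse_alloc_conf("garbage_collection_threshold:0.6,max_split_size_mb:128"))
# {'garbage_collection_threshold': 0.6, 'max_split_size_mb': 128}
```

With `max_split_size_mb:128` set and the error still occurring, fragmentation alone is unlikely to be the problem here; the reserved and allocated figures in the traceback below are nearly equal, so the card is simply full.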

the error itself:

```
20, 0, 7, -1.0, 512, 512, '5.0', '', True, False, 1, 1, 1.0, 25.0, 1.0, 25.0, 0.9, 0.999, False, 1, False, '0.000005') {}
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\call_queue.py", line 45, in f
    res = list(func(*args, **kwargs))
  File "D:\stable-diffusion-webui\modules\call_queue.py", line 28, in f
    res = func(*args, **kwargs)
  File "D:\stable-diffusion-webui\extensions\DreamArtist-sd-webui-extension\scripts\dream_artist\ui.py", line 30, in train_embedding
    embedding, filename = dream_artist.cptuning.train_embedding(*args)
  File "D:\stable-diffusion-webui\extensions\DreamArtist-sd-webui-extension\scripts\dream_artist\cptuning.py", line 542, in train_embedding
    loss.backward()
  File "D:\stable-diffusion-webui\venv\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "D:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "D:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\function.py", line 253, in apply
    return user_fn(self, *args)
  File "D:\stable-diffusion-webui\venv\lib\site-packages\torch\utils\checkpoint.py", line 146, in backward
    torch.autograd.backward(outputs_with_grad, args_with_grad)
  File "D:\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 6.32 GiB already allocated; 0 bytes free; 6.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```