AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0
135.34k stars 25.84k forks

[Bug]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB. GPU 0 has a total capacty of 12.00 GiB of which 4.65 GiB is free #16114

Open silkysmoothgames opened 6 days ago

silkysmoothgames commented 6 days ago

Checklist

What happened?

I was using version 1.8.0 and everything was fine. Then I tried to install Ultimate SD Upscale and my webui broke. After that I removed everything from my PC, including Git, Python, and all caches, and made a clean webui install, but it didn't help. 1.5 models load normally, but when I try to load ANY XL model I get this:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB. GPU 0 has a total capacty of 12.00 GiB of which 4.65 GiB is free. Of the allocated memory 5.09 GiB is allocated by PyTorch, and 154.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Please help me, I don't know why this happens. I tried 1.8.0 again but got the same error. 1.5 works pretty fine somehow, but XL just doesn't load.
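As a rough sanity check of why 4.65 GiB free is not enough for an XL checkpoint (the parameter counts below are approximate ballpark figures, not values taken from this log):

```python
# Back-of-the-envelope estimate of an SDXL checkpoint's fp16 weight
# footprint. Parameter counts are approximate, illustrative values.
unet_params = 2.6e9        # SDXL UNet, approx.
text_enc_params = 0.8e9    # both text encoders combined, approx.
vae_params = 0.08e9        # VAE, approx.

bytes_per_param = 2        # fp16 = 2 bytes per weight
total_bytes = (unet_params + text_enc_params + vae_params) * bytes_per_param
total_gib = total_bytes / 2**30

print(f"approx. SDXL fp16 weight footprint: {total_gib:.1f} GiB")
# → approx. SDXL fp16 weight footprint: 6.5 GiB
```

So the weights alone need roughly 6-7 GiB; with only 4.65 GiB reported free (the old 1.5 model is still resident while the XL one loads), even the final tiny 14 MiB allocation has nowhere to go.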

Steps to reproduce the problem

Error

What should have happened?

Work like before

What browsers do you use to access the UI ?

Google Chrome

Sysinfo

Ryzen 2600, GeForce RTX 3060 12 GB, Windows 10, 16 GB RAM

Console logs

Version: v1.9.4
Commit hash: feee37d75f1b168768014e4634dcb156ee649c05
Launching Web UI with arguments: --opt-sdp-attention
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Loading weights [71e14760e2] from D:\automatic 1.94\stable-diffusion-webui\models\Stable-diffusion\lazymixRealAmateur_v30b.safetensors
Creating model from config: D:\automatic 1.94\stable-diffusion-webui\configs\v1-inference.yaml
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Startup time: 20.1s (prepare environment: 3.7s, import torch: 5.1s, import gradio: 1.3s, setup paths: 1.5s, initialize shared: 1.9s, other imports: 0.9s, load scripts: 1.4s, create ui: 0.5s, gradio launch: 3.8s).
Applying attention optimization: sdp... done.
Model loaded in 10.5s (load weights from disk: 0.7s, create model: 0.6s, apply weights to model: 7.5s, calculate empty prompt: 1.6s).
Reusing loaded model lazymixRealAmateur_v30b.safetensors [71e14760e2] to load xl\ponyDiffusionV6XL_v6StartWithThisOne.safetensors
Calculating sha256 for D:\automatic 1.94\stable-diffusion-webui\models\Stable-diffusion\xl\ponyDiffusionV6XL_v6StartWithThisOne.safetensors: 67ab2fd8ec439a89b3fedb15cc65f54336af163c7eb5e4f2acc98f090a29b0b3
Loading weights [67ab2fd8ec] from D:\automatic 1.94\stable-diffusion-webui\models\Stable-diffusion\xl\ponyDiffusionV6XL_v6StartWithThisOne.safetensors
Creating model from config: D:\automatic 1.94\stable-diffusion-webui\repositories\generative-models\configs\inference\sd_xl_base.yaml
changing setting sd_model_checkpoint to xl\ponyDiffusionV6XL_v6StartWithThisOne.safetensors: OutOfMemoryError
Traceback (most recent call last):
  File "D:\automatic 1.94\stable-diffusion-webui\modules\options.py", line 165, in set
    option.onchange()
  File "D:\automatic 1.94\stable-diffusion-webui\modules\call_queue.py", line 13, in f
    res = func(*args, **kwargs)
  File "D:\automatic 1.94\stable-diffusion-webui\modules\initialize_util.py", line 181, in <lambda>
    shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: sd_models.reload_model_weights()), call=False)
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_models.py", line 879, in reload_model_weights
    load_model(checkpoint_info, already_loaded_state_dict=state_dict)
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_models.py", line 748, in load_model
    load_model_weights(sd_model, checkpoint_info, state_dict, timer)
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_models.py", line 393, in load_model_weights
    model.load_state_dict(state_dict, strict=False)
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_disable_initialization.py", line 223, in <lambda>
    module_load_state_dict = self.replace(torch.nn.Module, 'load_state_dict', lambda *args, **kwargs: load_state_dict(module_load_state_dict, *args, **kwargs))
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_disable_initialization.py", line 221, in load_state_dict
    original(module, state_dict, strict=strict)
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2138, in load_state_dict
    load(self, state_dict)
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
    load(child, child_state_dict, child_prefix)
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
    load(child, child_state_dict, child_prefix)
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2126, in load
    load(child, child_state_dict, child_prefix)
  [Previous line repeated 6 more times]
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 2120, in load
    module._load_from_state_dict(
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_disable_initialization.py", line 225, in <lambda>
    linear_load_from_state_dict = self.replace(torch.nn.Linear, '_load_from_state_dict', lambda *args, **kwargs: load_from_state_dict(linear_load_from_state_dict, *args, **kwargs))
  File "D:\automatic 1.94\stable-diffusion-webui\modules\sd_disable_initialization.py", line 191, in load_from_state_dict
    module._parameters[name] = torch.nn.parameter.Parameter(torch.zeros_like(param, device=device, dtype=dtype), requires_grad=param.requires_grad)
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\_meta_registrations.py", line 4507, in zeros_like
    res = aten.empty_like.default(
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\_ops.py", line 448, in __call__
    return self._op(*args, **kwargs or {})
  File "D:\automatic 1.94\stable-diffusion-webui\venv\lib\site-packages\torch\_refs\__init__.py", line 4681, in empty_like
    return torch.empty_permuted(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 14.00 MiB. GPU 0 has a total capacty of 12.00 GiB of which 4.65 GiB is free. Of the allocated memory 5.09 GiB is allocated by PyTorch, and 154.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Additional information

No response
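The error text itself suggests experimenting with max_split_size_mb to reduce fragmentation. On Windows that environment variable can be set in webui-user.bat before launch; this is a hedged sketch, and the 512 value is only an example to experiment with, not a recommendation from this thread:

```shell
rem webui-user.bat (excerpt) -- illustrative values, adjust to taste
set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
set COMMANDLINE_ARGS=--opt-sdp-attention

call webui.bat
```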

bhanuj10 commented 5 days ago

Try disabling extensions and keep only one model in the models folder. Another possibility is that you may be using your iGPU for loading models; try adding "--execution-provider CUDA" to the webui file. Also try resetting the settings to defaults.

silkysmoothgames commented 5 days ago

> Try disabling extensions and keep only one model in the models folder. Another possibility is that you may be using your iGPU for loading models; try adding "--execution-provider CUDA" to the webui file. Also try resetting the settings to defaults.

I reinstalled webui, Git, Python, and all libraries from scratch; it doesn't help. There are no extensions other than the built-in ones, and I tried disabling those too. Still nothing works. I tried to launch webui with "--execution-provider CUDA" but got this error: `launch.py: error: unrecognized arguments: --execution-provider cuda`. It seems something broke in my Windows, but I don't understand what.

silkysmoothgames commented 5 days ago

It seems something happened with the memory and swap file in Windows. I increased the size manually and now it works fine, but at peak it uses the full 12 GB of VRAM and 40 GB of RAM + swap file (16 GB RAM + 24 GB swap). Is that okay?

bhanuj10 commented 5 days ago

No, it's not okay. I have a laptop with 8 GB VRAM and 16 GB RAM, and it occupies only 4-5 GB of VRAM and 4 GB of RAM. Maybe you are looking at the total RAM usage at that time and not what the application is actually using.
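To see what the webui process itself is actually using (as opposed to the system-wide total in Task Manager), a small script like this can help. It assumes the third-party psutil package is installed, and matching processes by the name "python" is only a heuristic:

```python
import psutil

# Sum the resident memory (RSS) of every python process; on Windows
# the webui runs under python.exe, so name matching is a heuristic.
total_rss = 0
for proc in psutil.process_iter(["name", "memory_info"]):
    try:
        if proc.info["name"] and "python" in proc.info["name"].lower():
            total_rss += proc.info["memory_info"].rss
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        continue  # process exited or is protected; skip it

print(f"python processes RSS: {total_rss / 2**30:.2f} GiB")
```

Comparing this number against the Task Manager total makes it clear whether SD itself is eating 40 GB or whether that figure includes everything else on the system.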

This link explains the optimization options in SD: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Optimizations

Since you are using the --opt-sdp-attention argument, that could be what caused this error.

Please try without it and tell me the results.

My other suggestion: if you are not using a venv, use one.

cd stable-diffusion-webui

python -m venv venv

cd venv/Scripts/

.\activate

cd ../../

.\webui-user.bat

The libraries then get installed in the venv, so you can delete SD together with the installed libraries without leaving any unnecessary libraries behind (avoids a storage-space problem).

silkysmoothgames commented 5 days ago

> Maybe you are looking at the total RAM usage at that time and not what the application is actually using.

Yeah, but I close all other applications when I work with SD because I have only 16 GB of RAM, so 80-90% of the total memory usage is SD.

> Please try without it and tell me the results.

I changed --opt-sdp-attention to --xformers; it doesn't seem to change memory use much. It still consumes up to 40 GB of total memory when I try to generate a 1024x1024 image with an XL model. Very weird. Before I tried to install Ultimate SD Upscale everything was OK. Maybe that extension broke something in my Windows?

> My other suggestion: if you are not using a venv, use one.

Hmm, I think I do, because all the libraries are installed in that directory.

bhanuj10 commented 4 days ago

I also use Ultimate SD Upscale, but I didn't get any issues like this with smaller models; maybe it's specific to XL models. XL models in Stable Diffusion can indeed consume a significant amount of memory and VRAM due to their size and complexity. Can you tell me exactly which model you use?

silkysmoothgames commented 4 days ago

> I also use Ultimate SD Upscale, but I didn't get any issues like this with smaller models; maybe it's specific to XL models. XL models in Stable Diffusion can indeed consume a significant amount of memory and VRAM due to their size and complexity. Can you tell me exactly which model you use?

Any XL model, for example Pony XL. I don't use any extensions now except the built-in ones, so maybe something broke in Windows, but I don't know what.