vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: What default settings does it use that demand 14 GB of VRAM for an SD1.5 model? #3572

Open Seedmanc opened 4 hours ago

Seedmanc commented 4 hours ago

Issue Description

I installed it via Pinokio, and running anything above 512x512 at batch 1 goes OOM on my 8 GB RTX 2070S. I never had anything like that with other SD software. Even enabling Nvidia's system fallback doesn't help, this is ridiculous. How do I configure it so that it doesn't perform worse than others? I went for SD.Next hoping I'd be able to run modern models that Forge can't, and here I'm left unable to use even the oldest one. There are way too many settings to brute-force through their combinations on my own.

Version Platform Description

Win10 x64, 20GB RAM.

20:58:26-298795 INFO Starting SD.Next
20:58:26-308950 INFO Logger: file="f:\pinokio\api\SD-Next.git\app\sdnext.log" level=INFO size=97458 mode=append
20:58:26-308950 INFO Python: version=3.10.15 platform=Windows bin="f:\pinokio\api\SD-Next.git\app\venv\Scripts\python.exe" venv="f:\pinokio\api\SD-Next.git\app\venv"
20:58:26-488706 INFO Version: app=sd.next updated=2024-11-02 hash=65ddc611 branch=master url=https://github.com/vladmandic/automatic/tree/master ui=main
20:58:27-278944 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 42 Stepping 7, GenuineIntel system=Windows release=Windows-10-10.0.19045-SP0 python=3.10.15
20:58:27-288602 INFO Args: ['--use-cuda']
20:58:27-333397 INFO CUDA: nVidia toolkit detected
20:58:27-598652 INFO Verifying requirements
20:58:27-608594 INFO Verifying packages
20:58:27-668815 INFO Extensions: disabled=[]
20:58:27-668815 INFO Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg'] extensions-builtin
20:58:27-678921 INFO Extensions: enabled=[] extensions
20:58:27-678921 INFO Startup: quick launch
20:58:27-678921 INFO Extensions: disabled=[]
20:58:27-678921 INFO Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg'] extensions-builtin
20:58:27-688612 INFO Extensions: enabled=[] extensions
20:58:27-698470 INFO Command line args: ['--use-cuda'] use_cuda=True
20:58:38-309021 INFO System packages: {'torch': '2.5.1+cu124', 'diffusers': '0.32.0.dev0', 'gradio': '3.43.2', 'transformers': '4.46.1', 'accelerate': '1.0.1'}
20:58:39-498829 INFO Device detect: memory=8.0 optimization=medvram
20:58:39-509009 INFO Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product" mode=no_grad
20:58:39-569152 INFO Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16 unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False deterministic=False test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
20:58:40-428618 INFO Device: device=NVIDIA GeForce RTX 2070 SUPER n=1 arch=sm_90 capability=(7, 5) cuda=12.4 cudnn=90100 driver=560.81
20:58:40-678641 INFO Available VAEs: path="models\VAE" items=0
20:58:40-678641 INFO Available UNets: path="models\UNET" items=0
20:58:40-678641 INFO Available TEs: path="models\Text-encoder" items=0
20:58:40-678641 INFO Disabled extensions: ['sdnext-modernui']
20:58:40-708635 INFO Available Models: path="models\Stable-diffusion" items=30 time=0.02
20:58:40-828603 INFO Available Yolo: path="models\yolo" items=6 downloaded=0
20:58:40-968287 INFO Available LoRAs: path="models\Lora" items=0 folders=2 time=0.00
20:58:42-178467 INFO Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
20:58:42-218510 INFO Available Upscalers: items=53 downloaded=0 user=0 time=0.04 types=['None', 'Lanczos', 'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
20:58:42-248951 INFO Available Styles: folder="models\styles" items=288 time=0.03
20:58:42-258470 INFO UI theme: type=Standard name="black-teal"
20:58:44-779036 INFO Extension list is empty: refresh required
20:58:45-958965 INFO Local URL: http://127.0.0.1:7860/
20:58:46-409037 INFO [AgentScheduler] Task queue is empty
20:58:46-409037 INFO [AgentScheduler] Registering APIs
20:58:46-648893 INFO Load model: select="stable-diffusion!sd-1-4 [764ebd128c]"
20:58:46-658906 INFO Autodetect model: detect="Stable Diffusion" class=StableDiffusionPipeline file="f:\pinokio\api\SD-Next.git\app\models\Stable-diffusion\stable-diffusion!sd-1-4.safetensors" size=2034MB
Diffusers 3.19it/s ████ 50% 3/6 00:00 00:00 Loading pipeline components...
Diffusers 1.30s/it ████████ 100% 6/6 00:07 00:00 Loading pipeline components...
20:58:54-790835 INFO Load network: type=embeddings loaded=0 skipped=0 time=0.00
20:58:55-128450 INFO Load model: time=8.19 load=8.13 native=512 memory={'ram': {'used': 2.98, 'total': 19.97}, 'gpu': {'used': 1.06, 'total': 8.0}, 'retries': 0, 'oom': 0}
20:58:55-128450 INFO Startup time: 27.43 torch=8.02 onnx=0.05 gradio=2.05 diffusers=0.23 libraries=2.55 samplers=0.05 extensions=1.34 detailer=0.13 ui-networks=0.36 ui-txt2img=0.36 ui-img2img=0.20 ui-control=0.38 ui-extras=0.10 ui-models=0.12 ui-gallery=0.08 ui-settings=0.88 ui-extensions=0.58 ui-defaults=0.38 launch=0.28 api=0.27 app-started=0.41 checkpoint=8.48
20:59:39-234929 INFO MOTD: N/A
20:59:45-526194 INFO Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Pinokio/2.15.1 Chrome/110.0.5481.208 Electron/23.3.13 Safari/537.36

Relevant log output

20:58:54-790835 INFO     Load network: type=embeddings loaded=0 skipped=0 time=0.00
20:58:55-128450 INFO     Load model: time=8.19 load=8.13 native=512 memory={'ram': {'used': 2.98, 'total': 19.97}, 'gpu': {'used': 1.06, 'total': 8.0}, 'retries': 0, 'oom': 0}
20:58:55-128450 INFO     Startup time: 27.43 torch=8.02 onnx=0.05 gradio=2.05 diffusers=0.23 libraries=2.55 samplers=0.05 extensions=1.34 detailer=0.13 ui-networks=0.36 ui-txt2img=0.36 ui-img2img=0.20 ui-control=0.38 ui-extras=0.10 ui-models=0.12  
                         ui-gallery=0.08 ui-settings=0.88 ui-extensions=0.58 ui-defaults=0.38 launch=0.28 api=0.27 app-started=0.41 checkpoint=8.48
20:59:39-234929 INFO     MOTD: N/A
20:59:45-526194 INFO     Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Pinokio/2.15.1 Chrome/110.0.5481.208 Electron/23.3.13 Safari/537.36

21:01:50-947545 INFO     Base: class=StableDiffusionPipeline
Progress  1.20s/it █████████████████████████████████ 100% 20/20 00:23 00:00 Base
21:02:17-074698 INFO     Processed: images=0 its=0.00 time=26.14 timers={'encode': 1.04, 'args': 1.17, 'pipeline': 24.91, 'process': 0.02} memory={'ram': {'used': 3.63, 'total': 19.97}, 'gpu': {'used': 1.19, 'total': 8.0}, 'retries': 0, 'oom': 0}  

21:02:53-106999 INFO     Base: class=StableDiffusionPipeline
Progress ?it/s                                              0% 0/20 00:00 ? Base
21:02:54-570948 ERROR    Processing: step=base args={'prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([2, 77, 768])', 'negative_prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([2, 77, 768])', 'guidance_scale': 6, 'generator':
                         [<torch._C.Generator object at 0x000001CF3F544A90>, <torch._C.Generator object at 0x000001CF3F5441F0>], 'callback_on_step_end': <function diffusers_callback at 0x000001CF74877640>, 'callback_on_step_end_tensor_inputs':     
                         ['latents', 'prompt_embeds', 'negative_prompt_embeds'], 'num_inference_steps': 20, 'eta': 1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 640, 'height': 640} CUDA out of memory. Tried to allocate 1.22 GiB.  
                         GPU 0 has a total capacity of 8.00 GiB of which 219.00 MiB is free. Of the allocated memory 6.65 GiB is allocated by PyTorch, and 54.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large
                         try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
21:02:54-578845 ERROR    Processing: OutOfMemoryError
╭───────────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ F:\pinokio\api\SD-Next.git\app\modules\processing_diffusers.py:99 in process_base                                                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│    98 │   │   else:                                                                                                                                                                                                                                  │
│ ❱  99 │   │   │   output = shared.sd_model(**base_args)                                                                                                                                                                                              │
│   100 │   │   if isinstance(output, dict):                                                                                                                                                                                                           │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│   115 │   │   with ctx_factory():                                                                                                                                                                                                                    │
│ ❱ 116 │   │   │   return func(*args, **kwargs)                                                                                                                                                                                                       │
│   117                                                                                                                                                                                                                                                │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py:1020 in __call__                                                                                                             │
│                                                                                                                                                                                                                                                      │
│   1019 │   │   │   │   # predict the noise residual                                                                                                                                                                                                  │
│ ❱ 1020 │   │   │   │   noise_pred = self.unet(                                                                                                                                                                                                       │
│   1021 │   │   │   │   │   latent_model_input,                                                                                                                                                                                                       │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\accelerate\hooks.py:170 in new_forward                                                                                                                                                         │
│                                                                                                                                                                                                                                                      │
│   169 │   │   else:                                                                                                                                                                                                                                  │
│ ❱ 170 │   │   │   output = module._old_forward(*args, **kwargs)                                                                                                                                                                                      │
│   171 │   │   return module._hf_hook.post_forward(module, output)                                                                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\unets\unet_2d_condition.py:1216 in forward                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│   1215 │   │   │   │                                                                                                                                                                                                                                 │
│ ❱ 1216 │   │   │   │   sample, res_samples = downsample_block(                                                                                                                                                                                       │
│   1217 │   │   │   │   │   hidden_states=sample,                                                                                                                                                                                                     │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│                                                                                                               ... 4 frames hidden ...                                                                                                                │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\transformers\transformer_2d.py:442 in forward                                                                                                                                 │
│                                                                                                                                                                                                                                                      │
│   441 │   │   │   else:                                                                                                                                                                                                                              │
│ ❱ 442 │   │   │   │   hidden_states = block(                                                                                                                                                                                                         │
│   443 │   │   │   │   │   hidden_states,                                                                                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention.py:507 in forward                                                                                                                                                   │
│                                                                                                                                                                                                                                                      │
│    506 │   │                                                                                                                                                                                                                                         │
│ ❱  507 │   │   attn_output = self.attn1(                                                                                                                                                                                                             │
│    508 │   │   │   norm_hidden_states,                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention_processor.py:495 in forward                                                                                                                                         │
│                                                                                                                                                                                                                                                      │
│    494 │   │                                                                                                                                                                                                                                         │
│ ❱  495 │   │   return self.processor(                                                                                                                                                                                                                │
│    496 │   │   │   self,                                                                                                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention_processor.py:2477 in __call__                                                                                                                                       │
│                                                                                                                                                                                                                                                      │
│   2476 │   │   # TODO: add support for attn.scale when we move to Torch 2.1                                                                                                                                                                          │
│ ❱ 2477 │   │   hidden_states = F.scaled_dot_product_attention(                                                                                                                                                                                       │
│   2478 │   │   │   query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False                                                                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.22 GiB. GPU 0 has a total capacity of 8.00 GiB of which 219.00 MiB is free. Of the allocated memory 6.65 GiB is allocated by PyTorch, and 54.79 MiB is reserved by PyTorch but unallocated. If
 reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)        
21:02:56-921019 INFO     Processed: images=0 its=0.00 time=3.82 timers={'gc': 0.36, 'encode': 0.54, 'args': 0.56, 'process': 3.23} memory={'ram': {'used': 3.62, 'total': 19.97}, 'gpu': {'used': 2.89, 'total': 8.0}, 'retries': 1, 'oom': 1}
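For a sense of scale, the failed 1.22 GiB allocation inside `F.scaled_dot_product_attention` can be compared with the size of a fully materialized self-attention score tensor. The sketch below is back-of-the-envelope arithmetic only, not a diagnosis: the function name is hypothetical, and the head count (8), the VAE downsampling factor (8 per side), and the choice of batch size and element width are assumptions for illustration. The log does not establish which of these the failing allocation actually corresponds to, and memory-efficient attention kernels avoid building this tensor at all.

```python
def attention_scores_gib(width, height, batch, heads, bytes_per_element):
    """Size in GiB of one full (tokens x tokens) attention score tensor.

    Assumes an SD 1.x style UNet whose first attention layer runs at
    latent resolution, i.e. image size divided by 8 on each side.
    """
    tokens = (width // 8) * (height // 8)
    total_bytes = batch * heads * tokens * tokens * bytes_per_element
    return total_bytes / 2**30


# 640x640 request from the error above, assuming batch 2, 8 heads, 2-byte elements
print(f"{attention_scores_gib(640, 640, batch=2, heads=8, bytes_per_element=2):.2f} GiB")  # 1.22 GiB
# Same assumptions at the model's native 512x512
print(f"{attention_scores_gib(512, 512, batch=2, heads=8, bytes_per_element=2):.2f} GiB")  # 0.50 GiB
```

Because the token count grows with the square of the side length, the score tensor grows with its fourth power, so a modest step from 512 to 640 pixels already more than doubles this one buffer under these assumptions.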

Backend

Diffusers

UI

Standard

Branch

Master

Model

StableDiffusion 1.5

vladmandic commented 2 hours ago

> never had anything like that with other SD software. Even enabling Nvidia's system fallback doesn't help, this is ridiculous. How do I configure it so that it doesn't perform worse than others?

i can understand the frustration, but i still do not like the entitled tone. above anything, this is free and open source and relies on its community more than anything.

now, if you actually want to change the tone and proceed with troubleshooting, run with the --debug flag and upload a full log here, as i cannot see the params used when the oom happened.
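When comparing runs in a full log, the `memory={...}` field that SD.Next prints on `Load model` and `Processed` lines is the quickest thing to track. A minimal sketch of pulling it out follows; it assumes the field keeps the Python-dict shape seen in the log in this issue and that it is the last field on the line. `parse_memory` and `SAMPLE` are hypothetical names for illustration, not part of SD.Next.

```python
import ast
import re

# Matches the trailing memory={...} field; greedy so nested dicts are captured whole.
MEMORY_RE = re.compile(r"memory=(\{.*\})")


def parse_memory(line):
    """Return the memory dict from one log line, or None if the field is absent."""
    match = MEMORY_RE.search(line)
    if match is None:
        return None
    # The field is printed as a Python literal, so literal_eval parses it safely.
    return ast.literal_eval(match.group(1))


# Line copied from the log above (the run that hit the OOM)
SAMPLE = (
    "21:02:56-921019 INFO     Processed: images=0 its=0.00 time=3.82 "
    "timers={'gc': 0.36, 'encode': 0.54, 'args': 0.56, 'process': 3.23} "
    "memory={'ram': {'used': 3.62, 'total': 19.97}, "
    "'gpu': {'used': 2.89, 'total': 8.0}, 'retries': 1, 'oom': 1}"
)

stats = parse_memory(SAMPLE)
print(stats["gpu"]["used"], stats["gpu"]["total"], stats["oom"])  # 2.89 8.0 1
```

Note that the `gpu` figure here is what was in use when the line was logged, after cleanup, so it will not show the peak reached at the moment of the failed allocation.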