[Issue]: What default settings does it use that demand 14 GB of VRAM for an SD1.5 model?

Issue Description

I installed it via Pinokio, and running anything above 512x512 at batch size 1 runs out of memory (OOM) on my 8 GB RTX 2070 Super. I never had anything like this with other SD software. Even enabling Nvidia's system memory fallback doesn't help; this is ridiculous. How do I configure it so that it doesn't perform worse than the others? I went for SD.Next hoping I'd be able to run modern models that Forge can't, and here I'm left unable to use even the oldest one. There are far too many settings for me to brute-force through their combinations on my own.
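
For a sense of why resolution matters so much, here is a rough back-of-the-envelope sketch. It assumes the attention kernel materializes the full score matrix at the highest-resolution UNet layer (a fused kernel would not), that elements are 2 bytes (bf16), and that there are 16 score matrices in flight (for example 2 sequences × 8 heads). The function name and the default of 16 are illustrative assumptions, not values read from SD.Next.

```python
def attention_scores_gib(width, height, batch_heads=16, bytes_per_element=2):
    """Size in GiB of a fully materialized self-attention score tensor."""
    # SD 1.x latents are 1/8 of the image resolution in each dimension.
    tokens = (width // 8) * (height // 8)
    # One tokens x tokens score matrix per (sequence, head) pair.
    total_bytes = tokens * tokens * batch_heads * bytes_per_element
    return total_bytes / 2**30


# The cost grows with the fourth power of the image side length.
for side in (512, 640, 768):
    print(f"{side}x{side}: {attention_scores_gib(side, side):.2f} GiB")
```

Under these assumptions 640x640 comes out at about 1.22 GiB, the same size as the allocation that fails in the log below. That is a guess from the numbers, not something the log confirms, but it would be consistent with attention running on a path that materializes the scores instead of a memory-efficient kernel.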

Version Platform Description

Win10 x64, 20 GB RAM.

Starting SD.Next
Logger: file="f:\pinokio\api\SD-Next.git\app\sdnext.log" level=INFO size=97458 mode=append
Python: version=3.10.15 platform=Windows bin="f:\pinokio\api\SD-Next.git\app\venv\Scripts\python.exe" venv="f:\pinokio\api\SD-Next.git\app\venv"
Version: app=sd.next updated=2024-11-02 hash=65ddc611 branch=master url=https://github.com/vladmandic/automatic/tree/master ui=main
Platform: arch=AMD64 cpu=Intel64 Family 6 Model 42 Stepping 7, GenuineIntel system=Windows release=Windows-10-10.0.19045-SP0 python=3.10.15
Args: ['--use-cuda']
CUDA: nVidia toolkit detected
Verifying requirements
Verifying packages
Extensions: disabled=[]
Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg'] extensions-builtin
Extensions: enabled=[] extensions
Startup: quick launch
Extensions: disabled=[]
Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg'] extensions-builtin
Extensions: enabled=[] extensions
Command line args: ['--use-cuda'] use_cuda=True
System packages: {'torch': '2.5.1+cu124', 'diffusers': '0.32.0.dev0', 'gradio': '3.43.2', 'transformers': '4.46.1', 'accelerate': '1.0.1'}
Device detect: memory=8.0 optimization=medvram
Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product" mode=no_grad
Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16 unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upscast=False deterministic=False test-fp16=True test-bf16=True optimization="Scaled-Dot-Product"
Device: device=NVIDIA GeForce RTX 2070 SUPER n=1 arch=sm_90 capability=(7, 5) cuda=12.4 cudnn=90100 driver=560.81
Available VAEs: path="models\VAE" items=0
Available UNets: path="models\UNET" items=0
Available TEs: path="models\Text-encoder" items=0
Disabled extensions: ['sdnext-modernui']
Available Models: path="models\Stable-diffusion" items=30 time=0.02
Available Yolo: path="models\yolo" items=6 downloaded=0
Available LoRAs: path="models\Lora" items=0 folders=2 time=0.00
Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py'
Using sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
Available Upscalers: items=53 downloaded=0 user=0 time=0.04 types=['None', 'Lanczos', 'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
Available Styles: folder="models\styles" items=288 time=0.03
UI theme: type=Standard name="black-teal"
Extension list is empty: refresh required
Local URL: http://127.0.0.1:7860/
[AgentScheduler] Task queue is empty
[AgentScheduler] Registering APIs
Load model: select="stable-diffusion!sd-1-4 [764ebd128c]"
Autodetect model: detect="Stable Diffusion" class=StableDiffusionPipeline file="f:\pinokio\api\SD-Next.git\app\models\Stable-diffusion\stable-diffusion!sd-1-4.safetensors" size=2034MB
Diffusers 3.19it/s ████ 50% 3/6 00:00 00:00 Loading pipeline components...
Diffusers 1.30s/it ████████ 100% 6/6 00:07 00:00 Loading pipeline components...

Relevant log output

20:58:54-790835 INFO     Load network: type=embeddings loaded=0 skipped=0 time=0.00
20:58:55-128450 INFO     Load model: time=8.19 load=8.13 native=512 memory={'ram': {'used': 2.98, 'total': 19.97}, 'gpu': {'used': 1.06, 'total': 8.0}, 'retries': 0, 'oom': 0}
20:58:55-128450 INFO     Startup time: 27.43 torch=8.02 onnx=0.05 gradio=2.05 diffusers=0.23 libraries=2.55 samplers=0.05 extensions=1.34 detailer=0.13 ui-networks=0.36 ui-txt2img=0.36 ui-img2img=0.20 ui-control=0.38 ui-extras=0.10 ui-models=0.12 ui-gallery=0.08 ui-settings=0.88 ui-extensions=0.58 ui-defaults=0.38 launch=0.28 api=0.27 app-started=0.41 checkpoint=8.48
20:59:39-234929 INFO     MOTD: N/A
20:59:45-526194 INFO     Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Pinokio/2.15.1 Chrome/110.0.5481.208 Electron/23.3.13 Safari/537.36

21:01:50-947545 INFO     Base: class=StableDiffusionPipeline
Progress  1.20s/it █████████████████████████████████ 100% 20/20 00:23 00:00 Base
21:02:17-074698 INFO     Processed: images=0 its=0.00 time=26.14 timers={'encode': 1.04, 'args': 1.17, 'pipeline': 24.91, 'process': 0.02} memory={'ram': {'used': 3.63, 'total': 19.97}, 'gpu': {'used': 1.19, 'total': 8.0}, 'retries': 0, 'oom': 0}  

21:02:53-106999 INFO     Base: class=StableDiffusionPipeline
Progress ?it/s                                              0% 0/20 00:00 ? Base
21:02:54-570948 ERROR    Processing: step=base args={'prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([2, 77, 768])', 'negative_prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([2, 77, 768])', 'guidance_scale': 6, 'generator': [<torch._C.Generator object at 0x000001CF3F544A90>, <torch._C.Generator object at 0x000001CF3F5441F0>], 'callback_on_step_end': <function diffusers_callback at 0x000001CF74877640>, 'callback_on_step_end_tensor_inputs': ['latents', 'prompt_embeds', 'negative_prompt_embeds'], 'num_inference_steps': 20, 'eta': 1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 640, 'height': 640} CUDA out of memory. Tried to allocate 1.22 GiB. GPU 0 has a total capacity of 8.00 GiB of which 219.00 MiB is free. Of the allocated memory 6.65 GiB is allocated by PyTorch, and 54.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
21:02:54-578845 ERROR    Processing: OutOfMemoryError
╭───────────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ F:\pinokio\api\SD-Next.git\app\modules\processing_diffusers.py:99 in process_base                                                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│    98 │   │   else:                                                                                                                                                                                                                                  │
│ ❱  99 │   │   │   output = shared.sd_model(**base_args)                                                                                                                                                                                              │
│   100 │   │   if isinstance(output, dict):                                                                                                                                                                                                           │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│   115 │   │   with ctx_factory():                                                                                                                                                                                                                    │
│ ❱ 116 │   │   │   return func(*args, **kwargs)                                                                                                                                                                                                       │
│   117                                                                                                                                                                                                                                                │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py:1020 in __call__                                                                                                             │
│                                                                                                                                                                                                                                                      │
│   1019 │   │   │   │   # predict the noise residual                                                                                                                                                                                                  │
│ ❱ 1020 │   │   │   │   noise_pred = self.unet(                                                                                                                                                                                                       │
│   1021 │   │   │   │   │   latent_model_input,                                                                                                                                                                                                       │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\accelerate\hooks.py:170 in new_forward                                                                                                                                                         │
│                                                                                                                                                                                                                                                      │
│   169 │   │   else:                                                                                                                                                                                                                                  │
│ ❱ 170 │   │   │   output = module._old_forward(*args, **kwargs)                                                                                                                                                                                      │
│   171 │   │   return module._hf_hook.post_forward(module, output)                                                                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\unets\unet_2d_condition.py:1216 in forward                                                                                                                                    │
│                                                                                                                                                                                                                                                      │
│   1215 │   │   │   │                                                                                                                                                                                                                                 │
│ ❱ 1216 │   │   │   │   sample, res_samples = downsample_block(                                                                                                                                                                                       │
│   1217 │   │   │   │   │   hidden_states=sample,                                                                                                                                                                                                     │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│                                                                                                               ... 4 frames hidden ...                                                                                                                │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\transformers\transformer_2d.py:442 in forward                                                                                                                                 │
│                                                                                                                                                                                                                                                      │
│   441 │   │   │   else:                                                                                                                                                                                                                              │
│ ❱ 442 │   │   │   │   hidden_states = block(                                                                                                                                                                                                         │
│   443 │   │   │   │   │   hidden_states,                                                                                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention.py:507 in forward                                                                                                                                                   │
│                                                                                                                                                                                                                                                      │
│    506 │   │                                                                                                                                                                                                                                         │
│ ❱  507 │   │   attn_output = self.attn1(                                                                                                                                                                                                             │
│    508 │   │   │   norm_hidden_states,                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl                                                                                                                                          │
│                                                                                                                                                                                                                                                      │
│   1735 │   │   else:                                                                                                                                                                                                                                 │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                                                                                                                           │
│   1737                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl                                                                                                                                                  │
│                                                                                                                                                                                                                                                      │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                                                                                                                       │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                                                                                                                              │
│   1748                                                                                                                                                                                                                                               │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention_processor.py:495 in forward                                                                                                                                         │
│                                                                                                                                                                                                                                                      │
│    494 │   │                                                                                                                                                                                                                                         │
│ ❱  495 │   │   return self.processor(                                                                                                                                                                                                                │
│    496 │   │   │   self,                                                                                                                                                                                                                             │
│                                                                                                                                                                                                                                                      │
│ f:\pinokio\api\SD-Next.git\app\venv\lib\site-packages\diffusers\models\attention_processor.py:2477 in __call__                                                                                                                                       │
│                                                                                                                                                                                                                                                      │
│   2476 │   │   # TODO: add support for attn.scale when we move to Torch 2.1                                                                                                                                                                          │
│ ❱ 2477 │   │   hidden_states = F.scaled_dot_product_attention(                                                                                                                                                                                       │
│   2478 │   │   │   query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False                                                                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.22 GiB. GPU 0 has a total capacity of 8.00 GiB of which 219.00 MiB is free. Of the allocated memory 6.65 GiB is allocated by PyTorch, and 54.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
21:02:56-921019 INFO     Processed: images=0 its=0.00 time=3.82 timers={'gc': 0.36, 'encode': 0.54, 'args': 0.56, 'process': 3.23} memory={'ram': {'used': 3.62, 'total': 19.97}, 'gpu': {'used': 2.89, 'total': 8.0}, 'retries': 1, 'oom': 1}
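
As a rough way to read the numbers in the error above, the sketch below pulls the sizes out of the OOM message and checks whether the fragmentation hint could plausibly apply. The helper names (`parse_cuda_oom`, `fragmentation_could_explain`) are hypothetical and the check is only a heuristic: it assumes that if free memory plus reserved-but-unallocated memory is still smaller than the request, then `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` alone is unlikely to fix it.

```python
import re

UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
SIZE = r"([\d.]+) (KiB|MiB|GiB)"

# Message text copied from the log above (joined onto one line).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.22 GiB. GPU 0 has a total "
    "capacity of 8.00 GiB of which 219.00 MiB is free. Of the allocated "
    "memory 6.65 GiB is allocated by PyTorch, and 54.79 MiB is reserved "
    "by PyTorch but unallocated."
)


def parse_cuda_oom(message):
    """Extract the byte counts reported in a PyTorch CUDA OOM message."""
    patterns = {
        "requested": rf"Tried to allocate {SIZE}",
        "total": rf"total capacity of {SIZE}",
        "free": rf"of which {SIZE} is free",
        "allocated": rf"{SIZE} is allocated by PyTorch",
        "reserved_unallocated": rf"{SIZE} is reserved by PyTorch but unallocated",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            value, unit = match.groups()
            stats[name] = float(value) * UNITS[unit]
    return stats


def fragmentation_could_explain(stats):
    """Heuristic: was there enough unused memory in total to fit the request?"""
    slack = stats["free"] + stats["reserved_unallocated"]
    return slack >= stats["requested"]


stats = parse_cuda_oom(OOM_MESSAGE)
print({name: f"{value / 2**30:.2f} GiB" for name, value in stats.items()})
print("fragmentation could explain it:", fragmentation_could_explain(stats))
```

For this message the unused memory (219.00 MiB free plus 54.79 MiB reserved but unallocated) is well below the 1.22 GiB request, so by this heuristic the failure looks like a genuinely oversized allocation rather than fragmentation.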

Backend

Diffusers

UI

Standard

Branch

Master

Model

StableDiffusion 1.5

Acknowledgements

[X] I have read the above and searched for existing issues
[X] I confirm that this is classified correctly and it's not an extension issue

vladmandic / automatic