anapnoe / stable-diffusion-webui-ux

Stable Diffusion web UI UX
GNU Affero General Public License v3.0
991 stars 58 forks

[Bug]: Massively slower generation and higher vram usage in RTX 2060 Mobile compared to original webui #52

Closed. hollowstrawberry closed this issue 1 year ago

hollowstrawberry commented 1 year ago

Is there an existing issue for this?

What happened?

I am experiencing deal-breaking performance differences between the original webui and this fork. Here is my setup:

I ran the same image parameters on both clients multiple times. Results were:

stable-diffusion-webui:

stable-diffusion-webui-ux:

In one instance, stable-diffusion-webui-ux maxed out the VRAM and stalled for over 2 minutes during the hires fix pass. Time taken: 3m 11.70s

Steps to reproduce the problem

  1. Use described setup in the previous section
  2. Open stable-diffusion-webui and Ctrl+F5 the page
  3. Go to image browser, send the latest image parameters to txt2img
  4. Generate an image to make sure everything is loaded
  5. Generate the same image to benchmark, then the same image with hires fix to benchmark (Result A)
  6. Open stable-diffusion-webui-ux and Ctrl+F5 the page
  7. Repeat steps 3, 4 and 5 (Result B)

Result A: Generation takes 9-10 seconds, hires fix takes about 1 minute 1 second, and VRAM usage rises and falls at a healthy pace.

Result B: Generation takes 12-13 seconds, hires fix takes about 1 minute 40 seconds, and VRAM usage sometimes caps out at 6 GB, at which point hires fix gets stuck for several minutes.
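To put the two results side by side, here is a minimal sketch of the slowdown arithmetic. It assumes the midpoint of each reported range is representative (9.5 s and 12.5 s) and reads the hires fix times as 61 s and 100 s. The function and variable names are illustrative only.

```python
def slowdown_percent(baseline_s: float, observed_s: float) -> float:
    """Return how much slower `observed_s` is than `baseline_s`, in percent."""
    return (observed_s / baseline_s - 1.0) * 100.0


# Assumption: midpoints of the ranges reported in Result A and Result B.
webui_gen_s = (9 + 10) / 2    # stable-diffusion-webui, plain generation
ux_gen_s = (12 + 13) / 2      # stable-diffusion-webui-ux, plain generation

# Hires fix times, converted from "about 1:01" and "about 1:40" to seconds.
webui_hires_s = 61.0
ux_hires_s = 100.0

gen_slowdown = slowdown_percent(webui_gen_s, ux_gen_s)
hires_slowdown = slowdown_percent(webui_hires_s, ux_hires_s)

print(f"generation: {gen_slowdown:.1f}% slower")   # generation: 31.6% slower
print(f"hires fix:  {hires_slowdown:.1f}% slower") # hires fix:  63.9% slower
```

Under those assumptions, the fork is roughly a third slower for plain generation and roughly two thirds slower for hires fix, not counting the runs where it stalls.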

What should have happened?

Generation time and VRAM usage should be the same within margin of error, because there should be no reason for one client to use the graphics card differently from the other.

Commit where the problem happens

2ac62ce241396939c387a221e20b0a7a8c399b6f (stable-diffusion-webui-ux)

22bcc7be428c94e9408f589966c2040187245d81 (stable-diffusion-webui)

What platforms do you use to access the UI?

Windows

What browsers do you use to access the UI?

Mozilla Firefox

Command Line Arguments

set COMMANDLINE_ARGS=--medvram --xformers --listen --no-half-vae --allow-code --enable-insecure-extension-access --ckpt-dir ..\sd-models\Checkpoint --vae-dir ..\sd-models\VAE --lora-dir ..\sd-models\Lora --embeddings-dir ..\sd-models\TI
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.9,max_split_size_mb:512

List of extensions

| Extension | URL | Version | Update |
| -- | -- | -- | -- |
| a1111-sd-webui-tagcomplete | https://github.com/DominikDoom/a1111-sd-webui-tagcomplete.git | 223abf54 (Wed Apr 5 11:05:44 2023) | unknown |
| multidiffusion-upscaler-for-automatic1111 | https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111.git | 70ca3c77 (Wed Apr 5 10:57:07 2023) | unknown |
| sd-dynamic-prompts | https://github.com/adieyal/sd-dynamic-prompts.git | b16480e3 (Wed Apr 12 08:30:59 2023) | unknown |
| sd-webui-controlnet | https://github.com/Mikubill/sd-webui-controlnet.git | e1885108 (Wed Apr 12 03:24:32 2023) | unknown |
| sd-webui-model-converter | https://github.com/Akegarasu/sd-webui-model-converter.git | d19e2816 (Sun Mar 26 06:36:49 2023) | unknown |
| stable-diffusion-webui-images-browser | https://github.com/AlUlkesh/stable-diffusion-webui-images-browser.git | 1d5c2e75 (Tue Mar 28 13:19:52 2023) | unknown |
| ultimate-upscale-for-automatic1111 | https://github.com/Coyote-A/ultimate-upscale-for-automatic1111.git | 0a3d03a4 (Tue Feb 7 06:07:23 2023) | unknown |
| LDSR | [built-in](http://localhost:7860/) | | |
| Lora | [built-in](http://localhost:7860/) | | |
| ScuNET | [built-in](http://localhost:7860/) | | |
| SwinIR | [built-in](http://localhost:7860/) | | |
| prompt-bracket-checker | [built-in](http://localhost:7860/) | | |
| sd_theme_editor | [built-in](http://localhost:7860/) | | |

Console logs

```Shell
stable-diffusion-webui:

Already up to date.
venv "D:\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing requirements for Web UI
Launching Web UI with arguments: --medvram --xformers --listen --no-half-vae --allow-code --enable-insecure-extension-access --ckpt-dir ..\sd-models\Checkpoint --vae-dir ..\sd-models\VAE --lora-dir ..\sd-models\Lora --embeddings-dir ..\sd-models\TI
Additional Network extension not installed, Only hijack built-in lora
LoCon Extension hijack built-in lora successfully
Loading weights [c4506f615d] from ..\sd-models\Checkpoint\7thHeavenMix.safetensors
Creating model from config: D:\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: ..\sd-models\VAE\blessed2.vae.pt
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(29): bad-hands-5, EasyNegative, ti-assassin-yor-forger, ti-marin-kitagawa, ti-power, ti-yor-forger, ti-arm-covering-breasts, ti-arms-covering-breasts, ti-boobjob, ti-bound-missionary, ti-cowgirl-position, ti-hands-covering-breasts, ti-oral-pov, ti-oral-sideview, ti-2-hoshimachi-suisei, ti-darknesss-laplus, ti-hoshimachi-suisei, ti-houshou-marine, ti-inugami-korone, ti-nekomata-okayu, ti-ninomae-inanis, ti-oozora-subaru, ti-sakamata-chloe, ti-shishiro-botan, ti-uruha-rushia-black, ti-uruha-rushia-pink, ti-uruha-rushia-school, ti-uruha-rushia, ti-shylily
Model loaded in 2.5s (load weights from disk: 0.3s, create model: 0.6s, apply weights to model: 0.7s, apply half(): 0.4s, load VAE: 0.4s).
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 19.2s (import torch: 2.0s, import gradio: 1.4s, import ldm: 0.7s, other imports: 1.1s, setup codeformer: 0.2s, load scripts: 2.2s, load SD checkpoint: 2.7s, create ui: 4.7s, gradio launch: 4.2s).
locon load lora method
100%|██████████████████████████████████████████████| 30/30 [00:11<00:00, 2.51it/s]
Total progress: 30it [00:25, 1.17it/s]
100%|██████████████████████████████████████████████| 30/30 [00:07<00:00, 4.12it/s]
100%|██████████████████████████████████████████████| 10/10 [00:16<00:00, 1.65s/it]
Total progress: 100%|██████████████████████████████| 40/40 [01:05<00:00, 1.63s/it]

stable-diffusion-webui-ux:

venv "D:\stable-diffusion-webui-ux\venv\Scripts\Python.exe"
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Commit hash: 2ac62ce241396939c387a221e20b0a7a8c399b6f
Installing requirements for Web UI
Installing sd-dynamic-prompts requirements.txt
Launching Web UI with arguments: --medvram --xformers --listen --no-half-vae --allow-code --enable-insecure-extension-access --ckpt-dir ..\sd-models\Checkpoint --vae-dir ..\sd-models\VAE --lora-dir ..\sd-models\Lora --embeddings-dir ..\sd-models\TI
Loading weights [c4506f615d] from ..\sd-models\Checkpoint\7thHeavenMix.safetensors
Creating model from config: D:\stable-diffusion-webui-ux\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: ..\sd-models\VAE\blessed2.vae.pt
Applying xformers cross attention optimization.
Textual inversion embeddings loaded(29): bad-hands-5, EasyNegative, ti-assassin-yor-forger, ti-marin-kitagawa, ti-power, ti-yor-forger, ti-arm-covering-breasts, ti-arms-covering-breasts, ti-boobjob, ti-bound-missionary, ti-cowgirl-position, ti-hands-covering-breasts, ti-oral-pov, ti-oral-sideview, ti-2-hoshimachi-suisei, ti-darknesss-laplus, ti-hoshimachi-suisei, ti-houshou-marine, ti-inugami-korone, ti-nekomata-okayu, ti-ninomae-inanis, ti-oozora-subaru, ti-sakamata-chloe, ti-shishiro-botan, ti-uruha-rushia-black, ti-uruha-rushia-pink, ti-uruha-rushia-school, ti-uruha-rushia, ti-shylily
Model loaded in 2.0s (create model: 0.6s, apply weights to model: 0.6s, apply half(): 0.3s, load VAE: 0.3s).
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 18.5s (import gradio: 2.5s, import ldm: 0.9s, other imports: 1.8s, list extensions: 0.8s, load scripts: 2.0s, load SD checkpoint: 2.2s, create ui: 3.6s, gradio launch: 4.7s).
100%|██████████████████████████████████████████████| 30/30 [00:14<00:00, 2.13it/s]
Total progress: 100%|██████████████████████████████| 30/30 [00:16<00:00, 1.81it/s]
100%|██████████████████████████████████████████████| 30/30 [00:10<00:00, 2.82it/s]
Total progress: 100%|██████████████████████████████| 30/30 [00:11<00:00, 2.60it/s]
100%|██████████████████████████████████████████████| 30/30 [00:10<00:00, 2.82it/s]
100%|██████████████████████████████████████████████| 10/10 [02:15<00:00, 13.55s/it]
Total progress: 100%|██████████████████████████████| 40/40 [03:13<00:00, 4.84s/it]
Total progress: 100%|██████████████████████████████| 40/40 [03:13<00:00, 2.88s/it]
```

Additional information

No response

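
The progress lines in the console logs can also be compared per iteration, which separates the hires fix pass from the first pass. This is a rough sketch, not part of either web UI: `RATE_RE` and `seconds_per_iteration` are hypothetical helpers, the progress bars are abbreviated, and the regular expression only handles the two rate formats that appear in these logs (`it/s` and `s/it`).

```python
import re

# Matches the rate at the end of a progress line, e.g. "2.51it/s]" or "1.65s/it]".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)(it/s|s/it)\]")


def seconds_per_iteration(line: str) -> float:
    """Extract the reported rate from a progress line and normalise it to s/it."""
    match = RATE_RE.search(line)
    if match is None:
        raise ValueError(f"no rate found in line: {line!r}")
    value, unit = float(match.group(1)), match.group(2)
    # "s/it" is already seconds per iteration; "it/s" has to be inverted.
    return value if unit == "s/it" else 1.0 / value


# Hires fix lines (10 steps) copied from the logs, with the bar shortened.
webui_hires = "100%|███| 10/10 [00:16<00:00, 1.65s/it]"
ux_hires = "100%|███| 10/10 [02:15<00:00, 13.55s/it]"

ratio = seconds_per_iteration(ux_hires) / seconds_per_iteration(webui_hires)
print(f"hires fix: {ratio:.1f}x slower per iteration")  # hires fix: 8.2x slower per iteration
```

The 13.55s/it figure comes from the run that stalled, so this ratio describes the worst case in the logs and not the typical 1:40 result.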
anapnoe commented 1 year ago

The ui-ux uses commit a9eab23 from the original repository, and the performance is the same. I haven't upgraded to the latest commit yet because I consider it unstable, but I will soon upload the latest commit to the dev branch so people can try it out. It will not be on the main branch, as it still has a lot of issues.