lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

[Bug]: Forge UI runs extremely slowly if Silly Tavern is also running at the same time #741

Open guispfilho opened 4 months ago

guispfilho commented 4 months ago

Checklist

What happened?

I just installed Forge UI, and it's running smoothly, and it stays like that if I run Oobabooga text UI at the same time. However, if I run Silly Tavern at the same time, the time to generate a single image goes from 10 seconds to 10-15 minutes. I had to alter the COMMANDLINE_ARGS in the 'webui-user.bat' file because Forge's API needs to be enabled for Silly Tavern to access it, and because Oobabooga also uses port 7860, I had to move Forge to a different port. I selected 7862 for no particular reason: `set COMMANDLINE_ARGS= --api --port 7862`
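For reference, the relevant part of `webui-user.bat` would look something like this (a sketch of the stock launcher file; 7862 is an arbitrary free port):

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
rem --api exposes /sdapi/v1/* for Silly Tavern; --port avoids clashing with Oobabooga on 7860
set COMMANDLINE_ARGS=--api --port 7862

call webui.bat
```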

Edit: It seems that generation also gets extremely slow when Oobabooga is running, even though Silly Tavern is not running.

Steps to reproduce the problem

1. Run Forge UI
2. Generate an image directly through Forge UI; it completes in seconds
3. Run Silly Tavern
4. DON'T connect Silly Tavern and Forge via http://localhost:7860
5. Generate a new image directly through Forge UI, without altering any settings; it still completes in seconds
6. CONNECT Silly Tavern and Forge via http://localhost:7860
7. Generate a new image directly through Forge UI, without altering any settings; it now takes 10-15 minutes
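The timing difference between steps 5 and 7 can also be measured without either UI, by hitting Forge's txt2img API directly. This is a minimal sketch, assuming Forge is on port 7862 as in the COMMANDLINE_ARGS above; the prompt and step count are placeholders:

```python
import json
import time
import urllib.request

FORGE_URL = "http://127.0.0.1:7862"  # port set via --port in webui-user.bat


def build_payload(prompt: str, steps: int = 4) -> dict:
    """Minimal txt2img request body; Forge fills in defaults for the rest."""
    return {"prompt": prompt, "steps": steps, "width": 1024, "height": 1024}


def timed_txt2img(prompt: str) -> float:
    """POST to /sdapi/v1/txt2img and return wall-clock seconds for one image."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{FORGE_URL}/sdapi/v1/txt2img",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        json.load(resp)  # response carries base64 images; only the timing matters here
    return time.monotonic() - start


# Example usage (requires a running Forge instance):
#   print(f"generation took {timed_txt2img('a test image'):.1f}s")
```

Running this once before and once after connecting Silly Tavern would show whether the slowdown happens on Forge's side of the API, independent of the browsers involved.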

What should have happened?

I assume that image generation should have taken almost the same time, maybe a few seconds slower, but not 10-15 minutes for a single image. It seems that something is wrong with the local connection between ST and Forge.

What browsers do you use to access the UI ?

Microsoft Edge

Sysinfo

sysinfo-2024-05-15-04-20.json

Console logs

venv "D:\app\stable-diffusion-webui-forge\venv\Scripts\Python.exe"
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: f0.0.17v1.8.0rc-latest-276-g29be1da7
Commit hash: 29be1da7cf2b5dccfc70fbdd33eb35c56a31ffb7
Launching Web UI with arguments: --api --port 7862
Total VRAM 12282 MB, total RAM 31898 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4070 : native
Hint: your device supports --pin-shared-memory for potential speed improvements.
Hint: your device supports --cuda-malloc for potential speed improvements.
Hint: your device supports --cuda-stream for potential speed improvements.
VAE dtype: torch.bfloat16
CUDA Stream Activated:  False
Using pytorch cross attention
ControlNet preprocessor location: D:\app\stable-diffusion-webui-forge\models\ControlNetPreprocessor
Loading weights [8f463bd4ca] from D:\app\stable-diffusion-webui-forge\models\Stable-diffusion\smoothcutsLightning33STEPS_v01Lightning3steps.safetensors
2024-05-15 01:10:15,848 - ControlNet - INFO - ControlNet UI callback registered.
Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Startup time: 10.9s (prepare environment: 2.7s, import torch: 3.0s, import gradio: 0.7s, setup paths: 0.8s, other imports: 0.6s, load scripts: 1.2s, create ui: 0.5s, gradio launch: 0.3s, add APIs: 1.0s).
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
left over keys: dict_keys(['denoiser.sigmas'])
Loading VAE weights specified in settings: D:\app\stable-diffusion-webui-forge\models\VAE\sdxl_vae.safetensors
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  11093.99609375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  7925.641395568848
Moving model(s) has taken 0.70 seconds
Model loaded in 6.3s (load weights from disk: 0.6s, forge load real models: 4.3s, load VAE: 0.5s, calculate empty prompt: 0.9s).
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  9282.69482421875
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3361.608329772949
Moving model(s) has taken 1.84 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.13s/it]
To load target model AutoencoderKL███████████████████████████████████████████████████████| 4/4 [00:03<00:00,  1.03it/s]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6016.22412109375
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  4832.667039871216
Moving model(s) has taken 0.88 seconds
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [00:06<00:00,  1.66s/it]
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.18s/it]
To load target model AutoencoderKL███████████████████████████████████████████████████████| 4/4 [00:03<00:00,  1.05s/it]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  10975.248046875
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  9791.690965652466
Moving model(s) has taken 1.94 seconds
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00,  1.96s/it]
To load target model SDXLClipModel███████████████████████████████████████████████████████| 4/4 [00:07<00:00,  1.05s/it]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  10833.19287109375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  7664.838172912598
Moving model(s) has taken 0.74 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  9062.5341796875
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3141.447685241699
Moving model(s) has taken 2.29 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [05:54<00:00, 88.70s/it]
Memory cleanup has taken 2.66 seconds████████████████████████████████████████████████████| 4/4 [05:11<00:00, 86.96s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [05:15<00:00, 78.89s/it]
To load target model SDXL████████████████████████████████████████████████████████████████| 4/4 [05:15<00:00, 86.96s/it]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  10818.92919921875
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  4897.842704772949
Moving model(s) has taken 1.83 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.40it/s]
Memory cleanup has taken 1.85 seconds████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.45it/s]
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [00:05<00:00,  1.29s/it]
To load target model SDXL████████████████████████████████████████████████████████████████| 4/4 [00:05<00:00,  1.45it/s]
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  10818.6796875
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  4897.593193054199
Moving model(s) has taken 1.84 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.81it/s]
Memory cleanup has taken 1.70 seconds████████████████████████████████████████████████████| 4/4 [00:01<00:00,  1.99it/s]
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.08s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████████| 4/4 [00:04<00:00,  1.99it/s]

Additional information

No response

CasanovaSan commented 4 months ago

Weird, this doesn't happen for me. Are you using Anaconda or something else rather than plain old Python and Node.js?

In my case, when I have both of them up, an image generation takes around 3 minutes (PonyXL model, refiner + highres + adetailer), and it takes 3 minutes without Silly as well.

Also, I have an idea for a temporary fix for you: install the Forge package https://github.com/lllyasviel/stable-diffusion-webui-forge/releases/download/latest/webui_forge_cu121_torch21.7z. If this is also slow with Silly open, then I have no idea what you can do.