lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

[Bug]: Unable to load multiple checkpoints at the same time #445

Open player99963 opened 6 months ago

player99963 commented 6 months ago


What happened?

Unable to load multiple checkpoints at the same time.

Steps to reproduce the problem

  1. Open Settings > Stable Diffusion.
  2. (screenshot attachment: "擷取" / "Capture")
  3. Select two checkpoints for testing.
  4. Switch between the two checkpoints and generate, or select both checkpoints in an X/Y plot and generate.
  5. After both checkpoints have been loaded, observe whether the other checkpoint has to be reloaded when you generate with it.
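The checkpoint switch in step 4 can also be driven through the web UI's API instead of the dropdown, which makes the repro scriptable. This is a sketch against the standard `/sdapi/v1/options` endpoint; the base URL is the one printed at startup in this report, and the server must have been launched with `--api`:

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:7860"  # local URL printed at startup

def option_payload(checkpoint_name: str) -> dict:
    """Build the options payload that switches the active checkpoint."""
    return {"sd_model_checkpoint": checkpoint_name}

def set_checkpoint(name: str) -> None:
    """POST the payload to the running web UI; calling this repeatedly
    with alternating names reproduces the reload described above."""
    req = request.Request(
        f"{BASE_URL}/sdapi/v1/options",
        data=json.dumps(option_payload(name)).encode(),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)
```

Alternating `set_checkpoint` between the two test checkpoints should, with working caching, only hit the disk on the first load of each.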

What should have happened?

Once both checkpoints have been loaded, it should normally be possible to generate with either of them immediately; in SD Forge, however, switching back forces the other checkpoint to be reloaded each time.
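For context, the expected behavior matches a checkpoint cache that keeps up to N loaded models in memory and evicts the least recently used one. A minimal sketch of that policy (the class and names here are illustrative, not Forge's actual internals):

```python
from collections import OrderedDict

class CheckpointCache:
    """Keep up to `limit` loaded checkpoints; evict the least recently used."""

    def __init__(self, limit: int = 2):
        self.limit = limit
        self._cache: "OrderedDict[str, object]" = OrderedDict()

    def get(self, name: str, load_fn):
        if name in self._cache:
            self._cache.move_to_end(name)    # cache hit: no reload needed
            return self._cache[name]
        model = load_fn(name)                # cache miss: load from disk
        self._cache[name] = model
        if len(self._cache) > self.limit:
            self._cache.popitem(last=False)  # evict least recently used
        return model
```

With `limit >= 2`, alternating between two checkpoints loads each from disk only once; the behavior reported here is equivalent to `limit = 1`.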

What browsers do you use to access the UI ?

Microsoft Edge

Sysinfo

sysinfo-2024-02-29-05-02.json

Console logs

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f0.0.16v1.8.0rc-latest-268-gb59deaa3
Commit hash: b59deaa382bf5c968419eff4559f7d06fc0e76e7
Launching Web UI with arguments:
Total VRAM 24564 MB, total RAM 65277 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : native
Hint: your device supports --pin-shared-memory for potential speed improvements.
Hint: your device supports --cuda-malloc for potential speed improvements.
Hint: your device supports --cuda-stream for potential speed improvements.
VAE dtype: torch.bfloat16
CUDA Stream Activated:  False
Using pytorch cross attention
*** "Disable all extensions" option was set, will not load any extensions ***
Loading weights [14c3c10fe2] from D:\webui_forge_cu121_torch21\webui\models\Stable-diffusion\animeConfettiComrade_v2.safetensors
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 5.6s (prepare environment: 1.2s, import torch: 1.9s, import gradio: 0.5s, setup paths: 0.4s, other imports: 0.3s, load scripts: 0.4s, create ui: 0.2s, gradio launch: 0.7s).
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['alphas_cumprod', 'alphas_cumprod_prev', 'betas', 'log_one_minus_alphas_cumprod', 'posterior_log_variance_clipped', 'posterior_mean_coef1', 'posterior_mean_coef2', 'posterior_variance', 'sqrt_alphas_cumprod', 'sqrt_one_minus_alphas_cumprod', 'sqrt_recip_alphas_cumprod', 'sqrt_recipm1_alphas_cumprod'])
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  22993.99609375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  19825.641395568848
Moving model(s) has taken 0.24 seconds
Model loaded in 3.6s (load weights from disk: 0.2s, forge instantiate config: 0.7s, forge load real models: 2.2s, load VAE: 0.2s, calculate empty prompt: 0.4s).
X/Y/Z plot will create 2 images on 1 2x1 grid. (Total steps to process: 10)
Loading weights [f2e68c2a60] from D:\webui_forge_cu121_torch21\webui\models\Stable-diffusion\animeConfettiComrade_v10.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  22913.265625
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  19744.910926818848
Moving model(s) has taken 0.54 seconds
Model loaded in 5.2s (unload existing model: 0.9s, load weights from disk: 0.1s, forge instantiate config: 1.1s, forge load real models: 2.4s, calculate empty prompt: 0.6s).
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  21145.28271484375
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  15224.19622039795
Moving model(s) has taken 1.19 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00,  8.41it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  16139.46923828125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  14955.912157058716
Moving model(s) has taken 0.04 seconds
Loading weights [14c3c10fe2] from D:\webui_forge_cu121_torch21\webui\models\Stable-diffusion\animeConfettiComrade_v2.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['alphas_cumprod', 'alphas_cumprod_prev', 'betas', 'log_one_minus_alphas_cumprod', 'posterior_log_variance_clipped', 'posterior_mean_coef1', 'posterior_mean_coef2', 'posterior_variance', 'sqrt_alphas_cumprod', 'sqrt_one_minus_alphas_cumprod', 'sqrt_recip_alphas_cumprod', 'sqrt_recipm1_alphas_cumprod'])
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  22887.8740234375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  19719.519325256348
Moving model(s) has taken 0.69 seconds
Model loaded in 6.0s (unload existing model: 1.8s, forge instantiate config: 1.2s, forge load real models: 2.0s, load VAE: 0.2s, calculate empty prompt: 0.7s).
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  21121.22021484375
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  15200.13372039795
Moving model(s) has taken 1.20 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 17.08it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  16131.46923828125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  14947.912157058716
Moving model(s) has taken 0.04 seconds
Total progress: 100%|██████████████████████████████████████████████████████████████████| 10/10 [00:15<00:00,  1.53s/it]
X/Y/Z plot will create 2 images on 1 2x1 grid. (Total steps to process: 10)
Loading weights [f2e68c2a60] from D:\webui_forge_cu121_torch21\webui\models\Stable-diffusion\animeConfettiComrade_v10.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  22887.9365234375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  19719.581825256348
Moving model(s) has taken 0.61 seconds
Model loaded in 6.3s (unload existing model: 2.3s, forge instantiate config: 1.1s, forge load real models: 2.2s, calculate empty prompt: 0.6s).
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  21121.28271484375
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  15200.19622039795
Moving model(s) has taken 0.67 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 24.12it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  16131.46923828125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  14947.912157058716
Moving model(s) has taken 0.03 seconds
Loading weights [14c3c10fe2] from D:\webui_forge_cu121_torch21\webui\models\Stable-diffusion\animeConfettiComrade_v2.safetensors
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['alphas_cumprod', 'alphas_cumprod_prev', 'betas', 'log_one_minus_alphas_cumprod', 'posterior_log_variance_clipped', 'posterior_mean_coef1', 'posterior_mean_coef2', 'posterior_variance', 'sqrt_alphas_cumprod', 'sqrt_one_minus_alphas_cumprod', 'sqrt_recip_alphas_cumprod', 'sqrt_recipm1_alphas_cumprod'])
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  22887.8740234375
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  19719.519325256348
Moving model(s) has taken 0.27 seconds
Model loaded in 3.5s (unload existing model: 1.0s, forge instantiate config: 0.5s, forge load real models: 1.5s, load VAE: 0.2s, calculate empty prompt: 0.3s).
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  21121.22021484375
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  15200.13372039795
Moving model(s) has taken 0.77 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 24.16it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  16131.46923828125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  14947.912157058716
Moving model(s) has taken 0.03 seconds
Total progress: 100%|██████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00,  1.23s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00,  1.06it/s]

Additional information

No response

philvitthotmail commented 5 months ago

Same issue here. In my case, using --always-high-vram with my 4090 helped: I can run two SDXL models without one of them getting offloaded every time.
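For anyone else wanting to try the same workaround: assuming the default Windows package launched through webui-user.bat (as in the paths above), the flag would go into the launch arguments like this (a sketch, not an official recommendation):

```bat
rem webui-user.bat -- pass the flag suggested above to keep models resident in VRAM
set COMMANDLINE_ARGS=--always-high-vram
```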

player99963 commented 5 months ago

> Same issue here. In my case using --always-high-vram with my 4090 helped. I can run 2 SDXL models without one of them getting offloaded everytime.

This method doesn't seem to work for me.

Bobo2929 commented 2 months ago

> This method doesn't seem to work for me.

I'm also having this issue. Two 1.5 checkpoints, or one 1.5 and one XL checkpoint, work perfectly well on A1111 and ComfyUI, but not on Forge. I tried this method and it didn't work here either.