lllyasviel / stable-diffusion-webui-forge


[Bug]: Every time the stable diffusion model is switched, the memory usage increases by 0.2GB. #820

Open · s4130 opened this issue 1 week ago

s4130 commented 1 week ago

Checklist

What happened?

Sorry for my poor English; this is translated text. Every time the Stable Diffusion model is switched, memory usage increases by approximately 0.2 GB. This is especially noticeable with the Refiner function: using it a few times causes an OOM error. Below is my memory usage right after startup (screenshot 833fe63d7c722835d8704ec740d9ce2) and after switching models 6 times (screenshot a2b4f4a14319d4397166f03c5b708c6).
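The numbers in the screenshots are the web UI process's memory usage as shown by the task manager. An illustrative sketch of reading the same figure programmatically, assuming psutil is available in the environment (the web UI already depends on it); the PID below is a placeholder:

```python
# Illustrative sketch: track the ~0.2 GB growth per switch by reading the
# resident set size of the running web UI process.
# psutil is assumed to be installed; 12345 is a placeholder PID.
import psutil

def webui_rss_gb(pid: int) -> float:
    """Resident memory of the given process, in GB."""
    return psutil.Process(pid).memory_info().rss / 1024 ** 3

if __name__ == "__main__":
    print(f"web UI RSS: {webui_rss_gb(12345):.2f} GB")  # replace 12345 with the real PID
```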

Steps to reproduce the problem

Switch the Stable Diffusion checkpoint; the memory usage grows every time the operation is repeated.
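A hypothetical way to automate the reproduction, assuming the web UI is launched with --api and that this Forge build exposes the upstream /sdapi/v1/options and /sdapi/v1/memory routes (the endpoint names and response fields are assumptions; adjust if they differ):

```python
# Hypothetical reproduction script: switch checkpoints repeatedly through the
# web UI API and print the reported RAM usage after each switch.
import time
import requests

BASE = "http://127.0.0.1:7860"
# Placeholder checkpoint names taken from the console log; use the titles
# shown in your own checkpoint dropdown.
CHECKPOINTS = ["hajimi_cute.safetensors", "t3_Ver14-fp16-no-ema.safetensors"]

for i in range(6):
    ckpt = CHECKPOINTS[i % len(CHECKPOINTS)]
    requests.post(f"{BASE}/sdapi/v1/options",
                  json={"sd_model_checkpoint": ckpt}, timeout=600)
    time.sleep(5)  # let the loader settle before sampling memory
    mem = requests.get(f"{BASE}/sdapi/v1/memory", timeout=60).json()
    print(f"switch {i + 1} ({ckpt}): {mem.get('ram')}")
```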

What should have happened?

I think the memory usage should stay around 0.4 GB instead of increasing with each model switch. I also don't know of a good way to release the memory other than restarting the web UI.
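One speculative thing to try before restarting, with no guarantee it helps (if the old checkpoints are still referenced somewhere, garbage collection cannot reclaim them), is forcing a GC pass and emptying the CUDA cache from inside the web UI process, for example from a small user script:

```python
# Speculative workaround sketch, not a confirmed fix: reclaim memory that is
# already unreferenced. If the leak comes from live references to previously
# loaded checkpoints, this will have no effect.
import gc
import torch

def try_release_memory() -> None:
    gc.collect()                   # free unreferenced Python objects (system RAM)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()   # return cached CUDA blocks to the driver
        torch.cuda.ipc_collect()   # clean up CUDA IPC handles, if any
```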

What browsers do you use to access the UI ?

Google Chrome

Sysinfo

sysinfo-2024-06-22-08-51.json

Console logs

Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: f0.0.17v1.8.0rc-latest-287-g77bdb920
Commit hash: 77bdb9208d019e562f9f629647356dca2b2d5ef1
loading WD14-tagger reqs from D:\sd-webui-aki-v4.5\extensions\stable-diffusion-webui-wd14-tagger\requirements.txt
Checking WD14-tagger requirements.
Launching Web UI with arguments: --lowvram --theme light --xformers --no-half --no-half-vae --listen --enable-insecure-extension-access --no-gradio-queue
Arg --lowvram is removed in Forge.
Now memory management is fully automatic and you do not need any command flags.
Please just remove this flag.
In extreme cases, if you want to force previous lowvram/medvram behaviors, please use --always-offload-from-vram
Total VRAM 4096 MB, total RAM 16215 MB
Trying to enable lowvram mode because your GPU seems to have 4GB or less. If you don't want this use: --always-normal-vram
xformers version: 0.0.26.post1
Set vram state to: LOW_VRAM
Device: cuda:0 NVIDIA GeForce GTX 1650 Ti : native
VAE dtype: torch.float32
CUDA Stream Activated:  False
Using xformers cross attention
ControlNet preprocessor location: D:\sd-webui-aki-v4.5\models\ControlNetPreprocessor
Civitai Helper: Root Path is: D:\sd-webui-aki-v4.5
Civitai Helper: Get Custom Model Folder
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.6.0, num models: 11
dirname:  D:\sd-webui-aki-v4.5\localizations
localizations:  {'zh-Hans (Stable) [vladmandic]': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Stable) [vladmandic].json', 'zh-Hans (Stable)': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Stable).json', 'zh-Hans (Testing) [vladmandic]': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Testing) [vladmandic].json', 'zh-Hans (Testing)': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Testing).json'}
sd-webui-prompt-all-in-one background API service started successfully.
== WD14 tagger /gpu:0, uname_result(system='Windows', node='LAPTOP-HDTAAD2G', release='10', version='10.0.19041', machine='AMD64') ==
2024-06-22 16:26:08,377 - AnimateDiff - INFO - AnimateDiff Hooking i2i_batch
Loading weights [53dd398d16] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\hajimi_cute.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 8.1s (load weights from disk: 0.4s, forge load real models: 6.0s, load VAE: 0.7s, calculate empty prompt: 0.9s).
2024-06-22 16:26:17,228 - ControlNet - INFO - ControlNet UI callback registered.
Civitai Helper: Settings:
Civitai Helper: max_size_preview: True
Civitai Helper: skip_nsfw_preview: False
Civitai Helper: open_url_with_js: True
Civitai Helper: proxy:
Civitai Helper: use civitai api key: False
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
IIB Database file has been successfully backed up to the backup folder.
Startup time: 112.8s (prepare environment: 35.2s, import torch: 27.0s, import gradio: 2.8s, setup paths: 1.5s, initialize shared: 0.4s, other imports: 2.6s, load scripts: 20.0s, scripts list_optimizers: 0.2s, create ui: 13.5s, gradio launch: 5.6s, add APIs: 0.8s, app_started_callback: 3.2s).
Loading weights [04e5257dd5] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\t3_Ver14-fp16-no-ema.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
loaded straight to GPU
To load target model BaseModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  1456.98681640625
[Memory Management] Model Memory (MB) =  0.00762939453125
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  432.97918701171875
Moving model(s) has taken 0.04 seconds
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 7.7s (unload existing model: 0.6s, load weights from disk: 0.2s, forge load real models: 5.7s, load VAE: 0.7s, calculate empty prompt: 0.3s).
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2944.6083984375
[Memory Management] Model Memory (MB) =  319.11416244506836
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1601.4942359924316
Moving model(s) has taken 1.30 seconds
Loading weights [53dd398d16] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\hajimi_cute.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
loaded straight to GPU
To load target model BaseModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  1293.29296875
[Memory Management] Model Memory (MB) =  0.00762939453125
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  269.28533935546875
Moving model(s) has taken 0.04 seconds
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 9.2s (unload existing model: 0.8s, load weights from disk: 0.2s, forge load real models: 7.3s, load VAE: 0.6s, calculate empty prompt: 0.3s).
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2799.6318359375
[Memory Management] Model Memory (MB) =  319.11416244506836
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1456.5176734924316
Moving model(s) has taken 1.24 seconds
Loading weights [ea848a8e05] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\CounterfeitV30_25-fp16.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['model_ema.decay', 'model_ema.num_updates'])
loaded straight to GPU
To load target model BaseModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  1148.31640625
[Memory Management] Model Memory (MB) =  0.00762939453125
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  124.30877685546875
Moving model(s) has taken 0.04 seconds
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 7.5s (unload existing model: 0.9s, load weights from disk: 0.2s, forge load real models: 5.4s, load VAE: 0.7s, calculate empty prompt: 0.2s).
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2646.6552734375
[Memory Management] Model Memory (MB) =  319.11416244506836
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1303.5411109924316
Moving model(s) has taken 1.05 seconds
Loading weights [b3538e5a43] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\PVCStyleModelMovable_v41.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
loaded straight to GPU
To load target model BaseModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  995.33984375
[Memory Management] Model Memory (MB) =  0.00762939453125
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  -28.66778564453125
[Memory Management] Requested SYNC Preserved Memory (MB) =  0.0
[Memory Management] Parameters Loaded to SYNC Stream (MB) =  1639.406135559082
[Memory Management] Parameters Loaded to GPU (MB) =  0.0
Moving model(s) has taken 1.58 seconds
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 9.7s (unload existing model: 0.7s, load weights from disk: 0.3s, forge load real models: 7.4s, load VAE: 0.8s, calculate empty prompt: 0.3s).
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2501.6787109375
[Memory Management] Model Memory (MB) =  319.11416244506836
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1158.5645484924316
Moving model(s) has taken 0.16 seconds
Loading weights [d01a68ae76] from D:\sd-webui-aki-v4.5\models\Stable-diffusion\油彩.safetensors
model_type EPS
UNet ADM Dimension 0
Using xformers attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using xformers attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['model_ema.decay', 'model_ema.num_updates'])
Loading VAE weights specified in settings: D:\sd-webui-aki-v4.5\models\VAE\kl-f8-anime2.ckpt
To load target model SD1ClipModel
Begin to load 1 model
Moving model(s) has taken 0.01 seconds
Model loaded in 8.3s (unload existing model: 0.7s, load weights from disk: 0.2s, forge load real models: 6.1s, load VAE: 0.9s, calculate empty prompt: 0.3s).
To load target model BaseModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2356.8349609375
[Memory Management] Model Memory (MB) =  1639.4137649536133
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  -306.5788040161133
[Memory Management] Requested SYNC Preserved Memory (MB) =  1025.257661819458
[Memory Management] Parameters Loaded to SYNC Stream (MB) =  614.234619140625
[Memory Management] Parameters Loaded to GPU (MB) =  1025.171516418457
Moving model(s) has taken 0.52 seconds
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  2356.7021484375
[Memory Management] Model Memory (MB) =  319.11416244506836
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1013.5879859924316
Moving model(s) has taken 0.69 seconds

Additional information

No response