lllyasviel / stable-diffusion-webui-forge

The memory usage increases every time the model is switched and a new image is generated. #1598

Open s4130 opened 2 months ago

s4130 commented 2 months ago

I used an SD1.5 model to generate 9 images, and memory usage stayed at roughly this value:

[image]

This is the memory usage after I used the X/Y/Z plot to generate images with 4 different SD1.5 models:

[image]

When I ran the X/Y/Z plot again, memory usage kept increasing, and the program finally exited because the C: drive ran out of space:

[Program crashed with exit code 3221225477 (0xC0000005)]
Below is an analysis of the exit code. This may not be accurate, please refer to it as needed!
System exit code name: ACCESS_VIOLATION
System exit code description: The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

[image]

Is this normal?

s4130 commented 2 months ago

Adding the console output from the X/Y/Z plot run:

X/Y/Z plot will create 3 images on 1 3x1 grid. (Total steps to process: 60)
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\CounterfeitV30_25-fp16.safetensors', 'hash': '85a829de'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.60 MB ... Current free memory is 2915.60 MB ... Current free memory is 2915.60 MB ... Current free memory is 2915.60 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 7.0s (unload existing model: 0.8s, forge model load: 6.2s).
All loaded to GPU.
Moving model(s) has taken 0.00 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Done.
[Memory Management] Target: UnetPatcher, Free GPU: 3235.60 MB, Model Require: 3278.81 MB, Inference Require: 1535.00 MB, Remaining: -1578.22 MB, Shared Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 1.08 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Unload model KModel Done.
[Memory Management] Target: ModelPatcher, Free GPU: 3235.41 MB, Model Require: 319.11 MB, Inference Require: 1535.00 MB, Remaining: 1381.29 MB, All loaded to GPU.
Moving model(s) has taken 0.66 seconds
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\hajimi_cute.safetensors', 'hash': 'fc0f5024'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 5.4s (unload existing model: 0.5s, forge model load: 4.9s).
All loaded to GPU.
Moving model(s) has taken 0.00 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.47 MB ... Current free memory is 3235.47 MB ... Current free memory is 3235.47 MB ... Current free memory is 3235.47 MB ... Current free memory is 3235.47 MB ... Done.
[Memory Management] Target: UnetPatcher, Free GPU: 3235.47 MB, Model Require: 3278.81 MB, Inference Require: 1535.00 MB, Remaining: -1578.34 MB, Shared Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 0.94 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1919.99 MB ... Current free memory is 1919.99 MB ... Current free memory is 1919.99 MB ... Current free memory is 1919.99 MB ... Current free memory is 1919.99 MB ... Current free memory is 1919.99 MB ... Unload model KModel Done.
[Memory Management] Target: ModelPatcher, Free GPU: 3235.34 MB, Model Require: 319.11 MB, Inference Require: 1535.00 MB, Remaining: 1381.23 MB, All loaded to GPU.
Moving model(s) has taken 0.79 seconds
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realm_v2.safetensors', 'hash': 'd1fa5bb4'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.67 MB ... Current free memory is 2915.67 MB ... Current free memory is 2915.67 MB ... Current free memory is 2915.67 MB ... Current free memory is 2915.67 MB ... Current free memory is 2915.67 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 5.0s (unload existing model: 0.5s, forge model load: 4.5s).
All loaded to GPU.
Moving model(s) has taken 0.00 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.41 MB ... Current free memory is 3235.41 MB ... Current free memory is 3235.41 MB ... Current free memory is 3235.41 MB ... Current free memory is 3235.41 MB ... Current free memory is 3235.41 MB ... Done.
[Memory Management] Target: UnetPatcher, Free GPU: 3235.41 MB, Model Require: 3278.81 MB, Inference Require: 1535.00 MB, Remaining: -1578.40 MB, Shared Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 1.08 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Current free memory is 1919.93 MB ... Unload model KModel Done.
[Memory Management] Target: ModelPatcher, Free GPU: 3235.28 MB, Model Require: 319.11 MB, Inference Require: 1535.00 MB, Remaining: 1381.17 MB, All loaded to GPU.
Moving model(s) has taken 0.67 seconds
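The `[Memory Management]` lines in the log above appear to follow simple arithmetic: `Remaining` matches `Free GPU` minus `Model Require` minus `Inference Require`, and the swap and `GPU Loaded` figures add up to `Model Require`. This relation is inferred from the printed numbers only, not taken from Forge's source. A minimal Python sketch that re-derives it from one of the logged lines (the helper name `check_memory_line` is made up for illustration):

```python
import re

# One "[Memory Management]" line copied verbatim from the log above.
LINE = (
    "[Memory Management] Target: UnetPatcher, Free GPU: 3235.60 MB, "
    "Model Require: 3278.81 MB, Inference Require: 1535.00 MB, "
    "Remaining: -1578.22 MB, Shared Swap Loaded (blocked method): 1971.88 MB, "
    "GPU Loaded: 1306.94 MB"
)

# Matches "<field name>: <number> MB" pairs.
FIELD = re.compile(r"([A-Za-z ()]+): (-?\d+\.\d+) MB")


def check_memory_line(line, tolerance=0.02):
    """Hypothetical helper: parse the MB fields and recompute 'Remaining'.

    The tolerance absorbs the two-decimal rounding in the log output.
    """
    fields = {name.strip(): float(value) for name, value in FIELD.findall(line)}
    # Assumed relation, inferred from the log: what is left after the model
    # and the inference reserve are subtracted from free VRAM.
    recomputed = (
        fields["Free GPU"] - fields["Model Require"] - fields["Inference Require"]
    )
    # The swap field is named "Shared Swap" or "CPU Swap" depending on the log.
    swapped = sum(v for k, v in fields.items() if "Swap Loaded" in k)
    placed = swapped + fields.get("GPU Loaded", 0.0)
    return {
        "remaining_logged": fields["Remaining"],
        "remaining_recomputed": round(recomputed, 2),
        "remaining_matches": abs(recomputed - fields["Remaining"]) <= tolerance,
        "placed_matches_model": abs(placed - fields["Model Require"]) <= tolerance,
    }


result = check_memory_line(LINE)
print(result)
```

If the relation holds, a negative `Remaining` means the UNet does not fit next to the inference reserve on this 4 GB card, which would explain why part of it is reported as swap on every load.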

dongxiat commented 2 months ago

Yes. I think this happens because the previous model stays in RAM when you switch models, and the new model is then loaded into RAM on top of it. So you need to restart Forge whenever you switch to a new model. That is my workaround for now.
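The explanation above is a guess, and nothing in this thread confirms where Forge might be holding on to the old weights. As a generic illustration of the mechanism being described (this is not Forge's code), the Python sketch below shows that an object stays in memory for as long as any reference to it survives, for example in a cache. `FakeModel`, `cache`, `load` and `switch_models` are made-up names.

```python
import gc
import weakref


class FakeModel:
    """Stand-in for a loaded checkpoint; holds a small placeholder buffer."""

    def __init__(self, name):
        self.name = name
        self.weights = bytearray(1024 * 1024)  # 1 MiB placeholder


cache = {}  # hypothetical place where a stale reference might linger


def load(name, keep_in_cache):
    model = FakeModel(name)
    if keep_in_cache:
        cache[name] = model  # the extra reference that keeps the model alive
    return model


def switch_models(keep_in_cache):
    """Load A, switch to B, and report whether A is still in memory."""
    cache.clear()
    current = load("A", keep_in_cache)
    probe = weakref.ref(current)  # observes A without keeping it alive
    current = load("B", keep_in_cache)  # "switch": drop our own reference to A
    gc.collect()
    return probe() is not None


still_alive_with_cache = switch_models(keep_in_cache=True)
still_alive_without_cache = switch_models(keep_in_cache=False)
print("A still in memory with a lingering reference:", still_alive_with_cache)
print("A still in memory without one:", still_alive_without_cache)
```

If something like this is happening, each switch would add one more model's worth of weights to host memory, which would match the growth reported in this issue. A restart clears it because the whole process, including any lingering references, goes away.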

s4130 commented 2 months ago

This is the complete console log from switching back and forth between two models while generating images (A-B-A-B):

Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
Version: f2.0.1v1.10.1-previous-495-g4f64f6daa
Commit hash: 4f64f6daa4582c8b5ddd5ccdb96a82fe86eaa91b
Launching Web UI with arguments: --disable-ipex-hijack --theme light --all-in-fp32 --no-half --xformers --listen --enable-insecure-extension-access --no-download-sd-model --skip-python-version-check
Total VRAM 4096 MB, total RAM 16215 MB
pytorch version: 2.3.1+cu118
xformers version: 0.0.27+cu118
Forcing FP32, if this improves things please report it.
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce GTX 1650 Ti : native
VAE dtype preferences: [torch.float32] -> torch.float32
CUDA Using Stream: False
Using xformers cross attention
Using xformers attention for VAE
ControlNet preprocessor location: D:\sd-webui-aki-v4.5\models\ControlNetPreprocessor
Civitai Helper: Root Path is: D:\sd-webui-aki-v4.5
Civitai Helper: Get Custom Model Folder
Tag Autocomplete: Could not locate model-keyword extension, Lora trigger word completion will be limited to those added through the extra networks menu.
[-] ADetailer initialized. version: 24.9.0-dev.0, num models: 11
dirname:  D:\sd-webui-aki-v4.5\localizations
localizations:  {'zh-Hans (Stable) [vladmandic]': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Stable) [vladmandic].json', 'zh-Hans (Stable)': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Stable).json', 'zh-Hans (Testing) [vladmandic]': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Testing) [vladmandic].json', 'zh-Hans (Testing)': 'D:\\sd-webui-aki-v4.5\\extensions\\stable-diffusion-webui-localization-zh_Hans\\localizations\\zh-Hans (Testing).json'}
sd-webui-prompt-all-in-one background API service started successfully.
2024-09-01 12:58:40,392 - ControlNet - INFO - ControlNet UI callback registered.
Civitai Helper: Settings:
Civitai Helper: max_size_preview: True
Civitai Helper: skip_nsfw_preview: False
Civitai Helper: open_url_with_js: True
Civitai Helper: check_new_ver_exist_in_all_folder: True
Civitai Helper: proxy:
Civitai Helper: use civitai api key: False
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realisticVisionV60B1_v51VAE.safetensors', 'hash': 'a0f13c83'}, 'additional_modules': ['D:\\sd-webui-aki-v4.5\\models\\VAE\\animevae.pt'], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
IIB Database file has been successfully backed up to the backup folder.
Startup time: 70.3s (prepare environment: 20.8s, import torch: 21.8s, initialize shared: 0.3s, other imports: 1.0s, load scripts: 7.8s, create ui: 8.9s, gradio launch: 9.4s, app_started_callback: 0.2s).
Environment vars changed: {'stream': False, 'inference_memory': 1535.0, 'pin_shared_memory': False}
[GPU Setting] You will use 62.52% GPU memory (2560.00 MB) to load weights, and use 37.48% GPU memory (1535.00 MB) to do matrix computation.
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realisticVisionV60B1_v51VAE.safetensors', 'hash': 'a0f13c83'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\CounterfeitV30_25-fp16.safetensors', 'hash': '85a829de'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\CounterfeitV30_25-fp16.safetensors', 'hash': '85a829de'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 2.7s (unload existing model: 0.4s, forge model load: 2.3s).
All loaded to GPU.
Moving model(s) has taken 0.00 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3271.72 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 3271.72 MB, Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: -1542.09 MB, CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 0.60 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1920.60 MB ... Current free memory is 1920.60 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 3235.41 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: 1381.29 MB, All loaded to GPU.
Moving model(s) has taken 0.70 seconds
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realm_v2.safetensors', 'hash': 'd1fa5bb4'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realm_v2.safetensors', 'hash': 'd1fa5bb4'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 11.2s (unload existing model: 0.7s, forge model load: 10.5s).
All loaded to GPU.
Moving model(s) has taken 0.01 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 3235.60 MB, Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: -1578.22 MB, CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 1.13 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 3235.41 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: 1381.29 MB, All loaded to GPU.
Moving model(s) has taken 0.72 seconds
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\CounterfeitV30_25-fp16.safetensors', 'hash': '85a829de'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\CounterfeitV30_25-fp16.safetensors', 'hash': '85a829de'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 5.3s (unload existing model: 3.8s, forge model load: 1.5s).
All loaded to GPU.
Moving model(s) has taken 0.01 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 3235.60 MB, Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: -1578.22 MB, CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 0.70 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 3235.41 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: 1381.29 MB, All loaded to GPU.
Moving model(s) has taken 0.76 seconds
Model selected: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realm_v2.safetensors', 'hash': 'd1fa5bb4'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Loading Model: {'checkpoint_info': {'filename': 'D:\\sd-webui-aki-v4.5\\models\\Stable-diffusion\\realm_v2.safetensors', 'hash': 'd1fa5bb4'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Current free memory is 2915.73 MB ... Unload model IntegratedAutoencoderKL Done.
StateDict Keys: {'unet': 686, 'vae': 248, 'text_encoder': 197, 'ignore': 0}
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
K-Model Created: {'storage_dtype': torch.float32, 'computation_dtype': torch.float32}
tag_autocomplete_helper: Old webui version or unrecognized model shape, using fallback for embedding completion.
Model loaded in 2.8s (unload existing model: 0.8s, forge model load: 2.0s).
All loaded to GPU.
Moving model(s) has taken 0.01 seconds
token_merging_ratio = 0.3
[Unload] Trying to free 5797.46 MB for cuda:0 with 0 models keep loaded ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Current free memory is 3235.60 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 3235.60 MB, Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: -1578.22 MB, CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB
Moving model(s) has taken 0.58 seconds
[Unload] Trying to free 2592.85 MB for cuda:0 with 0 models keep loaded ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Current free memory is 1920.05 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 3235.41 MB, Model Require: 319.11 MB, Previously Loaded: 0.00 MB, Inference Require: 1535.00 MB, Remaining: 1381.29 MB, All loaded to GPU.
Moving model(s) has taken 0.72 seconds

Likewise, memory usage still increases every time the model is switched, even if that model has been loaded before.
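One thing the A-B-A-B log above does allow checking is whether the reported free GPU memory shrinks from one switch to the next. The Python sketch below pulls the `Free GPU` value out of the four `Target: KModel` lines, copied verbatim from that log (the last three are identical, so one string is reused). The function name `free_gpu_per_load` is made up, and the reading of the result is an interpretation, not something the thread confirms.

```python
import re

# "[Memory Management] Target: KModel" line from the first load in the log.
FIRST_LOAD = (
    "[Memory Management] Target: KModel, Free GPU: 3271.72 MB, "
    "Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, "
    "Inference Require: 1535.00 MB, Remaining: -1542.09 MB, "
    "CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB"
)
# The same line as printed for the second, third and fourth loads.
LATER_LOAD = (
    "[Memory Management] Target: KModel, Free GPU: 3235.60 MB, "
    "Model Require: 3278.81 MB, Previously Loaded: 0.00 MB, "
    "Inference Require: 1535.00 MB, Remaining: -1578.22 MB, "
    "CPU Swap Loaded (blocked method): 1971.88 MB, GPU Loaded: 1306.94 MB"
)
LOG_LINES = [FIRST_LOAD, LATER_LOAD, LATER_LOAD, LATER_LOAD]

FREE_GPU = re.compile(r"Target: (\w+), Free GPU: (\d+\.\d+) MB")


def free_gpu_per_load(lines, target="KModel"):
    """Hypothetical helper: free VRAM (MB) reported before each load of `target`."""
    values = []
    for line in lines:
        match = FREE_GPU.search(line)
        if match and match.group(1) == target:
            values.append(float(match.group(2)))
    return values


values = free_gpu_per_load(LOG_LINES)
# Spread of the reported value across the loads that follow the first one.
drift_after_first = round(max(values[1:]) - min(values[1:]), 2)
print(values, drift_after_first)
```

The reported free VRAM is the same for every load after the first, so the growth described here does not show up in these GPU figures. That would be consistent with the increase being in system RAM or the page file, which these log lines do not report and which would have to be measured separately, for example in Task Manager.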