lllyasviel / stable-diffusion-webui-forge


Forge BE question - switching between models #1031

Open alexdisablo opened 1 month ago

alexdisablo commented 1 month ago

Hi all. I'm not sure this is an issue, but what I've noticed is that loading models (switching between them) is what takes most of the time during image generation for me. I'm using the same model (checkpoint) and the same Clip Skip, yet the model is loaded over and over, multiple times within every single generation. Is there a way to avoid that? I mean, I don't change models, so why is the system loading them from scratch each time?

Example of one bunch of model switching (all of this happens during a single image generation):

Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6963.8125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  5780.255418777466
Moving model(s) has taken 0.66 seconds

0: 448x640 1 eyes, 25.6ms
Speed: 44.1ms preprocess, 25.6ms inference, 48.9ms postprocess per image at shape (1, 3, 448, 640)
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6781.39453125
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3613.0398330688477
Moving model(s) has taken 13.33 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6886.24365234375
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  965.1571578979492
Moving model(s) has taken 29.50 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00,  1.40it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6937.57470703125
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  5754.017625808716
Moving model(s) has taken 0.74 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6937.80078125
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1016.7142868041992
Moving model(s) has taken 2.39 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:13<00:00,  1.51it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6937.56640625
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  5754.009325027466
Moving model(s) has taken 0.37 seconds

0: 448x640 2 eyess, 26.8ms
Speed: 37.0ms preprocess, 26.8ms inference, 61.4ms postprocess per image at shape (1, 3, 448, 640)
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6774.3173828125
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3605.9626846313477
Moving model(s) has taken 14.44 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6907.81103515625
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  986.7245407104492
Moving model(s) has taken 16.53 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00,  1.36it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6933.12255859375
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  5749.565477371216
Moving model(s) has taken 0.58 seconds
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6770.10791015625
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3601.7532119750977
Moving model(s) has taken 6.04 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6903.6015625
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  982.5150680541992
Moving model(s) has taken 25.67 seconds
100%|████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00,  1.40it/s]
To load target model AutoencoderKL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6928.9130859375
[Memory Management] Model Memory (MB) =  159.55708122253418
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  5745.356004714966
Moving model(s) has taken 0.80 seconds
Total progress: 100%|██████████████████████████████████████████████████████████████████| 40/40 [02:49<00:00,  4.25s/it]
Cleanup minimal inference memory.
tiled upscale: 100%|███████████████████████████████████████████████████████████████████| 35/35 [00:08<00:00,  3.98it/s]
To load target model SDXLClipModel
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6766.72705078125
[Memory Management] Model Memory (MB) =  2144.3546981811523
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  3598.3723526000977
Moving model(s) has taken 10.50 seconds
To load target model SDXL
Begin to load 1 model
[Memory Management] Current Free GPU Memory (MB) =  6927.48779296875
[Memory Management] Model Memory (MB) =  4897.086494445801
[Memory Management] Minimal Inference Memory (MB) =  1024.0
[Memory Management] Estimated Remaining GPU Memory (MB) =  1006.4012985229492
Moving model(s) has taken 29.45 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:13<00:00,  1.49it/s]
To load target model AutoencoderKL
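
For what it's worth, the "Estimated Remaining GPU Memory" figures in the log are consistent with a simple subtraction: current free VRAM minus the model's memory footprint minus the reserved minimal inference memory. A minimal sketch of that arithmetic, using values copied from the log above (the function is purely illustrative, not Forge's actual code):

```python
# Reconstruction of the arithmetic the [Memory Management] lines appear to report.
# Illustrative only -- not Forge's implementation.

def estimated_remaining_vram(free_mb: float, model_mb: float,
                             minimal_inference_mb: float = 1024.0) -> float:
    """Estimated Remaining = Current Free - Model Memory - Minimal Inference Memory."""
    return free_mb - model_mb - minimal_inference_mb

# AutoencoderKL load from the first block:
print(estimated_remaining_vram(6963.8125, 159.55708122253418))        # 5780.255418777466
# SDXL UNet load (the ~4.9 GB move that takes 29.50 seconds):
print(estimated_remaining_vram(6886.24365234375, 4897.086494445801))  # 965.1571578979492
```

Note that the big delays come from the ~4.9 GB SDXL UNet moves (16-30 seconds each), not from the small VAE and CLIP moves, which finish in under a second or a few seconds.
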
dongxiat commented 1 month ago

Yep, me too... this happens after the Model Loading Behavior change. I think it moves the model out when generation finishes and moves it back in again when you click Generate.

alexdisablo commented 1 month ago

> Yep, me too... this happens after the Model Loading Behavior change. I think it moves the model out when generation finishes and moves it back in again when you click Generate.

The thing is, it's loading and unloading more than just once per generation. I press Generate and it keeps switching models constantly (a lot of times, not just one initial load).
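
If the behavior is what you describe (weights offloaded from VRAM after a stage finishes and copied back when they're needed again), then the 10-30 second "Moving model(s)" lines are essentially the cost of shuttling the ~4.9 GB SDXL UNet between system RAM and the GPU over and over. A rough PyTorch sketch of that offload/reload cycle, just to illustrate the mechanism (the model, helper names, and timing message are made up for illustration; this is not Forge's actual loader):

```python
import time
import torch

# Hypothetical stand-in for a large checkpoint component (the real SDXL UNet is
# roughly 4.9 GB, per the "Model Memory (MB) = 4897" log lines).
model = torch.nn.Sequential(*[torch.nn.Linear(4096, 4096) for _ in range(16)])

def offload_to_ram(m: torch.nn.Module) -> None:
    # What seems to happen after a stage finishes: weights leave VRAM.
    m.to("cpu")

def reload_to_gpu(m: torch.nn.Module) -> None:
    # What seems to happen when the weights are needed again: copy them back,
    # which is where the multi-second "Moving model(s)" delay would come from.
    t0 = time.time()
    m.to("cuda")
    torch.cuda.synchronize()
    print(f"Moving model(s) has taken {time.time() - t0:.2f} seconds")
```

Keeping the weights resident in VRAM between generations would avoid those transfers entirely, which is what everyone here is asking for.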

alexdisablo commented 1 month ago

@lllyasviel Is there a way to turn this model loading behavior off? I work with the same checkpoint for the whole session, and this feature kills productivity dramatically. I understand it's useful when you work with multiple checkpoints, but I use the same one, and it keeps loading it from scratch multiple times every generation.

andy8992 commented 1 month ago

I never use Git, but honestly this is driving me insane. Any speed gain I get from distillation methods, 4-step sampling, or really any other time-saving method has been pretty much negated by the constant model unloading and loading.

I think I saw the announcement about the model loading change, but this really is a huge bummer for me. Right now, ComfyUI keeping a model loaded is a massive boon compared to this.

I switched to Forge, but I'm bummed that I may need to switch back if it's going to constantly take this long to render even when I just rendered a second ago. Hell, this even happens within a single batch of multiple images.

Please give us the option to keep models loaded; this is pretty important to me.