Open psydok opened 1 month ago
It worked in automatic1111. Why was the functionality removed here while the settings fields were left in place?
I think one of the updates broke model caching. It used to work perfectly, but now, sometimes after I don't generate for a couple of minutes, or after running generate following hires fix, it loads the whole model from disk again. It happens fairly randomly, too.
I suspect switching models is causing RAM usage to keep increasing, probably because these settings aren't taking effect.
I also noticed that if you send `{"override_settings": {"sd_model_checkpoint": "flux1-dev-bnb-nf4-v2.safetensors", "forge_preset": "flux", "forge_additional_modules": []}}` while this model is already the active default, Forge still restarts checkpoint loading, so inference takes longer than expected.
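For reference, the request in question can be reproduced with a minimal script like the one below. The payload shape follows the standard sdapi txt2img API; the checkpoint name is the one from my report, and the URL/port are the default local ones (adjust as needed):

```python
import json
# import requests  # uncomment to actually send the request to a running Forge

# Payload that triggers the unnecessary reload even when the named
# checkpoint is already the active model.
payload = {
    "prompt": "a test prompt",
    "override_settings": {
        "sd_model_checkpoint": "flux1-dev-bnb-nf4-v2.safetensors",
        "forge_preset": "flux",
        "forge_additional_modules": [],
    },
}

print(json.dumps(payload, indent=2))
# resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
```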
> I also noticed that if you send `{"override_settings": {"sd_model_checkpoint": "flux1-dev-bnb-nf4-v2.safetensors", "forge_preset": "flux", "forge_additional_modules": []}}` while this model is already the active default, Forge still restarts checkpoint loading, so inference takes longer than expected.
The way override_settings works is that if a provided setting value is identical to the currently stored value, it is ignored.

With sd_model_checkpoint, you can "set" the value to a wide variety of accepted "checkpoint aliases". I'm not quite sure at what point this happens, but the value is subsequently changed to the "title" returned by the sd-models API endpoint.

So what is happening is that you are passing the model_name value, which is a valid value, but it is not equal to the current stored value, so it is not ignored: the setting is applied, the model params refresh, and so on.
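The mismatch can be illustrated with a small sketch (the names and the alias map below are hypothetical stand-ins, not Forge's actual functions): if the incoming alias is resolved to the stored "title" form before comparison, the no-op case is detected and the reload is skipped.

```python
# The stored setting holds the checkpoint "title" (name plus short hash),
# while the API caller may pass any accepted alias, e.g. the bare filename.
# The hash here is made up for illustration.
STORED = "flux1-dev-bnb-nf4-v2.safetensors [fef37763]"

# Stand-in for Forge's checkpoint alias lookup: alias -> canonical title.
ALIASES = {
    "flux1-dev-bnb-nf4-v2.safetensors": STORED,
    "flux1-dev-bnb-nf4-v2": STORED,
    STORED: STORED,
}

def needs_reload_buggy(requested: str, current: str = STORED) -> bool:
    # Raw string comparison: an alias never equals the stored title,
    # so the model is reloaded even though nothing actually changed.
    return requested != current

def needs_reload_fixed(requested: str, current: str = STORED) -> bool:
    # Resolve the alias to the canonical title first, then compare.
    return ALIASES.get(requested, requested) != current

print(needs_reload_buggy("flux1-dev-bnb-nf4-v2.safetensors"))  # True: spurious reload
print(needs_reload_fixed("flux1-dev-bnb-nf4-v2.safetensors"))  # False: correctly ignored
```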
I found a way to resolve this... will be pushing a PR soon.
@psydok please check out this PR here which resolves the issue you mentioned in your comment here (Not your "main issue"). Works for me - if you get a chance to try it out, please leave a comment there. Thank you.
https://github.com/lllyasviel/stable-diffusion-webui-forge/pull/2181
@altoiddealer Okay, I'll look at the PR and test it tomorrow. Thank you for the fix!
UPD: It works! Thanks! But this issue should stay open: I would like to restore support for these parameters in Forge: sd_checkpoints_keep_in_cpu: false, sd_checkpoints_limit: 3, sd_checkpoint_cache: 3.
I found the commit where the breaking changes were made, but the commit message gives no information about why it was done. @lllyasviel @DenOfEquity Does anyone know whether this was an accidental leftover from debugging, or whether there was some problem that caused these settings to be removed?
That's a very old commit, from before I was using Forge, possibly even before Forge was public. It probably caused (or had high potential to cause) issues after backend reworks by complicating memory management, but that's just speculation. Since then, the backend has been reworked again, with the Flux update.
There are quite a few relics in the code. A good way to check whether settings are actually used is to search the repo: sd_checkpoints_keep_in_cpu, sd_checkpoints_limit, and sd_checkpoint_cache are not referenced anywhere, not even in commented-out code.
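Such a check can be scripted; below is a self-contained demonstration where a temporary directory stands in for the repo checkout (in a real check, point grep at the Forge source tree instead):

```shell
# Temporary directory standing in for the repo; one file with an
# unrelated settings reference, so the three keys have zero matches.
repo=$(mktemp -d)
printf 'x = opts.sd_model_checkpoint\n' > "$repo/shared_options.py"

for key in sd_checkpoints_keep_in_cpu sd_checkpoints_limit sd_checkpoint_cache; do
  grep -rn "$key" "$repo" || echo "$key: no references"
done

rm -rf "$repo"
```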
@DenOfEquity Thanks for explanation!
Another question has come to mind. I'm trying to reconstruct the logic, but things have changed a lot in Forge and there are a lot of wrapper classes.
Could you tell me which class would need to be kept in memory to hold both Flux (~12 GB) and some version of SDXL (~8 GB), for example, so that switching between models is quick? I thought I needed to add the --sd-checkpoint-limit logic to memory_management.py, but I got confused by the number of class reinitializations. They seem to be reinitialized all the time, even when model_data.forge_hash matches (False doesn't affect anything).
Or maybe the problem is that I'm debugging on a very weak GPU (2 GB).
Which class should be kept, and can it be moved to CPU and back gracefully somehow? I don't think it will work without global changes...
I noticed that if you add --always-gpu when starting Forge, checkpoint changes don't seem to take as long. I don't understand why, though; the memory manager seems to clear everything anyway.
I only know what I know as a result of poking around, so my understanding could be completely wrong. Models are stored in 3 classes: JointTextEncoder, KModel, IntegratedAutoencoderKL. The latter 2 seem to be reused/reinitialised when a new model is loaded. The first doesn't get reused, potentially leading to the memory leak / excess Committed memory problem some users have. I'd say Forge is fundamentally not designed to keep multiple models loaded anymore. (With modern models barely fitting into typical consumer hardware anyway, it's likely just too much extra complexity for too little value.)
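For what it's worth, the behavior those settings used to provide amounts to a small LRU cache of loaded checkpoints. A language-level sketch of the idea (plain Python with a dummy load function, nothing Forge-specific) would look roughly like this:

```python
from collections import OrderedDict

def load_checkpoint(name: str) -> dict:
    # Stand-in for the expensive disk load (state dict, params, etc.).
    return {"name": name, "weights": f"<tensors for {name}>"}

class CheckpointCache:
    """Keep up to `limit` loaded checkpoints, evicting the
    least-recently-used one, similar in spirit to what
    sd_checkpoint_cache / sd_checkpoints_limit used to do."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self._cache: OrderedDict[str, dict] = OrderedDict()

    def get(self, name: str) -> dict:
        if name in self._cache:
            self._cache.move_to_end(name)    # mark as recently used
            return self._cache[name]         # fast path: no disk load
        model = load_checkpoint(name)        # slow path: load from disk
        self._cache[name] = model
        if len(self._cache) > self.limit:
            self._cache.popitem(last=False)  # evict least-recently-used
        return model

cache = CheckpointCache(limit=2)
cache.get("flux1-dev")   # loads from disk
cache.get("sdxl-base")   # loads from disk
cache.get("flux1-dev")   # cache hit, no load
cache.get("sd15")        # loads from disk, evicts sdxl-base
print(list(cache._cache))  # ['flux1-dev', 'sd15']
```

The hard part in Forge would not be this bookkeeping but moving the cached weights between CPU and GPU without fighting the memory manager, which is presumably why the feature was dropped.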
Tested setting sd_checkpoints_keep_in_cpu: false, sd_checkpoints_limit: 3, sd_checkpoint_cache: 3. Nothing worked; every request for a new model still takes a long time.