altoiddealer opened 1 day ago
Actually, I think I was hallucinating forge_additional_modules working via override_settings. It's not. But the modules do get set when posting to the /sdapi/v1/options endpoint.
One other oddity I am going to lump in with this issue: if I include a list of forge_additional_modules when using the /sdapi/v1/options endpoint, I get a CUDA OOM on the next API generation. If I instead send an empty list for forge_additional_modules via the endpoint, but then select the same modules in the UI one by one, I do NOT get a CUDA OOM on the next API generation (although I do still experience the issue in the OP).
Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory

Thanks! When I made this Issue, I thought that override_settings was working via API, but apparently not. I'll play around with forge_inference_memory tomorrow via the /options/ endpoint and see if I can work it out.
FYI, I tested several times and switched between flux and sd. Everything seems to work well via sdapi/v1/options followed immediately by sdapi/v1/txt2img.

However, if I refresh (Ctrl/Cmd + F5) the web UI page before sending sdapi/v1/txt2img (to check that the parameters are set properly, confirming that sdapi/v1/options worked) and then send sdapi/v1/txt2img, an error occurs. It seems that the refresh_memory_management_settings function is triggered, resetting the environment variables to {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}. This is followed by a [Low VRAM Warning], and then an error indicating that the checkpoint FLUX/flux1-dev-Q4_0.gguf (selected in sdapi/v1/options) was not found, resulting in a fallback to a default ckpt :(
```python
# sample
# flux
option_payload_flux = {
    "sd_model_checkpoint": "FLUX/flux1-dev-Q4_0.gguf",
    "forge_unet_storage_dtype": "Automatic (fp16 LoRA)",
    "forge_additional_modules": [
        "/data/stable-diffusion-webui-forge/models/VAE/ae.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/clip_l.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/t5xxl_fp8_e4m3fn.safetensors",
    ],
    "forge_inference_memory": 4096,
    "forge_preset": "flux",
}

# sd
option_payload_sd = {
    "sd_model_checkpoint": "majicmixRealistic_v7.safetensors",
    "forge_unet_storage_dtype": "Automatic",
    "forge_additional_modules": [],
    "forge_inference_memory": 10240,
    "forge_preset": "sd",
}
```
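For context, a minimal sketch of how these payloads would be sent, matching the "options, then txt2img immediately" sequence described above (the base URL and the generation parameters are assumptions, not from the original post):

```python
import requests

BASE_URL = "http://127.0.0.1:7860"  # assumed local Forge instance

# Apply the Flux settings, then generate right away, without refreshing the UI page in between.
resp = requests.post(f"{BASE_URL}/sdapi/v1/options", json=option_payload_flux)
resp.raise_for_status()

resp = requests.post(f"{BASE_URL}/sdapi/v1/txt2img", json={
    "prompt": "a photo of a cat",  # placeholder prompt
    "steps": 20,
    "width": 1024,
    "height": 1024,
})
resp.raise_for_status()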
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory
```python
# flux
option_payload_flux = {
    "forge_inference_memory": 4096
}

# sd
option_payload_sd = {
    "forge_inference_memory": 10240
}
```
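If the relation above holds, these two values leave different GPU Weights budgets per preset. As a worked example (the 24 GB figure is an assumption based on the A10 mentioned further down):

```python
# Worked example (assumption): A10 with 24 GB = 24576 MB of VRAM.
total_vram_mb = 24576
flux_gpu_weights_mb = total_vram_mb - 4096   # -> 20480 MB left for Flux weights
sd_gpu_weights_mb = total_vram_mb - 10240    # -> 14336 MB left for SD weights
```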
I'm struggling a bit to understand the logic of this... how much VRAM do you have?
Why would you want to change the GPU weights between model types?
Info on my 4070ti (12GB)
Never mind, just tested. A10 :)
I noticed that almost all "Options" are now available via API - see screenshot below. I am making successful API calls for Flux generation.

However, I am getting flooded with the "[Low GPU VRAM Warning]", which does not happen when I generate via the UI. I've scanned through all the available Options and txt2img payload params, and do not see the parameter that applies the "GPU Weights". The closest one I see is forge_inference_memory, but this seems to be different? After playing around with it I can't seem to resolve the debug flood.

Screenshot showing some Options for override_settings via API call:
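As a side note, a minimal sketch (the base URL is an assumption) for inspecting the live option values over the API, which can help compare what the UI's "GPU Weights" slider actually changes against forge_inference_memory:

```python
import requests

# Fetch the current options from a local Forge instance (URL is an assumption).
opts = requests.get("http://127.0.0.1:7860/sdapi/v1/options").json()

# Print the Forge memory-related settings mentioned in this thread.
for key in ("forge_inference_memory", "forge_preset", "forge_additional_modules"):
    print(key, "=", opts.get(key))
```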