lllyasviel / stable-diffusion-webui-forge


Cannot Adjust "GPU Weights" via API #1902

Open altoiddealer opened 1 day ago

altoiddealer commented 1 day ago

I noticed that almost all "Options" are now available via API - see screenshots below.

I am making successful API calls for Flux generation.

However, my console is getting flooded with "[Low GPU VRAM Warning]" messages, which does not happen when I generate via the UI.

I've scanned through all the available Options and txt2img payload params, and I don't see a parameter that corresponds to "GPU Weights". The closest one I see is forge_inference_memory, but this seems to be something different. After playing around with it, I can't resolve the warning flood.
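For context, a rough sketch of the kind of call I'm making (prompt, steps, and the base URL are placeholders; the forge_inference_memory override is the part in question):

```python
import requests

# Hypothetical reconstruction of my txt2img call; generation params are placeholders.
payload = {
    "prompt": "a photo of a cat",
    "steps": 20,
    "override_settings": {
        # Closest parameter I can find, but it does not seem to
        # correspond to the "GPU Weights" slider in the UI.
        "forge_inference_memory": 1024,
    },
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
r.raise_for_status()
```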

Screenshot showing some Options for override_settings via API call:

Screenshot 2024-09-23 140050

Screenshot 2024-09-23 140402

altoiddealer commented 1 day ago

Actually, I think I was hallucinating the forge_additional_modules working via override_settings.

It's not.

But the modules are being set when posting to the /sdapi/v1/options endpoint.
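For reference, a rough sketch of the call that does set them for me (base URL and module paths are placeholders for my actual setup):

```python
import requests

# Placeholder base URL and module paths; substitute your own.
url = "http://127.0.0.1:7860/sdapi/v1/options"
payload = {
    "forge_additional_modules": [
        "/path/to/models/VAE/ae.safetensors",
        "/path/to/models/text_encoder/clip_l.safetensors",
    ],
}
requests.post(url, json=payload).raise_for_status()
```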

altoiddealer commented 1 day ago

One other oddity I am going to lump in with this issue.

If I include a list of forge_additional_modules when using the /sdapi/v1/options endpoint, I get a CUDA OOM on the next API generation.

If I include an empty list for forge_additional_modules when using the /sdapi/v1/options endpoint, but then select the same modules in the UI one by one, I do NOT get a CUDA OOM on the next API generation. (But I do still experience the issue in the OP.)

Tengyang-Chen commented 14 hours ago

Seems like GPU Weights (MB) = total VRAM (MB) - forge_inference_memory
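As a quick illustration of that relationship (numbers are hypothetical, assuming a 12 GB card and the default 1024 MB reserve):

```python
# Illustrative arithmetic only: relationship between the "GPU Weights (MB)"
# slider and forge_inference_memory, assuming a hypothetical 12 GB card.
total_vram_mb = 12288
gpu_weights_mb = 11264                                   # value set on the slider
forge_inference_memory = total_vram_mb - gpu_weights_mb
print(forge_inference_memory)                            # -> 1024 (MB reserved for inference)
```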

altoiddealer commented 14 hours ago

> Seems like GPU Weights (MB) = total VRAM (MB) - forge_inference_memory

Thanks! When I made this Issue, I thought that override_settings was working via API, but apparently not.

I'll play around with forge_inference_memory tomorrow via the /options/ endpoint and see if I can work it out

Tengyang-Chen commented 9 hours ago

> > Seems like GPU Weights (MB) = total VRAM (MB) - forge_inference_memory
>
> Thanks! When I made this Issue, I thought that override_settings was working via API, but apparently not.
>
> I'll play around with forge_inference_memory tomorrow via the /options/ endpoint and see if I can work it out

FYI, I tested several times, switching between flux and sd. Everything works well when I call sdapi/v1/options and then, immediately after, sdapi/v1/txt2img.

However, if I refresh the web UI page (Ctrl/Cmd + F5) before sending sdapi/v1/txt2img (to check in the UI that the parameters were set properly, i.e. that sdapi/v1/options worked), an error occurs on the next generation. It seems the refresh_memory_management_settings function is triggered, resetting the memory settings to: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}. This is followed by a [Low VRAM Warning], and then an error indicating that checkpoint FLUX/flux1-dev-Q4_0.gguf (selected in sdapi/v1/options) was not found, resulting in a fallback to a default ckpt :(


```python
# sample
# flux
option_payload_flux = {
    "sd_model_checkpoint": "FLUX/flux1-dev-Q4_0.gguf",
    "forge_unet_storage_dtype": "Automatic (fp16 LoRA)",
    "forge_additional_modules": [
        "/data/stable-diffusion-webui-forge/models/VAE/ae.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/clip_l.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/t5xxl_fp8_e4m3fn.safetensors",
    ],
    "forge_inference_memory": 4096,
    "forge_preset": "flux",
}

# sd
option_payload_sd = {
    "sd_model_checkpoint": "majicmixRealistic_v7.safetensors",
    "forge_unet_storage_dtype": "Automatic",
    "forge_additional_modules": [],
    "forge_inference_memory": 10240,
    "forge_preset": "sd",
}
```
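For completeness, a rough sketch of the sequence that works for me, as long as I don't refresh the page in between (base URL and generation parameters are placeholders, and I'm assuming the forge_* keys are also returned by GET /sdapi/v1/options):

```python
import requests

BASE = "http://127.0.0.1:7860"  # placeholder base URL

# 1. Apply the model/memory settings.
requests.post(f"{BASE}/sdapi/v1/options", json=option_payload_flux).raise_for_status()

# 2. Confirm the options took effect by querying the API instead of
#    refreshing the web UI page (refreshing resets the memory settings).
current = requests.get(f"{BASE}/sdapi/v1/options").json()
print(current.get("sd_model_checkpoint"), current.get("forge_inference_memory"))

# 3. Generate. Prompt and parameters here are placeholders.
txt2img = {"prompt": "a photo of a cat", "steps": 20, "width": 1024, "height": 1024}
r = requests.post(f"{BASE}/sdapi/v1/txt2img", json=txt2img)
r.raise_for_status()
images = r.json()["images"]  # base64-encoded results
```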
altoiddealer commented 2 hours ago

> Seems like GPU Weights (MB) = total VRAM (MB) - forge_inference_memory

```python
# flux
option_payload_flux = {
    "forge_inference_memory": 4096
}
# sd
option_payload_sd = {
    "forge_inference_memory": 10240
}
```

I'm struggling a bit to understand the logic of this... how much VRAM do you have?

Why would you want to change the GPU weights between model types?

Info on my 4070 Ti (12 GB)

Screenshot 2024-09-25 101509

Tengyang-Chen commented 1 hour ago

> > Seems like GPU Weights (MB) = total VRAM (MB) - forge_inference_memory
>
> ```python
> # flux
> option_payload_flux = {
>     "forge_inference_memory": 4096
> }
> # sd
> option_payload_sd = {
>     "forge_inference_memory": 10240
> }
> ```
>
> I'm struggling a bit to understand the logic of this... how much VRAM do you have?
>
> Why would you want to change the GPU weights between model types?
>
> Info on my 4070 Ti (12 GB)
>
> Screenshot 2024-09-25 101509

Never mind, just tested. It's an A10 :)