altoiddealer opened 1 day ago
Actually, I think I was hallucinating forge_additional_modules working via override_settings. It's not. But the modules do get set when posting to the /sdapi/v1/options endpoint.
One other oddity I am going to lump in with this issue: if I include a list of forge_additional_modules when using the /sdapi/v1/options endpoint, I get a CUDA OOM on the next API generation. If I instead send an empty list for forge_additional_modules via the endpoint, but then select the same modules in the UI one by one, I do NOT get a CUDA OOM on the next API generation (although I do still experience the issue in the OP).
Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory

Thanks! When I made this Issue, I thought that override_settings was working via API, but apparently not. I'll play around with forge_inference_memory tomorrow via the /options/ endpoint and see if I can work it out.
FYI, I tested several times and switched between flux and sd. Everything seems to work well via sdapi/v1/options followed immediately by sdapi/v1/txt2img.

However, if I refresh (Ctrl/Cmd + F5) the web UI page before sending sdapi/v1/txt2img (to check that the parameters are set properly, confirming that sdapi/v1/options worked) and then send sdapi/v1/txt2img, an error occurs. It seems that the refresh_memory_management_settings function is triggered, resetting the environment variables to {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}. This is followed by a [Low VRAM Warning], and then an error indicating that the checkpoint FLUX/flux1-dev-Q4_0.gguf (selected in sdapi/v1/options) was not found, resulting in a fallback to a default ckpt :(
```python
# sample
# flux
option_payload_flux = {
    "sd_model_checkpoint": "FLUX/flux1-dev-Q4_0.gguf",
    "forge_unet_storage_dtype": "Automatic (fp16 LoRA)",
    "forge_additional_modules": [
        "/data/stable-diffusion-webui-forge/models/VAE/ae.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/clip_l.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/t5xxl_fp8_e4m3fn.safetensors",
    ],
    "forge_inference_memory": 4096,
    "forge_preset": "flux",
}

# sd
option_payload_sd = {
    "sd_model_checkpoint": "majicmixRealistic_v7.safetensors",
    "forge_unet_storage_dtype": "Automatic",
    "forge_additional_modules": [],
    "forge_inference_memory": 10240,
    "forge_preset": "sd",
}
```
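For context, a minimal sketch of how these payloads would be sent, matching the "options, then txt2img immediately" sequence described above (the base URL and the generation parameters are assumptions, not from the original post):

```python
import requests

BASE_URL = "http://127.0.0.1:7860"  # assumed local Forge instance

# Apply the Flux settings, then generate right away, without refreshing the UI page in between.
resp = requests.post(f"{BASE_URL}/sdapi/v1/options", json=option_payload_flux)
resp.raise_for_status()

resp = requests.post(f"{BASE_URL}/sdapi/v1/txt2img", json={
    "prompt": "a photo of a cat",  # placeholder prompt
    "steps": 20,
    "width": 1024,
    "height": 1024,
})
resp.raise_for_status()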
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory
```python
# flux
option_payload_flux = {
    "forge_inference_memory": 4096
}

# sd
option_payload_sd = {
    "forge_inference_memory": 10240
}
```
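If the relation above holds, these two values leave different GPU Weights budgets per preset. As a worked example (the 24 GB figure is an assumption based on the A10 mentioned further down):

```python
# Worked example (assumption): A10 with 24 GB = 24576 MB of VRAM.
total_vram_mb = 24576
flux_gpu_weights_mb = total_vram_mb - 4096   # -> 20480 MB left for Flux weights
sd_gpu_weights_mb = total_vram_mb - 10240    # -> 14336 MB left for SD weights
```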
I'm struggling a bit to understand the logic of this... how much VRAM do you have?
Why would you want to change the GPU weights between model types?
Info on my 4070ti (12GB)
Never mind, just tested. A10 :)
I noticed that almost all "Options" are now available via API - see screenshot below. I am making successful API calls for Flux generation.

However, I am getting flooded with the "[Low GPU VRAM Warning]", which does not happen when I generate via the UI. I've scanned through all the available Options and txt2img payload params, and do not see the parameter that applies the "GPU Weights". The closest one I see is forge_inference_memory, but this seems to be different? After playing around with it I can't seem to resolve the debug flood.

Screenshot showing some Options for override_settings via API call:
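As a side note, a minimal sketch (the base URL is an assumption) for inspecting the live option values over the API, which can help compare what the UI's "GPU Weights" slider actually changes against forge_inference_memory:

```python
import requests

# Fetch the current options from a local Forge instance (URL is an assumption).
opts = requests.get("http://127.0.0.1:7860/sdapi/v1/options").json()

# Print the Forge memory-related settings mentioned in this thread.
for key in ("forge_inference_memory", "forge_preset", "forge_additional_modules"):
    print(key, "=", opts.get(key))
```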