[Closed] altoiddealer closed this issue 2 months ago.
Actually, I think I was hallucinating `forge_additional_modules` working via `override_settings`. It's not. But the modules do get set when posting to the `/sdapi/v1/options` endpoint.
One other oddity I am going to lump in with this issue:

If I include a list of `forge_additional_modules` when using the `/sdapi/v1/options` endpoint, I get a CUDA OOM on the next API generation. If I instead send an empty list for `forge_additional_modules` via the same endpoint, but then select the same modules in the UI one by one, I do NOT get a CUDA OOM on the next API generation (but I do experience the issue in the OP).
Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory
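To make the relation concrete, a quick worked example on a 12 GB card (numbers are illustrative):

```python
total_vram_mb = 12288            # e.g. a 4070 Ti with 12 GB
forge_inference_memory = 4096    # MB kept free for inference
gpu_weights_mb = total_vram_mb - forge_inference_memory
print(gpu_weights_mb)            # 8192 MB left for model weights
```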
Thanks! When I made this issue, I thought that `override_settings` was working via API, but apparently not. I'll play around with `forge_inference_memory` tomorrow via the `/options` endpoint and see if I can work it out.
FYI, I tested several times and switched between Flux and SD. Everything seems to work well via `sdapi/v1/options` followed immediately by `sdapi/v1/txt2img`.

However, if I refresh (Ctrl/Cmd + F5) the web UI page before sending `sdapi/v1/txt2img`, to check if the parameters are set properly (confirming that `sdapi/v1/options` worked), and then send `sdapi/v1/txt2img`, an error occurs. It seems that the `refresh_memory_management_settings` function is triggered, resetting the environment variables to `{'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}`. This is followed by a `[Low VRAM Warning]`, and then an error indicating that checkpoint `FLUX/flux1-dev-Q4_0.gguf` (selected in `sdapi/v1/options`) was not found, resulting in a fallback to a default ckpt :(
```python
# sample
# flux
option_payload_flux = {
    "sd_model_checkpoint": "FLUX/flux1-dev-Q4_0.gguf",
    "forge_unet_storage_dtype": "Automatic (fp16 LoRA)",
    "forge_additional_modules": [
        "/data/stable-diffusion-webui-forge/models/VAE/ae.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/clip_l.safetensors",
        "/data/stable-diffusion-webui-forge/models/text_encoder/t5xxl_fp8_e4m3fn.safetensors",
    ],
    "forge_inference_memory": 4096,
    "forge_preset": "flux",
}

# sd
option_payload_sd = {
    "sd_model_checkpoint": "majicmixRealistic_v7.safetensors",
    "forge_unet_storage_dtype": "Automatic",
    "forge_additional_modules": [],
    "forge_inference_memory": 10240,
    "forge_preset": "sd",
}
```
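A hedged usage sketch for the payloads above (assumes Forge is running locally with `--api`; the txt2img parameters are illustrative, not the exact ones used in this test):

```python
import requests

base = "http://127.0.0.1:7860"

# Apply the Flux options, then generate immediately (no UI refresh in between)
requests.post(f"{base}/sdapi/v1/options", json=option_payload_flux).raise_for_status()

resp = requests.post(f"{base}/sdapi/v1/txt2img", json={
    "prompt": "a test image",
    "steps": 20,
    "cfg_scale": 1,
    "distilled_cfg_scale": 3.5,
})
resp.raise_for_status()
images = resp.json()["images"]  # list of base64-encoded images
```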
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory

```python
# flux
option_payload_flux = {
    "forge_inference_memory": 4096
}

# sd
option_payload_sd = {
    "forge_inference_memory": 10240
}
```
I'm struggling a bit to understand the logic of this... how much VRAM do you have? Why would you want to change the GPU weights between model types?

Info on my 4070 Ti (12 GB):
Never mind, just tested. A10 :)
> Seems like GPU Weights (MB) = ALL VRAM - forge_inference_memory

It turns out you are correct about this. When I am using the API I am also running an LLM. In Task Manager, I noted the difference between the used/available "Dedicated GPU memory", which in this case was approximately 4096 MB. It's working successfully for API now.
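A sketch of how that headroom could be computed programmatically instead of reading Task Manager (assumes PyTorch with CUDA is available; purely illustrative):

```python
import torch

free_bytes, total_bytes = torch.cuda.mem_get_info()    # current CUDA device
used_mb = (total_bytes - free_bytes) // (1024 * 1024)  # e.g. ~4096 with an LLM loaded
forge_inference_memory = max(1024, used_mb)            # keep at least the 1024 MB default
```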
> However, if I refresh (Ctrl/Cmd + F5) the web UI page before sending sdapi/v1/txt2img to check if the parameters are set properly (confirming that sdapi/v1/options worked), then send sdapi/v1/txt2img, an error occurs.

When I refresh the UI I am not getting this issue... One thing to note is that I am not sending a `forge_preset` option when making the API call. Maybe it has something to do with it? *shrugs*
@altoiddealer could you provide a full example of a t2i script showing how to call Flux, please? I only have a script for old SD. Or where can I find one?
heya - launch Forge with `--api --listen`. When you are looking at the UI, just add `/docs` to the end of the URL. Example: http://127.0.0.1:7860/docs

You can see all the API endpoints there, including txt2img. You can "Try it out" and it will show all the default values. I don't think any parameters are "required" for this endpoint, but the img2img one won't work without an input image (obviously).

As for Flux, you'd want to ensure that CFG is low, like `1.0`, and that the distilled CFG is correct...

You'll need to adjust values like text encoders, GPU weights, etc. either in the UI directly, or you'll post to `sd-models` / `options` to change models / options.

I have this PR which, once merged, will restore `override_settings`, which you can use in your txt2img / img2img payloads to temporarily assert settings/options:

https://github.com/lllyasviel/stable-diffusion-webui-forge/pull/2027
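A small sketch of the `sd-models` / `options` flow mentioned above (these endpoint paths are the standard webui API ones; which checkpoint to pick is up to you):

```python
import requests

base = "http://127.0.0.1:7860"

# List available checkpoints, then switch to one via /options
models = requests.get(f"{base}/sdapi/v1/sd-models").json()
print([m["title"] for m in models])

requests.post(f"{base}/sdapi/v1/options",
              json={"sd_model_checkpoint": models[0]["title"]}).raise_for_status()
```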
So for this example:

```json
{
  "prompt": " an example for flux dev",
  "negative_prompt": "",
  "styles": [
    "string"
  ],
  "seed": -1,
  "subseed": -1,
  "subseed_strength": 0,
  "seed_resize_from_h": -1,
  "seed_resize_from_w": -1,
  "sampler_name": "string",
  "scheduler": "string",
  "batch_size": 1,
  "n_iter": 1,
  "steps": 50,
  "cfg_scale": 7,
  "distilled_cfg_scale": 3.5,
  "width": 512,
  "height": 512,
  "restore_faces": true,
  "tiling": true,
  "do_not_save_samples": false,
  "do_not_save_grid": false,
  "eta": 0,
  "denoising_strength": 0,
  "s_min_uncond": 0,
  "s_churn": 0,
  "s_tmax": 0,
  "s_tmin": 0,
  "s_noise": 0,
  "override_settings": {},
  "override_settings_restore_afterwards": true,
  "refiner_checkpoint": "string",
  "refiner_switch_at": 0,
  "disable_extra_networks": false,
  "firstpass_image": "string",
  "comments": {},
  "enable_hr": false,
  "firstphase_width": 0,
  "firstphase_height": 0,
  "hr_scale": 2,
  "hr_upscaler": "string",
  "hr_second_pass_steps": 0,
  "hr_resize_x": 0,
  "hr_resize_y": 0,
  "hr_checkpoint_name": "string",
  "hr_sampler_name": "string",
  "hr_scheduler": "string",
  "hr_prompt": "",
  "hr_negative_prompt": "",
  "hr_cfg": 1,
  "hr_distilled_cfg": 3.5,
  "force_task_id": "string",
  "sampler_index": "Euler",
  "script_name": "string",
  "script_args": [],
  "send_images": true,
  "save_images": false,
  "alwayson_scripts": {},
  "infotext": "string"
}
```
This one would work as long as the Flux model is already set up? Otherwise we would need the override settings to change the model? Which I found in this example:

So instead of `Anything-V3.0-pruned` we would insert `"flux1-dev-bnb-nf4-v2.safetensors"`?

I inserted info in the positive prompt, but left the negative empty. Is my payload ready to use for Flux? Can you verify, please?
Your `cfg_scale` value is 7, so it won't give good output for Flux.

You don't need to send any Flux model name etc. to the txt2img endpoint.

You may get an error if you send literally "string" for the sampler and scheduler. Try `Euler` and `Simple`, respectively.

You don't need to send all this info, only the parameters that you want changed from the defaults.
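Putting that advice together, a minimal payload sketch for Flux (only non-default parameters; the exact step count and resolution are illustrative):

```python
payload = {
    "prompt": "an example for flux dev",
    "steps": 20,
    "cfg_scale": 1,              # keep CFG low for Flux
    "distilled_cfg_scale": 3.5,
    "sampler_name": "Euler",
    "scheduler": "Simple",
    "width": 1024,
    "height": 1024,
}
# POST this to /sdapi/v1/txt2img; the model itself is set beforehand via /sdapi/v1/options
```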
Edit

The solution (currently) is to use the `/sdapi/v1/options` endpoint, as mentioned in this comment below.

Also: I noticed that almost all "Options" are now available via API - see screenshot below. I am making successful API calls for Flux generation. However, I am getting flooded with the "`[Low GPU VRAM Warning]`", which is not happening when I generate via the UI.

I've scanned through all the available Options and txt2img payload params, and do not see the parameter that applies the "GPU Weights". The closest one I see is `forge_inference_memory`, but this seems to be different? After playing around with this I can't seem to resolve the debug flood.

Screenshot showing some Options for `override_settings` via API call: