[Bug]: Web Interface and API use different defaults for img2img

allo- commented 3 months ago

Checklist

[X] The issue exists after disabling all extensions
[X] The issue exists on a clean installation of webui
[ ] The issue is caused by an extension, but I believe it is caused by a bug in the webui
[X] The issue exists in the current version of the webui
[X] The issue has not been reported before recently
[ ] The issue has been reported before but has not been fixed yet

What happened?

The API uses different defaults than the web interface for some settings. Probably the most surprising one is the img2img setting inpainting_fill, which defaults to 0 (fill) in the API but to 1 (original) in the inpaint web interface.

The API also sets inpaint_full_res=True, which I think corresponds to "only masked" when a mask is given, whereas the web interface defaults to "full image".
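Until the defaults line up, both settings can be pinned explicitly in the request. A minimal sketch, assuming the values reported in this issue (inpainting_fill=1 for original, and inpaint_full_res=False for the whole image); the helper name and the sample prompt are made up for illustration:

```python
# Values the web UI reportedly uses for inpainting, per this issue.
# These are assumptions taken from the report, not a documented contract.
UI_LIKE_INPAINT_DEFAULTS = {
    "inpainting_fill": 1,       # 1 = original (the API default is 0 = fill)
    "inpaint_full_res": False,  # False = whole image, True = "only masked"
}


def with_ui_like_inpaint_defaults(payload):
    """Return a copy of payload with the UI-like values filled in where missing."""
    merged = dict(UI_LIKE_INPAINT_DEFAULTS)
    merged.update(payload)  # explicit caller values always win
    return merged


payload = with_ui_like_inpaint_defaults({"prompt": "a red door", "inpainting_fill": 2})
print(payload)
```

Keys the caller sets explicitly are left alone, so the helper only affects requests that would otherwise fall back to the API defaults.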

Steps to reproduce the problem

Use API endpoints with defaults

What should have happened?

It should behave like the web interface with defaults.

Also, there is a difference between user defaults and internal defaults. ui-config.json can change the UI defaults for a particular install, and it persists across versions, so if we change an internal default in a new version, the user defaults are not updated. If we made the API follow the user defaults specified in ui-config.json, then sending your payload to someone with different defaults would give different results, which would be troublesome. On the other hand, if we made it follow the internal defaults, you would be right back at the same issue as now: it would appear that the API does not follow the same defaults as the UI.
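To see which user defaults a particular install carries, ui-config.json can be read directly. A rough sketch, assuming the file is a flat JSON object whose keys look like `<tab>/<label>/value`; the keys in the sample below are made up for illustration and may not match a real install:

```python
import json
import os
import tempfile


def user_defaults(path, tab="img2img"):
    """Collect '<tab>/<label>/value' entries from a ui-config.json style file."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    prefix, suffix = tab + "/", "/value"
    return {
        key[len(prefix):-len(suffix)]: value
        for key, value in config.items()
        if key.startswith(prefix) and key.endswith(suffix)
    }


# Illustrative file contents only; real keys depend on the webui version.
sample = {
    "img2img/Masked content/value": "original",
    "img2img/Masked content/visible": True,
    "txt2img/Sampling steps/value": 20,
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "ui-config.json")
    with open(sample_path, "w", encoding="utf-8") as fh:
        json.dump(sample, fh)
    defaults = user_defaults(sample_path)

print(defaults)
```

The result is keyed by UI label, not by API parameter name, so mapping it onto a payload still has to be done by hand.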

From my viewpoint, someone who is using the API properly should fill in all the options and not rely on the defaults, because if a default changes in the future it will cause issues.

If you are having trouble filling in all the payload options, use the tool sd-webui-api-payload-display by huchenlei. It essentially converts the UI input to the API payload, with a few exceptions: base64 image inputs, for example, are placeholders. In most cases you can just use the generated payload in the API and you will get exactly the same output as from the UI.
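Filling in the image placeholders by hand is short. A minimal sketch, assuming the payload is meant for the /sdapi/v1/img2img endpoint and that images go into init_images as base64 strings; the copied payload, the placeholder text, and the helper name are illustrative:

```python
import base64


def fill_init_images(payload, image_bytes_list):
    """Return a copy of payload whose init_images are base64-encoded images."""
    filled = dict(payload)
    filled["init_images"] = [
        base64.b64encode(data).decode("ascii") for data in image_bytes_list
    ]
    return filled


# Stand-in for a payload copied from the UI; the placeholder text is made up.
copied = {"prompt": "a red door", "inpainting_fill": 1, "init_images": ["<base64 image>"]}
request_body = fill_init_images(copied, [b"\x89PNG fake bytes"])
print(request_body["init_images"][0])

# Sending it would then look like this (assumes a local webui started with --api):
# import requests
# requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=request_body)
```

In real use the bytes would come from reading the image file in binary mode.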

Some of the API payload keys are named after the internal names and not the front-end webUI names, like inpaint_full_res. This is a bit inconvenient and confusing, but since it's already there I think it's better not to change it. I think the solution would be better documentation (unfortunately that requires people to work on it, and most people I know don't like writing documentation).

Maybe some changes can be made if we ever have a V2 API, but while we are still on V1 we should probably not make these sorts of changes.

allo- commented 3 months ago

Thanks for the link to the tool, I think it will be useful. I once saw someone extract values from the gradio UI, I think to use them in a Qt app. This could also help with setting the correct API values.

I understand your point of view about changing existing APIs and have also thought about such things, but some defaults are rather unexpected and, as far as I can see, most settings are undocumented, which means you often have to look in the source code to find out what the values do. For a v2 API it would be very useful if it were easier to match UI elements to API parameters. Maybe there could also be an expert setting that adds tooltips with the corresponding API names to UI elements, or similar hints, since even the HTML source doesn't seem to contain the names of what the elements control.

I had problems with the inpaint options some time ago because the defaults are not like in the UI and I think it is undocumented what the possible values of the parameters are, so I had to experiment and read the source to inpaint without artifacts. I would also think that using img2img without a mask would never be useful when using anything other than original content for the fill.

My latest unexpected problem was the one described in the control net issue, where the defaults caused a blocky pattern when using the tile control, and it was unclear what was causing it. The relevant parameters do not even seem to be visible in the UI, but without changing them from their default value (64) to 0, the results are broken.

This is the relevant code for anyone with similar problems:

Controlnet:

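import webuiapi
# tile_module and tile_model hold the names of the installed tile preprocessor and model
tile_controlnet = webuiapi.ControlNetUnit(module=tile_module, model=tile_model,
    threshold_a=0, threshold_b=0)  # the API default of 64 causes blocky artifacts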

I haven't found documentation for threshold_a and threshold_b and they default to 64. I think they are generic parameters used for different purposes in different control nets. The tile control net causes blocky artifacts when the thresholds are set to 64 (API default) and the blocks get larger for larger values. Using the tile resolution as a value did not give good images, but 0 seems to give the same result as in the webui.
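When calling the HTTP API directly instead of going through webuiapi, the same override can go in the request body. A sketch, assuming the ControlNet extension reads its units from `alwayson_scripts` → `controlnet` → `args` and accepts the same threshold_a and threshold_b keys; the helper names and placeholder strings are made up:

```python
def controlnet_tile_unit(module, model):
    """Build one ControlNet unit dict with both thresholds forced to 0."""
    return {
        "module": module,
        "model": model,
        "threshold_a": 0,  # API default is reportedly 64, which gave blocky tiles
        "threshold_b": 0,
    }


def attach_controlnet(payload, units):
    """Return a copy of payload with the units placed under alwayson_scripts."""
    body = dict(payload)
    scripts = dict(body.get("alwayson_scripts", {}))
    scripts["controlnet"] = {"args": list(units)}
    body["alwayson_scripts"] = scripts
    return body


# Module and model names are placeholders; use the ones installed locally.
unit = controlnet_tile_unit("<tile module>", "<tile model>")
body = attach_controlnet({"prompt": "a red door"}, [unit])
print(body)
```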

img2img with sd_upscale:

api = webuiapi.WebUIApi()  # assumes a local webui started with --api on the default port
upscale_factor = 2.0
tile_size_x = 512
tile_size_y = 512
results = api.img2img(
    images=[input_image],
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=tile_size_x,
    height=tile_size_y,
    inpainting_fill=1,  # Use the original content
    controlnet_units=[tile_controlnet],  # expects a list of units
    script_name="SD upscale",
    script_args=[None, 64, 1, upscale_factor],  # [_, overlap, upscaler_index, scale_factor]
)

The script_args can be found in the sd_upscale.py script: `def run(p, _, overlap, upscaler_index, scale_factor)`

p is internal
overlap and scale_factor do what one would expect
upscaler_index uses modules.shared.sd_upscalers which is generated at runtime. I assume the order to be the same as in the webui and indices to start with 0.
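Rather than hard-coding the index, it can be looked up by name. A sketch, assuming the list returned by GET /sdapi/v1/upscalers is in the same order as modules.shared.sd_upscalers (not verified here); the helper names and the sample list are made up:

```python
def upscaler_index(upscalers, name):
    """Return the 0-based position of the upscaler called name."""
    for index, entry in enumerate(upscalers):
        if entry.get("name") == name:
            return index
    raise ValueError(f"upscaler not found: {name!r}")


def sd_upscale_args(upscalers, name, overlap=64, scale_factor=2.0):
    """Build script_args for the "SD upscale" script: [_, overlap, index, scale]."""
    return [None, overlap, upscaler_index(upscalers, name), scale_factor]


# Made-up sample in the shape of the /sdapi/v1/upscalers response.
sample_upscalers = [{"name": "None"}, {"name": "Lanczos"}, {"name": "Nearest"}]
args = sd_upscale_args(sample_upscalers, "Lanczos")
print(args)
```

Failing loudly on an unknown name avoids silently upscaling with whatever happens to sit at a stale index.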

AUTOMATIC1111 / stable-diffusion-webui