Closed Xijamk closed 11 months ago
I have encountered the same issue. When using the API to generate results with a fixed seed, the output differs from what is obtained through the web UI. The API's mask parameter was obtained by capturing network traffic using the browser's F12 developer tools, so I am certain that all the parameters are guaranteed to be 100% identical.
I have noticed that the results of the Controlnet preprocessor obtained through the API are inconsistent with those obtained through the web UI. This issue is likely the cause of the discrepancy.
Here are my parameters: { "pixel_perfect": false, "control_mode": 0, "module": "inpaint_only", "image": { "image": "...Image base64", "mask": "...Mask base64" }, "weight": 1, "model": "control_v11p_sd15_inpaint [ebff9138]", "enabled": false }
@AIGC404 You have to set enabled: true in your payload.
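A minimal sketch of a payload builder with the unit enabled, for anyone comparing against their own request. The field names are copied from the payloads posted in this thread; the helper names `build_controlnet_unit` and `build_txt2img_payload` are hypothetical, and the placeholder strings stand in for real base64 data.

```python
import json


def build_controlnet_unit(image_b64: str, mask_b64: str) -> dict:
    """Return one ControlNet unit using the field names seen in this thread."""
    return {
        "enabled": True,  # per the reply above, this must be true
        "module": "inpaint_only",
        "model": "control_v11p_sd15_inpaint [ebff9138]",
        "weight": 1,
        "control_mode": 0,
        "pixel_perfect": False,
        "image": {"image": image_b64, "mask": mask_b64},
    }


def build_txt2img_payload(prompt: str, image_b64: str, mask_b64: str) -> dict:
    """Wrap the unit in the alwayson_scripts structure used by the payloads here."""
    unit = build_controlnet_unit(image_b64, mask_b64)
    return {
        "prompt": prompt,
        "alwayson_scripts": {"controlnet": {"args": [unit]}},
    }


payload = build_txt2img_payload("closed eyes", "<image base64>", "<mask base64>")
print(json.dumps(payload, indent=2))
```

Printing the payload before sending it makes it easy to confirm that "enabled" really is true in the request body.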
@Xijamk We have verified that the API can completely reproduce the result from A1111. You can try using https://github.com/huchenlei/sd-webui-api-payload-display to dump the corresponding API payload from your A1111 runs.
Sorry, there seems to be an issue with the API demo parameters I provided earlier. However, even after changing the "enabled" parameter to true, I'm still experiencing the same problem.
I'm certain that there is an issue, so I kindly request you, expert, to test and verify it. Additionally, you mentioned using "sd-webui-api-payload-display" to retrieve the API parameters, but the returned values include some that cannot be directly used and have extra parameters.
Here's a screenshot of the parameters for ControlNet:
And here's a screenshot of the webUI configuration:
Here is the file with the parameters. api-demo.zip
@huchenlei help me
@huchenlei
If I try to copy-paste the api using that extension, I receive this error:
If I delete that, I keep receiving the same error about other fields (batch_images, input_mode, loopback, output_dir), and if I remove all of that, the API call completes, but the returned image is a random girl with closed eyes.
Am I doing something wrong? I'm on A1111 version 1.6.1 and ControlNet version 1.1.417.
@huchenlei, there seems to be an issue. I just tried the new examples provided at https://github.com/Mikubill/sd-webui-controlnet/pull/2317/files, and I am encountering the same problem I described in this thread. The system is returning the original image without any changes.
@Xijamk Please update your ControlNet to latest version (1.1.422). The latest version will ignore unrecognized params.
@huchenlei, I've updated and am no longer receiving the unrecognized parameters errors. However, I'm still facing the same issue with inpainting; nothing changes. I've attached the exact payload that I'm using. Could you try it yourself to rule out environment-related problems? Currently, I have only the ControlNet and payload extensions enabled. PayloadExample.txt
@AIGC404 have you had any luck?
@Xijamk I think you have a problem with your mask image. Every pixel in the mask image has to be either rgb(0, 0, 0) or rgb(255, 255, 255).
I used the input image/input mask in my example with your payload and there is no problem doing the inpaint.
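One way to check the black-or-white requirement locally is sketched below with Pillow. The function names and the threshold of 128 are assumptions for illustration, not part of ControlNet or the example script.

```python
from PIL import Image


def is_strictly_black_and_white(mask: Image.Image) -> bool:
    """True if every pixel is exactly rgb(0, 0, 0) or rgb(255, 255, 255)."""
    # getcolors returns None when the image has more than maxcolors colors.
    colors = mask.convert("RGB").getcolors(maxcolors=2)
    if colors is None:
        return False
    allowed = {(0, 0, 0), (255, 255, 255)}
    return all(color in allowed for _, color in colors)


def binarize_mask(mask: Image.Image, threshold: int = 128) -> Image.Image:
    """Force every pixel to pure black or pure white and return an RGB image."""
    gray = mask.convert("L")
    black_white = gray.point(lambda value: 255 if value >= threshold else 0)
    return black_white.convert("RGB")


# Demo on a small synthetic mask with a mid-gray and a light-gray pixel.
demo = Image.new("L", (4, 4), 100)
demo.putpixel((0, 0), 200)
print(is_strictly_black_and_white(demo))
print(is_strictly_black_and_white(binarize_mask(demo)))
```

Anti-aliased brush edges are a common source of gray pixels, so running a check like this before encoding the mask can rule that cause out.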
I don't think so. I've tried using your images as well, but with no luck. Here is the payload featuring your images, and yet, there's still no change. It seems there might be something conflicting with the inpainting process via the API. PayloadExampleHuchenleiImages.txt
You can also try to update your A1111 to latest version. I am testing under A1111 (f92d6149).
I've updated A1111 to the latest version and the problem persists: the mask is not being taken into account via the API.
@huchenlei I've done a fresh install of A1111 and ControlNet on a new PC, but it's not working there either.
In an effort to shed some light on the problem, I've made a video demonstrating the problem step by step:
Video: https://drive.google.com/file/d/1zFQGgEgFVg2b3VqAQubauTDe2MxKjFg4/view Payload: PayloadExampleHuchenleiImages.txt
@Xijamk I think your problem is using base64guru.
import cv2
import base64
guru_base64 = "iVBORw0KGgoAAAANSUhEUgAAAgAAAAMAAQMAAABowU0NAAAAAXNSR0IB2cksfwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAZQTFRFAAAA////pdmf3QAAAIdJREFUeJztzDENAAAIA7D5Nw0idpCQVkATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAN6YkkAgEAgEAoFAIBAIBAKBQCAQCAQCgUAgEAgEAoFAIBAIBAKBQCAQCAQCgUAgEAgEAoFAIBAIBAKBQCAQCAQCgUAgEAgEAoFAIBAIBAKBQCA4CRYO9apHKdWG5QAAAABJRU5ErkJggg=="
def read_image(img_path: str) -> str:
    # Load the image with OpenCV and re-encode it as PNG before base64 encoding.
    img = cv2.imread(img_path)
    _, buffer = cv2.imencode(".png", img)
    encoded_image = base64.b64encode(buffer).decode("utf-8")
    return encoded_image
mask_image = read_image("mask.png")
print(len(guru_base64))
print(len(mask_image))
I compared the length of the base64guru string with the length of the base64 string produced by the example code. Here is the console output:
> python .\compare.py
328
3980
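The length difference alone does not say what differs between the two encodings. A small standard-library sketch like the one below reads the PNG header (the IHDR chunk) straight from a base64 string and reports the bit depth and color type. The function name `describe_png` is hypothetical; `GURU_HEADER` is the first 44 characters of the `guru_base64` string above, which is enough to cover the header.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",
    4: "grayscale+alpha",
    6: "RGBA",
}


def describe_png(png_b64: str) -> dict:
    """Decode a base64 PNG and return the fields stored in its IHDR chunk."""
    data = base64.b64decode(png_b64)
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR payload: width, height (4 bytes each), then five 1-byte fields.
    width, height, bit_depth, color_type, _, _, _ = struct.unpack(
        ">IIBBBBB", data[16:29]
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": COLOR_TYPES.get(color_type, "unknown"),
    }


# First 44 base64 characters of guru_base64 (signature plus IHDR chunk).
GURU_HEADER = "iVBORw0KGgoAAAANSUhEUgAAAgAAAAMAAQMAAABowU0N"
print(describe_png(GURU_HEADER))
```

Running the same function on the OpenCV-encoded mask shows whether the two files differ in bit depth or color type rather than in pixel content.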
@huchenlei The base64 you submitted was obtained through a CV method, but traditional base64 encoding is usually generated using PIL. I believe you should also support the base64 encoding generated by PIL.
from io import BytesIO
from PIL import Image
import cv2
import base64
# Read an image using OpenCV and convert it to base64.
def cv_read_image(img_path: str) -> str:
    img = cv2.imread(img_path)
    _, buffer = cv2.imencode(".png", img)
    encoded_image = base64.b64encode(buffer).decode("utf-8")
    return encoded_image

# Read an image using PIL and convert it to base64.
def pil_read_image(img_path: str) -> str:
    # Convert the image to Base64 encoding
    image_buffer = BytesIO()
    Image.open(img_path).save(image_buffer, format='PNG')
    return base64.b64encode(image_buffer.getvalue()).decode('utf-8')

cv_mask_image = cv_read_image("mask.png")
pil_mask_image = pil_read_image("mask.png")
print(len(cv_mask_image))
print(len(pil_mask_image))
@Xijamk No, I haven't had much luck
I ran the example script api_inpaint.py located in example/inpaint_example, but the returned result is still problematic. The preprocessed image returned is the same as the original image. The version of the web UI I used is v1.6.0-2-g4afaaf8a, and the controlnet version is 1.1.422.
txt2img inpaint result:
img2img inpaint result:
I tried switching multiple versions of the web UI, but none of them had any effect. I have been very unlucky.
I'm also having an issue with the mask not being applied via the API.
I've tried both {"input_image": b64str, "mask": b64str} and {"image": {"image": b64str, "mask": b64str}}. In both tests the resulting images[0] and images[1] are identical.
Here's my JSON (without the encoded image, because that's too big), sent to Txt2Img.
{
"width": 1024,
"height": 768,
"prompt": "Baseball",
"negative_prompt": "",
"batch_size": 1,
"cfg_scale": 9,
"seed": -1,
"subseed": -1,
"subseed_strength": 0,
"enable_hr": false,
"alwayson_scripts": {
"controlnet": {
"args": [
{
"module": "inpaint_only+lama",
"model": "control_v11p_sd15_inpaint [ebff9138]",
"weight": 1,
"resize_mode": 2,
"lowvram": true,
"processor_res": 512,
"threshold_a": -2,
"threshold_b": -3,
"guidance_start": 0,
"guidance_end": 1,
"control_mode": 2,
"pixel_perfect": true,
"image": {
"image": "Replace with attached image",
"mask": "iVBORw0KGgoAAAANSUhEUgAABAAAAAMAAQMAAACAdIdOAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAADI0lEQVR4nO3dQW6jMBQGYKouuuwROApHg6NxlByhyyyiZKTRTGokT2cF/1P4vDWSP8l+zwYje3jsXNbhu4yd+mFvwFIJ8HlKwABQCfARANwBANKAGwAAAABAKUBiRQQAAAAAAAAAAAAAAAAAAAAAAAAAAFALEP9YDXBKQHzzGgAAAAAAAAAAAAAAAAAAoBbgHSAAeAAAAAAAtIA3AIAEYAEAAAAAAAAAAACoBOg1BvD6gBUAAAAAAAAAAAAAAACgBcwAAcAFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIA1YAQBaQKceAAAAAAAAAOD1AQsAQNP+GwAAAAAAAABAAjAAAFQCvAMABAB3AIBSgA8AAAAAAAAAAACAMwJuAAAt4BMAAAAAAAAAAAAAAAAAAAAgALgCALSAEQDgjIAvAIBSgAkAAAAAAAAAAAAgALgAAJQCzAAAAcAKANACeg8AAAAAAAAAAAAAAADsDVgAKgF6R1IBAAAAAAAAAAAAAAAA7A4YACoBeseUAgAAAAAAAAAA7A3478nxLw+IdwEAQBwQD8M4IN4FAABxQDwM44B4FwAAxAHxMASIjwEAgHgUAMTHAAAAQC1AfPs+8nf9BjCnAdPpAWMAsLSAxGG5G0DiSqlagMR1wxtA4ub3WoBeY4fmgTxgPhxwLwaYDgfcigHG8wGuxQCd+fjVAV/FAJ0FAcDOgAvAFtBZFAIAAAAAAAC8OiC+JAOIA67FACd8OY0D4l9I4oB7McB0OOBRDDAfD1hqAToP7A1YSwESWzaXNGCzIEhs220AiY3LaxoQv+13MxmMacAUAGwmgzkBaHNxr353wJoGNKkw8iNTm4l6aWB/QJMIemlgf0ATh2ME0MThlAF8h8GcATzDoBsEBwCe82E3CA4APEdhNwgOADyT8ZQC/B0E/doDALefhsARgD99MOYAt38H4TGA36E4RQE/FAAAAAAAAAAAAIBfXOtmb1ajPP0AAAAASUVORK5CYII="
}
}
]
}
},
"override_settings": {
"sd_model_checkpoint": "Mao's_mix_anime_V1.ckpt [5228b68555]",
"sd_vae": "vae-ft-mse-840000-ema-pruned.vae.pt"
},
"sampler_name": "DPM++ 2M SDE Karras",
"steps": 50,
"override_settings_restore_afterwards": false,
"n_iter": 1
}
And here's my test image
I know that ControlNet is getting the image, because Txt2Img would generate an entirely new "baseball" image if ControlNet was doing nothing.
@DrCyanide Did you have any success running https://github.com/Mikubill/sd-webui-controlnet/blob/main/example/inpaint_example/api_inpaint.py? What do the results look like?
Those results look correct. I'm going to have to see if I can spot the differences and find out why my JSON fails.
Looks like the difference is in the Base64 encoding for Monochrome image vs Color image. The Base64 mask in my JSON above was generated from a Monochromatic image (converting the transparency of a layer in Krita to black and the rest to white). Writing the Base64 mask out to a PNG seems to add the RGB values back, which can then be read in by the example script, causing it to act as expected.
I'll have to dig around and see if I can find a way to convert the Monochromatic mask back to Color before sending it to ControlNet.
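A possible way to do that conversion with Pillow is sketched below: decode the base64 mask, convert it to RGB, and re-encode it as PNG. The function name `mask_b64_to_rgb_b64` is hypothetical, and whether this resolves the ControlNet behaviour rests on the observation above rather than on a confirmed fix.

```python
import base64
from io import BytesIO

from PIL import Image


def mask_b64_to_rgb_b64(mask_b64: str) -> str:
    """Re-encode a base64 PNG mask (1-bit, palette, grayscale) as an RGB PNG."""
    mask = Image.open(BytesIO(base64.b64decode(mask_b64)))
    buffer = BytesIO()
    mask.convert("RGB").save(buffer, format="PNG")
    return base64.b64encode(buffer.getvalue()).decode("utf-8")


# Demo: build a 1-bit mask in memory, encode it, then convert it.
one_bit = Image.new("1", (8, 8), 1)
raw = BytesIO()
one_bit.save(raw, format="PNG")
mono_b64 = base64.b64encode(raw.getvalue()).decode("utf-8")

rgb_b64 = mask_b64_to_rgb_b64(mono_b64)
converted = Image.open(BytesIO(base64.b64decode(rgb_b64)))
print(converted.mode)
```

The pixel values are unchanged by the conversion; only the storage format moves from 1 bit per pixel to three 8-bit channels.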
I haven't updated my ControlNet for the test (still using v1.1.415), so if newer versions are more forgiving on this then that might be a good reason to update.
I leave a comment here. I hope it will be helpful for someone in the future.
So I faced a similar issue but with ComfyUI. Masking in UI works perfectly, but via API it returns the original image.
The problem lies in how the mask is attached to the image before it is sent to the API. In the original ComfyUI repo I found how masks are extracted from an image, and knowing that, I could set the mask properly.
I leave the code here.
import numpy as np
import streamlit as st
import torch
from PIL import Image

# original_image (an RGBA PIL image) and mask_canvas (a drawing canvas result) are defined elsewhere.
white_background = Image.new("RGBA", original_image.size, (255, 255, 255, 255))
# Create a mask from the canvas
mask_image = Image.fromarray(mask_canvas.image_data.astype(np.uint8))
# Convert images to tensors
original_tensor = torch.from_numpy(np.array(original_image)).permute(2, 0, 1) # Convert to CxHxW
mask_tensor = torch.from_numpy(np.array(mask_image)).permute(2, 0, 1)
# Create the output tensor
output_tensor = original_tensor.clone()
# Set alpha channel based on the mask
red_channel = mask_tensor[0] # Get the red channel from the mask
output_tensor[3] = torch.where(red_channel == 255, torch.tensor(0), original_tensor[3]) # Set alpha to 0 where the mask's red channel is 255
# Convert output tensor back to an image
output_image = Image.fromarray(output_tensor.permute(1, 2, 0).byte().numpy(), mode='RGBA')
st.image(output_image, caption='Image with Selected Area Black', use_column_width=True)
Is there an existing issue for this?
What happened?
I am attempting to use txt2img and ControlNet with an image and a mask, but the mask seems to have no effect. This is a change from my previous workflow, where I used img2img without ControlNet for inpainting. Now my goal is to use txt2img with ControlNet for the same purpose. Despite reviewing both resolved and open issues, and examining the payload through an extension, I am unable to diagnose the problem.
PLEASE HELP, I've been stuck on this for the last 2 months now
In the example below, I'm trying to close her eyes via API without success, I've already tried a thousand things;
Payload used in this example:
{
  "prompt": "(grainy cinematic photography:1.2) shot on Bessa R2A Cinestill photo of a beautiful young italian woman, wavy hair, cute smile, closed eyes",
  "negative_prompt": "",
  "sampler_name": "DPM++ 2M Karras",
  "batch_size": 1,
  "steps": 20,
  "cfg_scale": 7,
  "width": 512,
  "height": 768,
  "alwayson_scripts": {
    "controlnet": {
      "args": [
        {
          "model": "control_v11p_sd15_inpaint [ebff9138]",
          "module": "inpaint_only+lama",
          "resize_mode": 1,
          "control_mode": 0,
          "image": {"image": base64string, "mask": base64mask}
        }
      ]
    }
  }
}
Image:
Mask:
Results via UI:
The result via the API is exactly the original image.
Mask returned from the API (second image from the response)
Steps to reproduce the problem
What should have happened?
The response from the API should be the same as the result from the UI.
Commit where the problem happens
webui: txt2img controlnet: inpaint
What browsers do you use to access the UI ?
Google Chrome
Command Line Arguments
List of enabled extensions
Controlnet, Dynamic Prompts, Additional Networks, Adetailer, AnimateDiff
Console logs
Additional information
No response