Closed EchoHeadache closed 4 months ago
To address the distortion and artifacting issue when using large masks with adetailer, consider adjusting specific parameters in the ADetailerArgs class from the adetailer/args.py file. Here's a focused approach:
Dilate/Erode (ad_dilate_erode): Adjust this parameter to control the dilation and erosion of the mask. A negative value contracts the mask, which could mitigate the issue without manual erosion.
Mask Blur (ad_mask_blur): Increase this value to blur the edges of the mask, creating a smoother transition between the inpainted area and the original image, potentially reducing artifacts.
Inpaint Only Masked (ad_inpaint_only_masked) and Inpaint Padding (ad_inpaint_only_masked_padding): Ensure ad_inpaint_only_masked is True and adjust ad_inpaint_only_masked_padding to limit inpainting strictly to the masked area, reducing edge artifacts.
Manual Mask Size Specification: Use ad_use_inpaint_width_height, ad_inpaint_width, and ad_inpaint_height to control the inpaint area's resolution, avoiding the issue by keeping the mask within a manageable size.
Denoising Strength (ad_denoising_strength): Increase this parameter to smooth out artifacts, especially in complex inpainting scenarios.
Experiment with these settings to find the optimal balance for your specific use case. Adjusting these parameters can help maintain the integrity of the inpainted area while avoiding artifacts. For more detailed adjustments, refer to the adetailer/args.py file in the adetailer repository.
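As a rough illustration, the suggestions above could be collected into a set of overrides. The field names are the ones listed in this comment and the baseline values come from the report; the helper name and the adjusted values (erode by 8, blur of 8) are arbitrary starting points assumed for this sketch, not recommendations from the extension.

```python
# Baseline values as reported in the issue ("Steps to reproduce").
baseline = {
    "ad_dilate_erode": 4,
    "ad_mask_blur": 4,
    "ad_denoising_strength": 0.4,
    "ad_inpaint_only_masked": True,
    "ad_inpaint_only_masked_padding": 32,
}


def apply_suggestions(settings, erode_by=8, blur=8):
    """Return a copy of settings with a contracted mask and softer mask edges.

    apply_suggestions is a hypothetical helper; erode_by and blur are
    illustrative defaults (assumptions), not tuned values.
    """
    adjusted = dict(settings)
    # A negative dilate/erode value contracts the mask.
    adjusted["ad_dilate_erode"] = -abs(erode_by)
    # Never lower the blur below what was already configured.
    adjusted["ad_mask_blur"] = max(settings["ad_mask_blur"], blur)
    adjusted["ad_inpaint_only_masked"] = True
    return adjusted


adjusted = apply_suggestions(baseline)
```

The original dictionary is left untouched, so the baseline and the adjusted run can be compared side by side.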
Thank you for your report. I can't test with higher-resolution images, but I would expect the same behavior to occur with plain inpainting as well. Can you test this?
I believe I tested correctly, but let me know if you want me to do anything else. I tested inpainting on a 1024x1024 image, no ADetailer used, with what I believe are the same values I would have used on an ADetailer:
Prompt: cinematic side view photograph of a young black woman
Steps: 30, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 7, Seed: 411707666, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, VAE hash: 235745af8d, VAE: sdxl_vae.safetensors, Denoising strength: 0.4, Hypertile U-Net: True, Mask blur: 4, Inpaint area: Only masked, Masked area padding: 32, Version: v1.9.3
Note: In the screenshot above, I have a VAE specified. When I first tried, I had "automatic" selected and I received an error in Inpaint: NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check
After manually selecting the SDXL VAE, this error went away. To see if specifying the VAE was the issue, I went back to ADetailer for txt2img, checked the box to specify a VAE and selected SDXL VAE, but this still produced the same issues. I enabled ADetailer in the inpaint tab of img2img, and it also reproduces the issue there (with the VAE specified):
Since this symptom isn't widely reported (I found only the title of one deleted Reddit post about it), I assume my local configuration is a factor or the culprit. Here are my launch args:
COMMANDLINE_ARGS="--xformers --data-dir data --theme dark"
I'm using an RTX 4090, Linux Mint (Ubuntu Jammy Jellyfish), Python 3.10.12, PyTorch 2.1.2+cu121 (in venv), nvidia-driver-550 (550.54.15-0ubuntu1).
Again, this happened in a Win10 environment as well.
I've also run into this problem. Lately I've been inverting masks, and that feature almost always inpaints an area the size of the full image. It's easy to reproduce this issue using "merge and invert." I tried tweaking all the obvious things first, but I consistently see this problem across all models, samplers, LoRAs and basic settings like CFG scale, samples, etc.
My workaround has been to manually set the mask size to the image size, which usually minimizes this strange effect. That's possible because the mask size is known when inverting; it's the full image. It seems like once the mask is scaled up enough, the image is so granular that the pixels show up as a noisy texture in the final result. If that is true, then it seems like the mask size should be dynamic to prevent this issue.
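A minimal sketch of that workaround, assuming the ad_use_inpaint_width_height, ad_inpaint_width and ad_inpaint_height fields mentioned earlier in this thread are the ones behind the manual mask size setting. The helper name is hypothetical.

```python
def full_image_inpaint_overrides(image_width, image_height):
    """Pin the inpaint resolution to the full image size.

    Hypothetical helper for illustration; the field names are taken from
    the parameter list earlier in this thread.
    """
    return {
        "ad_use_inpaint_width_height": True,
        "ad_inpaint_width": image_width,
        "ad_inpaint_height": image_height,
    }


# For a 1024x1024 generation with an inverted (full-image) mask:
overrides = full_image_inpaint_overrides(1024, 1024)
```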
Could you check whether this problem still occurs in the latest version?
I actually just tried this with the latest version as you suggested, and could not reproduce the bug! I will continue to test different scenarios, but at the moment txt2img with a simple prompt and basically default ADetailer settings (aside from the inverted mask, of course) no longer produces this problem. Thanks for all your great work on this extension, sir.
Describe the bug
Generations become distorted with artifacting, similar to a too-high CFG, when the mask used is close to the maximum resolution along at least one axis (width or height).
^ with adetailer enabled, dilation +4
^ with adetailer enabled, erosion -40
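To make the trigger condition concrete, here is a small sketch that flags a detection whose bounding box spans nearly the full width or height of the image. The report does not define "close", so the 5% tolerance is an assumption, as are the function name and the (x1, y1, x2, y2) box layout.

```python
def spans_full_axis(bbox, image_size, tolerance=0.05):
    """Return True if bbox covers at least (1 - tolerance) of the image
    width or of the image height.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    The tolerance is an assumed threshold, not a value from the report.
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    covers_width = (x2 - x1) >= (1 - tolerance) * width
    covers_height = (y2 - y1) >= (1 - tolerance) * height
    return covers_width or covers_height


# A closeup whose person mask runs the full height of a 1024x1024 image.
closeup = spans_full_axis((100, 0, 900, 1024), (1024, 1024))
# A small detection in the middle of the frame.
small = spans_full_axis((300, 300, 700, 700), (1024, 1024))
```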
Steps to reproduce
/home/headache/stable-diffusion-data/outputs/txt2img-images/2024-05-09/00731-933373666.txt
SDXL1.0 model, SDXL VAE
Prompt: (closeup:0.9) cinematic photograph of a young woman
Steps: 20
Sampler: DPM++ 2M (Karras)
Resolution: 1024x1024
CFG: 7
Seed: 933373666
ADetailer model: person_yolov8n-seg.pt
Standard ADetailer settings within automatic1111 extension - ADetailer confidence: 0.3, ADetailer dilate erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.4, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer version: 24.4.2
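For anyone reproducing this through the web UI's API instead of the browser, the settings above might translate into a txt2img request body roughly like the one below. This is a sketch under assumptions: the alwayson_scripts layout, the ad_model and ad_confidence field names, and the separate scheduler field are not stated in this report and should be checked against the installed web UI and ADetailer versions. The code only builds and prints the payload; it does not send it.

```python
import json


def build_txt2img_payload():
    """Build a request body mirroring the reported settings (hypothetical helper)."""
    # ADetailer settings from the report; field names are assumed to match
    # the extension's argument names.
    adetailer_args = {
        "ad_model": "person_yolov8n-seg.pt",
        "ad_confidence": 0.3,
        "ad_dilate_erode": 4,
        "ad_mask_blur": 4,
        "ad_denoising_strength": 0.4,
        "ad_inpaint_only_masked": True,
        "ad_inpaint_only_masked_padding": 32,
    }
    return {
        "prompt": "(closeup:0.9) cinematic photograph of a young woman",
        "steps": 20,
        "sampler_name": "DPM++ 2M",
        "scheduler": "Karras",  # assumed to be a separate field in recent versions
        "width": 1024,
        "height": 1024,
        "cfg_scale": 7,
        "seed": 933373666,
        # Assumed layout for passing extension arguments via the API.
        "alwayson_scripts": {"ADetailer": {"args": [adetailer_args]}},
    }


payload = build_txt2img_payload()
print(json.dumps(payload, indent=2))
```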
Screenshots
Edit: added above screenshot
Console logs, from start to end.
List of installed extensions
adetailer
canvas-zoom-and-pan
extra-options-section
hypertile
LDSR
Lora
mobile
postprocessing-for-training
prompt-bracket-checker
ScuNET
sd-webui-additional-networks
sd-webui-controlnet
soft-inpainting
stable-diffusion-webui-dataset-tag-editor
SwinIR