Bing-su / adetailer

Auto detecting, masking and inpainting with detection model.
GNU Affero General Public License v3.0

[Bug]: Generations become distorted with artifacting when mask is large #607

Closed EchoHeadache closed 4 months ago

EchoHeadache commented 6 months ago

Describe the bug

Generations become distorted with artifacting, similar to a too-high CFG, when the mask is close to the maximum resolution on at least one axis (width or height).

00731-933373666 (with ADetailer enabled, dilation +4)
00732-933373666 (with ADetailer enabled, erosion -40)

Steps to reproduce

/home/headache/stable-diffusion-data/outputs/txt2img-images/2024-05-09/00731-933373666.txt

- Model: SDXL 1.0, SDXL VAE
- Prompt: (closeup:0.9) cinematic photograph of a young woman
- Steps: 20, Sampler: DPM++ 2M (Karras), Resolution: 1024x1024, CFG: 7, Seed: 933373666
- ADetailer model: person_yolov8n-seg.pt
- Standard ADetailer settings within the automatic1111 extension: ADetailer confidence: 0.3, ADetailer dilate erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.4, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer version: 24.4.2
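For reference, the same settings can be driven programmatically. The sketch below is an illustration only: the `/sdapi/v1/txt2img` endpoint, the `alwayson_scripts` payload shape, and the exact `ad_*` field names are assumptions based on the web UI and ADetailer documentation, not something taken from this report.

```python
# Rough repro sketch (illustrative, not from the report): sends the reported
# settings through the A1111 web UI API with ADetailer enabled.
import requests

payload = {
    "prompt": "(closeup:0.9) cinematic photograph of a young woman",
    "steps": 20,
    # Sampler/scheduler naming varies between web UI versions; adjust as needed.
    "sampler_name": "DPM++ 2M Karras",
    "width": 1024,
    "height": 1024,
    "cfg_scale": 7,
    "seed": 933373666,
    "alwayson_scripts": {
        "ADetailer": {
            "args": [
                {
                    "ad_model": "person_yolov8n-seg.pt",
                    "ad_confidence": 0.3,
                    "ad_dilate_erode": 4,
                    "ad_mask_blur": 4,
                    "ad_denoising_strength": 0.4,
                    "ad_inpaint_only_masked": True,
                    "ad_inpaint_only_masked_padding": 32,
                }
            ]
        }
    },
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()
print(len(resp.json()["images"]), "image(s) returned")
```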

Screenshots

Untitled

Edit: added the screenshot above

Console logs, from start to end.

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################

################################################################
Running on headache user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
glibc version is 2.35
Check TCMalloc: libtcmalloc_minimal.so.4
libtcmalloc_minimal.so.4 is linked with libc.so,execute LD_PRELOAD=/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Version: v1.9.3
Commit hash: 1c0a0c4c26f78c32095ebc7f8af82f5c04fca8c0
Launching Web UI with arguments: --xformers --data-dir data --theme dark
[-] ADetailer initialized. version: 24.4.2, num models: 10
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
ControlNet preprocessor location: /home/headache/stable-diffusion-data/extensions/sd-webui-controlnet/annotator/downloads
2024-05-09 23:08:53,460 - ControlNet - INFO - ControlNet v1.1.445
2024-05-09 23:08:53,612 - ControlNet - INFO - ControlNet v1.1.445
Loading weights [31e35c80fc] from /home/headache/stable-diffusion-webui/data/models/Stable-diffusion/SDXL/sd_xl_base_1.0.safetensors
2024-05-09 23:08:54,006 - ControlNet - INFO - ControlNet UI callback registered.
/home/headache/stable-diffusion-webui/data/extensions/sd-webui-additional-networks/scripts/metadata_editor.py:399: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  with gr.Row().style(equal_height=False):
/home/headache/stable-diffusion-webui/data/extensions/sd-webui-additional-networks/scripts/metadata_editor.py:521: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  cover_image = gr.Image(
/home/headache/stable-diffusion-webui/data/extensions/stable-diffusion-webui-dataset-tag-editor/scripts/main.py:218: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  with gr.Row().style(equal_height=False):
/home/headache/stable-diffusion-webui/data/extensions/stable-diffusion-webui-dataset-tag-editor/scripts/tag_editor_ui/block_dataset_gallery.py:25: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  self.gl_dataset_images = gr.Gallery(label='Dataset Images', elem_id="dataset_tag_editor_dataset_gallery").style(grid=image_columns)
/home/headache/stable-diffusion-webui/data/extensions/stable-diffusion-webui-dataset-tag-editor/scripts/tag_editor_ui/block_dataset_gallery.py:25: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  self.gl_dataset_images = gr.Gallery(label='Dataset Images', elem_id="dataset_tag_editor_dataset_gallery").style(grid=image_columns)
/home/headache/stable-diffusion-webui/data/extensions/stable-diffusion-webui-dataset-tag-editor/scripts/tag_editor_ui/tab_filter_by_selection.py:35: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  self.gl_filter_images = gr.Gallery(label='Filter Images', elem_id="dataset_tag_editor_filter_gallery").style(grid=image_columns)
/home/headache/stable-diffusion-webui/data/extensions/stable-diffusion-webui-dataset-tag-editor/scripts/tag_editor_ui/tab_filter_by_selection.py:35: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  self.gl_filter_images = gr.Gallery(label='Filter Images', elem_id="dataset_tag_editor_filter_gallery").style(grid=image_columns)
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Creating model from config: /home/headache/stable-diffusion-webui/repositories/generative-models/configs/inference/sd_xl_base.yaml
Startup time: 10.5s (prepare environment: 1.7s, import torch: 3.4s, import gradio: 0.7s, setup paths: 1.3s, initialize shared: 0.1s, other imports: 0.5s, load scripts: 1.5s, create ui: 0.8s, gradio launch: 0.4s).
Applying attention optimization: xformers... done.
Model loaded in 5.4s (load weights from disk: 1.0s, create model: 2.4s, apply weights to model: 1.7s).
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  6.24it/s]
==========================================================================================
A tensor with all NaNs was produced in VAE.
Web UI will now convert VAE into 32-bit float and retry.
To disable this behavior, disable the 'Automatically revert VAE to 32-bit floats' setting.
To always start with 32-bit VAE, use --no-half-vae commandline flag.
==========================================================================================

0: 640x640 1 person, 5.8ms
Speed: 3.4ms preprocess, 5.8ms inference, 21.6ms postprocess per image at shape (1, 3, 640, 640)
100%|█████████████████████████████████████████████| 9/9 [00:01<00:00,  5.37it/s]
Total progress: 100%|███████████████████████████| 20/20 [00:06<00:00,  3.12it/s]
Total progress: 100%|███████████████████████████| 20/20 [00:06<00:00,  6.93it/s]

List of installed extensions

adetailer

canvas-zoom-and-pan, extra-options-section, hypertile, LDSR, Lora, mobile, postprocessing-for-training, prompt-bracket-checker, ScuNET, sd-webui-additional-networks, sd-webui-controlnet, soft-inpainting, stable-diffusion-webui-dataset-tag-editor, SwinIR

dosubot[bot] commented 6 months ago

To address the distortion and artifacting issue when using large masks with adetailer, consider adjusting specific parameters in the ADetailerArgs class from the adetailer/args.py file. Here's a focused approach:

  1. Dilate/Erode (ad_dilate_erode): Adjust this parameter to control the dilation and erosion of the mask. A negative value contracts the mask, which could mitigate the issue without manual erosion.

  2. Mask Blur (ad_mask_blur): Increase this value to blur the edges of the mask, creating a smoother transition between the inpainted area and the original image, potentially reducing artifacts.

  3. Inpaint Only Masked (ad_inpaint_only_masked) and Inpaint Padding (ad_inpaint_only_masked_padding): Ensure ad_inpaint_only_masked is True and adjust ad_inpaint_only_masked_padding to limit inpainting strictly to the masked area, reducing edge artifacts.

  4. Manual Mask Size Specification: Use ad_use_inpaint_width_height, ad_inpaint_width, and ad_inpaint_height to control the inpaint area's resolution, avoiding the issue by keeping the mask within a manageable size.

  5. Denoising Strength (ad_denoising_strength): Increase this parameter to smooth out artifacts, especially in complex inpainting scenarios.

Experiment with these settings to find the optimal balance for your specific use case. Adjusting these parameters can help maintain the integrity of the inpainted area while avoiding artifacts. For more detailed adjustments, refer to the adetailer/args.py file in the adetailer repository.
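As a minimal sketch of what adjusting these fields might look like, assuming ADetailerArgs in adetailer/args.py is a pydantic-style model exposing the `ad_*` fields named above (the exact field set and defaults should be verified against that file):

```python
# Illustrative only: assumes ADetailerArgs accepts keyword arguments whose names
# match the ad_* parameters listed above; verify against adetailer/args.py.
from adetailer.args import ADetailerArgs

args = ADetailerArgs(
    ad_model="person_yolov8n-seg.pt",
    ad_dilate_erode=-8,                 # contract the mask instead of dilating it
    ad_mask_blur=8,                     # softer edge between inpainted and original pixels
    ad_inpaint_only_masked=True,
    ad_inpaint_only_masked_padding=32,
    ad_use_inpaint_width_height=True,   # pin the inpaint resolution manually...
    ad_inpaint_width=1024,              # ...so a near-full-frame mask isn't upscaled
    ad_inpaint_height=1024,
    ad_denoising_strength=0.4,
)
print(args)
```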

#### Details

**Sources**
- [adetailer/args.py](https://github.com/Bing-su/adetailer/blob/main/adetailer/args.py)
- [adetailer/ui.py](https://github.com/Bing-su/adetailer/blob/main/adetailer/ui.py)
- [controlnet_ext/controlnet_ext.py](https://github.com/Bing-su/adetailer/blob/main/controlnet_ext/controlnet_ext.py)
- [scripts/!adetailer.py](https://github.com/Bing-su/adetailer/blob/main/scripts/!adetailer.py)
- [adetailer/mask.py](https://github.com/Bing-su/adetailer/blob/main/adetailer/mask.py)

**About Dosu**

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

Bing-su commented 5 months ago

Thank you for your report. I can't test with higher-resolution images, but I would expect the behavior to occur with plain inpainting as well. Can you test this?

EchoHeadache commented 5 months ago

I believe I tested correctly, but let me know if you want me to do anything else. I tested inpainting on a 1024x1024 image, with no ADetailer used, using what I believe are the same values I would have used with ADetailer: Screenshot from 2024-05-16 22-44-11

Prompt: cinematic side view photograph of a young black woman, Steps: 30, Sampler: DPM++ 2M, Schedule type: Karras, CFG scale: 7, Seed: 411707666, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, VAE hash: 235745af8d, VAE: sdxl_vae.safetensors, Denoising strength: 0.4, Hypertile U-Net: True, Mask blur: 4, Inpaint area: Only masked, Masked area padding: 32, Version: v1.9.3

Note: In the screenshot above, I have a VAE specified. When I first tried, I had "Automatic" selected and I received an error in Inpaint:

NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check

After manually selecting the SDXL VAE, this error went away. To check whether specifying the VAE was the issue, I went back to ADetailer for txt2img, checked the box to specify a VAE, and selected the SDXL VAE, but this still produced the same issues. I also enabled ADetailer in the inpaint tab of img2img, and it reproduces the issue there as well (with the VAE specified):

Screenshot from 2024-05-16 22-47-47

Since this symptom isn't widely reported (I found only the title of a deleted Reddit post about it), I assume local configuration is a factor or the culprit. Here are my launch args: COMMANDLINE_ARGS="--xformers --data-dir data --theme dark"

I'm using an RTX 4090, Linux Mint (Ubuntu Jammy Jellyfish), Python 3.10.12, PyTorch 2.1.2+cu121 (in venv), and nvidia-driver-550 (550.54.15-0ubuntu1).

Again, this happened in a Win10 environment as well.

Shmeda commented 5 months ago

I've also discovered this problem. Lately I've been inverting masks, and that feature almost always inpaints an area the size of the full image. It's easy to reproduce this issue using "merge and invert." I tried tweaking all the obvious things first, but I consistently see this problem across all models, samplers, LoRAs, and basic settings like CFG scale, samples, etc.

image

My work-around has been to manually set the mask size to the image size, which usually minimizes this strange effect. That's possible because the mask size is known when inverting; it's the full image. It seems like once the mask is scaled up enough, the image becomes so granular that the pixels show up as a noisy texture in the final result. If that is true, then it seems like the mask size should be dynamic to prevent this issue.
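For what it's worth, the work-around described above corresponds to the "use separate width/height" fields from item 4 of the earlier comment; a hypothetical override might look like the snippet below (field names assumed from the `ad_*` parameters quoted earlier in this thread, same caveats as above).

```python
# Hypothetical override for the inverted-mask case: pin the inpaint resolution
# to the full image size so a near-full-frame mask is not rescaled.
workaround = {
    "ad_use_inpaint_width_height": True,
    "ad_inpaint_width": 1024,   # match the generated image width
    "ad_inpaint_height": 1024,  # match the generated image height
}
```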

Bing-su commented 5 months ago

Can you check whether this issue still occurs in the latest version?

Shmeda commented 5 months ago

> Can you check whether this issue still occurs in the latest version?

I actually just tried this with the latest version as you suggested, and could not reproduce the bug! I will continue to test different scenarios, but at the moment, txt2img with a simple prompt and basically default ADetailer settings (aside from the inverted mask, of course) no longer produces this problem. Thanks for all your great work on this extension, sir.