lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

[Bug]: bad image quality with ControlNet InstantID #155

Open aartikov opened 9 months ago

aartikov commented 9 months ago

What happened?

An image created with ControlNet InstantID is blurry and oversaturated. An image generated with the same parameters in the original stable-diffusion-webui looks better.

Steps to reproduce the problem

Generate an image with the parameters:

  1. Model - Turbo Diffusion XL Turbo v1.12
  2. Clip skip - 2
  3. Positive prompt - frog hat, movie scene
  4. Negative prompt - lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature
  5. Sampling method - DPM++ SDE Karras
  6. Sampling Steps - 8
  7. Width x height - 1024 x 1024
  8. CFG Scale - 3
  9. Seed - 1309353952
  10. ControlNet Unit 0:
    • Image: photo_0.png
    • Pixel Perfect
    • Preprocessor: InsightFace(InstantID)
    • Model: ip-adapter_instant_id_sdxl [eb2d3ec0]
  11. ControlNet Unit 1:
    • Image: photo_1.png
    • Pixel Perfect
    • Preprocessor: instant_id_face_keypoints
    • Model: control_instant_id_sdxl [c5c25a50]
    • Control Mode: ControlNet is more important

The result is a blurry, oversaturated image:

00000-1309353952

frog hat, movie scene
Negative prompt: lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature
Steps: 8, Sampler: DPM++ SDE Karras, CFG scale: 3, Seed: 1309353952, Size: 1024x1024, Model hash: 1524a9699f, Model: turboDiffusionXL_v112, ControlNet 0: "Module: InsightFace (InstantID), Model: ip-adapter_instant_id_sdxl [eb2d3ec0], Weight: 1, Resize Mode: Crop and Resize, Processor Res: 0.5, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced, Hr Option: Both", ControlNet 1: "Module: instant_id_face_keypoints, Model: control_instant_id_sdxl [c5c25a50], Weight: 1, Resize Mode: Crop and Resize, Processor Res: 512, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: ControlNet is more important, Hr Option: Both", Version: f0.0.10-latest-87-gc06769c1

What should have happened?

Better quality is expected.

The image generated with exactly the same parameters in the original stable-diffusion-webui looks much better:

00057-1309353952

frog hat, movie scene
Negative prompt: lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature
Steps: 8, Sampler: DPM++ SDE Karras, CFG scale: 3, Seed: 1309353952, Size: 1024x1024, Model hash: 1524a9699f, Model: turboDiffusionXL_v112, Clip skip: 2, ControlNet 0: "Module: instant_id_face_embedding, Model: ip-adapter_instant_id_sdxl [eb2d3ec0], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced, Hr Option: Both, Save Detected Map: True", ControlNet 1: "Module: instant_id_face_keypoints, Model: control_instant_id_sdxl [c5c25a50], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: ControlNet is more important, Hr Option: Both, Save Detected Map: True", Version: v1.7.0-455-gd69a7944
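For anyone comparing the two metadata lines above by hand, here is a minimal Python sketch that splits each `Steps: ...` line into key/value pairs and lists the keys whose values differ. The regex and the helper names `parse_params` and `diff_params` are illustrative assumptions, not code from either webui, and the two sample strings are abbreviated from the metadata in this report.

```python
import re

# A value is either a double-quoted string (which may contain commas,
# as the ControlNet entries do) or a run of characters up to the next comma.
PARAM_RE = re.compile(r'\s*([\w][\w \-/]*):\s*("(?:\\.|[^\\"])*"|[^,]*)(?:,|$)')


def parse_params(line: str) -> dict[str, str]:
    """Parse the final 'Steps: ..., Sampler: ...' line of a metadata block."""
    return {key.strip(): value.strip() for key, value in PARAM_RE.findall(line)}


def diff_params(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Abbreviated samples; paste the full metadata lines here for a real comparison.
forge = "Steps: 8, CFG scale: 3, Model: turboDiffusionXL_v112, Version: f0.0.10-latest-87-gc06769c1"
webui = "Steps: 8, CFG scale: 3, Model: turboDiffusionXL_v112, Clip skip: 2, Version: v1.7.0-455-gd69a7944"

for key, (left, right) in diff_params(parse_params(forge), parse_params(webui)).items():
    print(f"{key}: forge={left!r} webui={right!r}")
```

A key that appears in only one line shows up with `None` on the other side. A difference in the recorded metadata is only a starting point for investigation, not a diagnosis.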

What browsers do you use to access the UI?

Google Chrome

Sysinfo

sysinfo-2024-02-09-19-56.json

Console logs

https://pastebin.com/DXdPkjab

Additional information

No response

lllyasviel commented 9 months ago

update and try again

aartikov commented 9 months ago

@lllyasviel Tried on commit 8059533eaff951ac6a1f24e179ae57296d2b9411

It is still not identical to the stable-diffusion-webui result:

00008-1309353952

frog hat, movie scene
Negative prompt: lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature
Steps: 8, Sampler: DPM++ SDE Karras, CFG scale: 3, Seed: 1309353952, Size: 1024x1024, Model hash: 1524a9699f, Model: turboDiffusionXL_v112, Clip skip: 2, ControlNet 0: "Module: InsightFace (InstantID), Model: ip-adapter_instant_id_sdxl [eb2d3ec0], Weight: 1, Resize Mode: Crop and Resize, Processor Res: 0.5, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced, Hr Option: Both", ControlNet 1: "Module: instant_id_face_keypoints, Model: control_instant_id_sdxl [c5c25a50], Weight: 1, Resize Mode: Crop and Resize, Processor Res: 512, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: ControlNet is more important, Hr Option: Both", Version: f0.0.10-latest-92-g8059533e

Logs - https://pastebin.com/Jg2WcdmM

lllyasviel commented 9 months ago

I can also reproduce the oversaturated result on the original sd-webui-controlnet, so it should be a problem of InstantID not being suitable for CFG > 5.

Can you change the random seed to 12345, generate 4 images from each using the same settings, and paste the results with metadata?

This result will decide the priority of this issue.
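For collecting the metadata from the generated files, a minimal standard-library sketch is below. It assumes the webui stores the generation parameters in a PNG `tEXt` chunk under the keyword `parameters`; if the text was written as `iTXt` or `zTXt` instead, this sketch will not find it. The function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data: bytes) -> dict[str, str]:
    """Collect tEXt chunks (keyword -> text) from raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt data is "keyword NUL text", both Latin-1 encoded.
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 8 + length + 4  # skip header, data, and CRC
    return texts


def read_infotext(path: str):
    """Return the 'parameters' text of a PNG file, or None if absent.

    The 'parameters' keyword is an assumption about how the webui saves images.
    """
    with open(path, "rb") as fh:
        return read_png_text(fh.read()).get("parameters")
```

Calling `read_infotext` on each of the generated PNGs from both UIs gives the text to paste alongside the images.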

2blackbar commented 9 months ago

They use CFG 3, minus NAte, he's on 7, which is a no-no for InstantID; ideally 2.5 for me.
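As background on why the CFG value matters here, the sketch below shows the textbook classifier-free guidance combination, `uncond + scale * (cond - uncond)`, on plain Python lists. It is not taken from Forge or the ControlNet extension, and the numbers are made up for illustration. It shows only that the gap between the conditional and unconditional predictions is multiplied by the scale, which is consistent with higher CFG values pushing results toward oversaturation. It does not explain a difference between two UIs running at the same CFG.

```python
def cfg_combine(uncond, cond, scale):
    """Textbook classifier-free guidance: move from uncond toward cond by `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Made-up predictions for two values, for illustration only.
uncond = [0.0, 1.0]
cond = [0.5, 0.5]

for scale in (1, 3, 7):
    print(scale, cfg_combine(uncond, cond, scale))
```

At scale 1 the result equals the conditional prediction; at larger scales the output moves further past it in the same direction.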

aartikov commented 9 months ago

@lllyasviel I use the model Turbo Diffusion XL Turbo v1.12 with a CFG scale of 3. The results look great with the original webui, but not as good with webui-forge.

I've created more examples - InstantID samples.zip. The archive contains non-cherry-picked images generated on both webui and webui-forge with the same settings, and all metadata preserved.

webui: instant-id-webui

webui-forge: instant-id-webui-forge

The webui-forge generations are a bit too oversaturated for my taste. The hair quality is not as good, and face similarity is better on the original webui.

I believe that acceptable quality can also be achieved on webui-forge by adjusting the settings. However, users like me will have to adapt their usual workflows. Also, isn't the goal of webui-forge to provide results identical to the original under any conditions?

lllyasviel commented 9 months ago

Hi, I revised the code; please update and try again.

aartikov commented 9 months ago

@lllyasviel Tried commit 3cdae09639b9c6fe2a407ac8ae94d153df18aa8b.

Unfortunately, I don't see any improvement in the quality of the generated image - 00000-12345.png

v0xie commented 9 months ago

There is an extension Characteristic Guidance that does something to fix the oversaturation of InstantID.

The extension subclasses the CFGDenoiser with a new "CHGDenoiser". Even if the CHGDenoiser forward pass is never called (i.e. only original_forward is called in the condition) the contrast is somehow fixed. https://github.com/scraed/CharacteristicGuidanceWebUI/blob/main/scripts/CHGextension.py#L889

301739451-342dbf85-928f-494a-80c7-84e201b44720

ashleykleynhans commented 9 months ago

I have the same problem with extremely poor image quality using InstantID, but with A1111 it works perfectly.

SpanishHearts commented 9 months ago

I have the same issue. InstantID looks terrible in WebUI Forge, but the same seed and settings look amazing in A1111. Please fix it.

burgalon commented 8 months ago

I would love to try submitting a PR if anyone has a lead on what might be causing this. Also, InstantID seems to cause OOM errors, possibly because of the two ControlNets, and even more so when used with an additional depth ControlNet.

burgalon commented 8 months ago

Also, it seems InstantID causes OOM errors when switching models/img2img-t2i... I haven't quite figured out the exact scenario yet.

jeanhadrien commented 8 months ago

getting the same issue

Pandacorexy commented 8 months ago

same issue here :(

Kaputmaher commented 8 months ago

Same for me

aartikov commented 8 months ago

Hi @lllyasviel, It seems that a significant number of users are experiencing the same issue. Could you please take some time to investigate this problem? We would greatly appreciate your assistance. Thank you!

DiamondGlassDrill commented 7 months ago

same issue here

AlexAwesome88 commented 4 months ago

any updates on this?

wuxxd commented 3 months ago

same issue