Image quality degradation when migrating from automatic1111 to diffusers

Swarzox commented 3 months ago

When migrating from automatic1111 to diffusers, I'm experiencing a significant degradation in image quality despite using the same parameters. The images generated with diffusers are noticeably lower in quality than those produced by automatic1111. I also noticed I'm not the only one facing this issue here.

Model: SG161222/RealVisXL_V4.0_Lightning

Prompt: 'A captivating portrait of a 43 years old bearded man, rain drop, amazing skin details. Dreamlike scenes with epic composition, high quality photo, selective focus, bokeh, hall of mirrors. Shot with a Nikon Z9, 50mm f/1.2 lens'

Negative prompt: 'illustration, cartoon, anime, 3d render, painting, crayon, sketch, graphite, impressionist, unreal engine'

Parameters:

Height: 1024
Width: 1024
Number of inference steps: 5
Guidance scale: 1.0
Scheduler: DPM++ SDE Karras

Both automatic1111 and diffusers are using these exact same settings, yet the quality difference is significant.

A1111 results: [image: grid-0002]

Diffusers results: [image: out]

Reproduction

import torch
from diffusers import DPMSolverSinglestepScheduler, StableDiffusionXLPipeline

MODEL_ID = 'SG161222/RealVisXL_V4.0_Lightning'

def load_model():
    torch.cuda.set_device(0)

    pipe = StableDiffusionXLPipeline.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # Intended to match the "DPM++ SDE Karras" sampler selected in automatic1111.
    pipe.scheduler = DPMSolverSinglestepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)
    return pipe

def generate_image(prompt, height, width, num_inference_steps, guidance_scale, model):
    # Use the pipeline that was passed in instead of reloading it on every call.
    result = model(
            prompt=prompt,
            negative_prompt="illustration, cartoon, anime, 3d render, painting, crayon, sketch, graphite, impressionist, unreal engine",
            height=height,
            width=width,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            denoising_end=1.0,
            output_type="np",
    ).images[0]
    return result

image = generate_image(
    prompt="A captivating portrait of a 43 years old bearded man, rain drop, amazing skin details. Dreamlike scenes with epic composition, high quality photo, selective focus, bokeh, hall of mirrors. Shot with a Nikon Z9, 50mm f/1.2 lens",
    height=1024,
    width=1024,
    num_inference_steps=5,
    guidance_scale=1.0,
    model=load_model(),
)

System Info

diffusers version: 0.24.0
Platform: Linux-6.2.0-26-generic-x86_64-with-glibc2.35
Python version: 3.10.13
PyTorch version (GPU?): 2.2.0 (True)
Huggingface_hub version: 0.23.4
Transformers version: 4.30.2
Accelerate version: 0.21.0
xFormers version: not installed
Using GPU in script?: True
Using distributed or parallel set-up in script?: False

asomoza commented 3 months ago

Hi, the first issue is that you're using an older version of diffusers; improvements were made to the schedulers after that version. Comparing against a version from last year isn't a fair comparison, especially in the AI space.

The second issue: I don't remember how automatic1111 handles it, but in diffusers a CFG (guidance scale) of 1.0 is the same as 0.0, meaning classifier-free guidance is effectively off, so the negative prompt has no effect.
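To illustrate why the negative prompt stops mattering at a guidance scale of 1.0, here is a minimal sketch of the classifier-free guidance combination as it is commonly written. The function name `apply_cfg` and the toy noise values are hypothetical, and plain lists stand in for tensors. As far as I know, the SDXL pipeline in diffusers skips the negative-prompt pass entirely when `guidance_scale` is not greater than 1, which leads to the same outcome.

```python
def apply_cfg(noise_negative, noise_text, guidance_scale):
    """Combine two noise predictions: negative + scale * (text - negative)."""
    return [
        n + guidance_scale * (t - n)
        for n, t in zip(noise_negative, noise_text)
    ]


# Toy noise predictions (hypothetical values, chosen to be exact in binary).
noise_text = [0.5, -1.0, 2.0]
noise_negative_a = [0.0, 0.0, 0.0]
noise_negative_b = [3.0, -2.0, 1.0]

# With a scale of 1.0 the negative-prompt term cancels out completely.
at_one_a = apply_cfg(noise_negative_a, noise_text, 1.0)
at_one_b = apply_cfg(noise_negative_b, noise_text, 1.0)

# With a scale of 1.5 the result depends on the negative-prompt prediction.
at_one_and_half_a = apply_cfg(noise_negative_a, noise_text, 1.5)
at_one_and_half_b = apply_cfg(noise_negative_b, noise_text, 1.5)

print(at_one_a == at_one_b)
print(at_one_and_half_a == at_one_and_half_b)
```

At 1.0 both results equal the text-conditioned prediction regardless of the negative prompt; only above 1.0 does the negative prompt start steering the output.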

The third issue is a common one: you can't expect to obtain the same results, because the two are not equivalent, so you'll have to experiment a bit. The same happens in the other direction: when you get good results with diffusers, you won't always get the same results with the UIs.

For the Lightning models I prefer to use a CFG of around 1.5 and the TCD scheduler instead.

[Comparison images: DPMSolverSinglestepScheduler | DPMSolverMultistepScheduler | TCD | TCD with CFG 1.5]
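The TCD suggestion above could be tried with something like the following sketch. It assumes a recent diffusers release that ships `TCDScheduler` (the 0.24.0 version from the report likely does not) and a CUDA GPU. The helper names `build_tcd_pipeline` and `generate` are hypothetical, and the settings are the ones from the report with only the guidance scale raised to 1.5.

```python
MODEL_ID = "SG161222/RealVisXL_V4.0_Lightning"

# Settings from the original report, with guidance_scale raised to 1.5.
GENERATION_KWARGS = {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 5,
    "guidance_scale": 1.5,
}


def build_tcd_pipeline(model_id=MODEL_ID):
    """Load the SDXL pipeline and replace its scheduler with TCD."""
    # Imported lazily so the settings above can be inspected without a GPU.
    import torch
    from diffusers import StableDiffusionXLPipeline, TCDScheduler

    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")
    # Reuse the existing scheduler config so the model-specific values carry over.
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
    return pipe


def generate(pipe, prompt, negative_prompt):
    """Run one generation; the negative prompt is active because the scale is above 1."""
    return pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        **GENERATION_KWARGS,
    ).images[0]
```

Calling `generate(build_tcd_pipeline(), prompt, negative_prompt)` with the prompts from the report returns a PIL image that can be saved and compared with the A1111 output.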

Also, it's not clear to me what you mean by degradation, ~~your images are too small~~. If you mean the washed-out look with less saturation, I don't get that with the current version, but even so, in my experience it's common with the Lightning models (in diffusers and ComfyUI).

I sometimes even use the Euler schedulers with the Lightning models to get the "washed out" results, which sometimes look a lot more realistic than a perfect image.
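Since trying different schedulers comes up repeatedly in this thread, a small lookup keyed by UI sampler name may make the experiments easier. This is only a sketch: the mapping is assumed to follow the A1111-to-diffusers correspondence described in the diffusers scheduler documentation, and `SCHEDULERS` and `set_scheduler` are hypothetical names.

```python
# UI sampler name -> (diffusers scheduler class name, extra config overrides).
# Assumed correspondence; verify against the diffusers scheduler docs.
SCHEDULERS = {
    "DPM++ SDE Karras": ("DPMSolverSinglestepScheduler", {"use_karras_sigmas": True}),
    "DPM++ 2M Karras": ("DPMSolverMultistepScheduler", {"use_karras_sigmas": True}),
    "Euler": ("EulerDiscreteScheduler", {}),
    "Euler a": ("EulerAncestralDiscreteScheduler", {}),
}


def set_scheduler(pipe, name):
    """Swap the pipeline's scheduler, keeping its existing config as the base."""
    # Imported lazily so the table itself can be used without diffusers installed.
    import diffusers

    class_name, overrides = SCHEDULERS[name]
    scheduler_cls = getattr(diffusers, class_name)
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **overrides)
    return pipe
```

For example, `set_scheduler(pipe, "Euler")` switches an already loaded pipeline to the Euler scheduler without reloading the model weights.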

Edit: I found that the images are big, so I can see the difference. I don't get that bad quality with the latest version.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

huggingface / diffusers

Image quality degradation when migrating from automatic1111 to diffusers #8792