huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Color channel order for watermark embedding #6292

Closed btlorch closed 7 months ago

btlorch commented 11 months ago

Describe the bug

The encoder from the invisible watermark library expects input images with the channel order BGR, which is the default in OpenCV. This can be seen here.

As far as I can see from here, diffusers passes the images in RGB order.

The watermark encoder then converts the given image from BGR to YUV. When the image is passed with the wrong channel order, this will lead to unexpected U and V channel values.
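To illustrate the effect, here is a minimal sketch in plain NumPy. It assumes BT.601-style YUV coefficients without the 8-bit offset, which approximate but may not exactly match what OpenCV computes for BGR to YUV. The helper name `bgr_to_yuv` is hypothetical and not part of either library.

```python
import numpy as np

def bgr_to_yuv(pixel_bgr):
    """Convert one pixel given in B, G, R order to (Y, U, V).

    Assumption: BT.601-style coefficients without the 8-bit offset;
    this approximates, but is not guaranteed to match, OpenCV's conversion.
    """
    b, g, r = (float(c) for c in pixel_bgr)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

red_rgb = np.array([255, 0, 0])  # pure red, stored in RGB order

# Correct: reverse to BGR before handing the pixel to a BGR-expecting function.
y_ok, u_ok, v_ok = bgr_to_yuv(red_rgb[::-1])

# Wrong: pass the RGB pixel as-is, so the red value is read as blue.
y_bad, u_bad, v_bad = bgr_to_yuv(red_rgb)

print(f"correct order: Y={y_ok:.1f} U={u_ok:.1f} V={v_ok:.1f}")
print(f"wrong order:   Y={y_bad:.1f} U={u_bad:.1f} V={v_bad:.1f}")
```

For this pixel the signs of U and V flip when the order is wrong, and Y changes as well because red and blue carry different luma weights.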

Reproduction

n/a

Logs

No response

System Info

Python 3.10, diffusers 0.24.0, invisible-watermark-0.2.0

Who can help?

No response

sayakpaul commented 11 months ago

Interesting. Have you seen how much this impacts the end outputs?

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul commented 10 months ago

@btlorch?

btlorch commented 10 months ago

Sorry, I haven't had the time to evaluate the effects yet.

btlorch commented 9 months ago

Several issues in this repository report visual defects due to the watermarking: #4014 #4035 #4074

The issue is likely connected to the wrong channel order. The left image was created by SDXL without any watermark. The center image was watermarked with the wrong channel order, and the right image was watermarked with the correct channel order.

[Image: 2024_02_09-invisible_watermark_comparison]

Here is a close-up.

[Image: 2024_02_09-invisible_watermark_comparison_closeup]

Swapping the channel order does not eliminate all artifacts but the artifacts appear less pronounced, at least in this example.

Here is how the image was created.

import matplotlib.pyplot as plt
from diffusers.pipelines.stable_diffusion_xl.watermark import WATERMARK_MESSAGE, WATERMARK_BITS, StableDiffusionXLWatermarker
from imwatermark import WatermarkEncoder
from diffusers import AutoPipelineForText2Image
from PIL import Image
import torch
import numpy as np

pipeline_text2image = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
).to("cuda")

# Disable watermark
class NoWatermark:
    def apply_watermark(self, img):
        return img

pipeline_text2image.watermark = NoWatermark()

prompt = "Disney boy Low saturation Pixar Super details, double ponytails, anime waifu, Serafleur Art Style, divine cinematic edge lighting,soft focus,bokeh,chiaroscuro 8k,best quality ultra-detail 3d,c4d.blender,OCrenderer.cinematic lighting, ultra HD 3D rendering"
image = pipeline_text2image(prompt=prompt).images[0]

I swapped the channel order by replacing this line with

# Convert RGB to BGR, pass BGR image to the encoder, and convert the watermarked image back to RGB
images = [self.encoder.encode(image[:, :, ::-1], "dwtDct")[:, :, ::-1] for image in images]
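
The round trip can be checked in isolation with a minimal sketch. `RecordingEncoder` and `watermark_rgb` are hypothetical names; the stand-in encoder only records what it receives and does not embed a watermark. The `np.ascontiguousarray` call is an added precaution that is not in the line above.

```python
import numpy as np

class RecordingEncoder:
    """Hypothetical stand-in for the watermark encoder; it only records its input."""

    def __init__(self):
        self.seen = None

    def encode(self, image_bgr, method):
        self.seen = image_bgr.copy()
        return image_bgr  # a real encoder would return a modified BGR image

def watermark_rgb(encoder, image_rgb):
    """Flip RGB -> BGR for the encoder, then flip its output back to RGB."""
    # Reversing the last axis yields a negatively strided view; make it contiguous.
    image_bgr = np.ascontiguousarray(image_rgb[:, :, ::-1])
    return encoder.encode(image_bgr, "dwtDct")[:, :, ::-1]

image_rgb = np.zeros((2, 2, 3), dtype=np.uint8)
image_rgb[..., 0] = 255  # red sits at index 0 in RGB order

encoder = RecordingEncoder()
result = watermark_rgb(encoder, image_rgb)
```

Here the encoder sees red at index 2 (BGR order), and the returned array is back in RGB order.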

sayakpaul commented 9 months ago

Cc: @yiyixuxu WDYT of incorporating this change?

yiyixuxu commented 9 months ago

OMG!! yes a PR please!

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul commented 8 months ago

Not stale.
