liuyixin-louis / MetaCloak

[CVPR'24 Oral] Metacloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

The noise values of the adversarial samples you released seem to be inconsistent with 11/255 #3

Closed huzimun closed 1 month ago

huzimun commented 1 month ago

Dear author, I noticed that MetaCloak's paper sets the noise threshold to 11/255, and that the adversarial samples released in the repository and in the Hugging Face Dataset are also stated to use an 11/255 budget. However, when I inspected these adversarial samples, the added noise looked much larger than that. So I used the code below to load the clean and adversarial images as tensors in [0, 1] (pixel values divided by 255), computed their difference, and took the absolute value of the noise `et`. I found that not only is the maximum value much greater than 11/255, but a considerable proportion of all pixels exceed 11/255. I am very puzzled by this and look forward to your answer.

from torchvision import transforms
import torch
from pathlib import Path
from PIL import Image

def load_data(data_dir, size=512, center_crop=True) -> torch.Tensor:
    image_transforms = transforms.Compose(
        [
            transforms.Resize(size, interpolation=transforms.InterpolationMode.BILINEAR),
            transforms.CenterCrop(size) if center_crop else transforms.RandomCrop(size),
            transforms.ToTensor(),
            transforms.Normalize([0.5], [0.5]),
        ]
    )

    images = [image_transforms(Image.open(i).convert("RGB")) for i in sorted(list(Path(data_dir).iterdir()))]
    images = torch.stack(images)
    return images

weight_type = torch.bfloat16
clean_leaf_id_pixel_values = load_data('/home/humw/Codes/MetaCloak/example_data/Clean/sample1').to(dtype=weight_type)
adv_leaf_id_pixel_values = load_data('/home/humw/Codes/MetaCloak/example_data/Protected/sample1').to(dtype=weight_type)

# absolute per-pixel perturbation
et = (adv_leaf_id_pixel_values - clean_leaf_id_pixel_values).abs()
print("11/255: {}".format(11 / 255))
print("et min: {}".format(et.min()))
print("et max: {}".format(et.max()))
print("et mean: {}".format(et.mean()))

et = et.reshape(-1)
print("proportion of pixels larger than 11/255: {}".format((et > 11 / 255).float().mean()))
print("proportion of pixels larger than 22/255: {}".format((et > 22 / 255).float().mean()))

Here is the output I obtained: [screenshot]

Fcr09 commented 1 month ago

It seems to be a problem with the normalization used in your script: `transforms.Normalize([0.5], [0.5])` rescales pixel values from [0, 1] to [-1, 1], which doubles the measured perturbation and so changes the budget computation. The result is fine if you simply comment out that line. Small numerical errors from quantization and computation still push some perturbations slightly above 11/255, but none exceed 12/255. [screenshot]

huzimun commented 1 month ago

That fully resolves my confusion. Your reply has been very helpful to me.

huzimun commented 1 month ago

Thank you very much. Have a nice day!