ethz-spylab / robust-style-mimicry

Try adaptive attack? Try a larger perturbation budget? #1

Open Crazygay12138 opened 1 month ago

Crazygay12138 commented 1 month ago

From my perspective, this paper only evaluates attacks against defences that were not designed with those attacks in mind. Are these purification methods themselves robust? Do they still produce usable outputs when they are attacked in turn? How about designing adaptive attacks against IMPRESS++, DiffPure, and Noisy Upscaling?
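For concreteness, here is a minimal sketch of what such an adaptive protection could look like, assuming the protector optimizes a bounded perturbation through a differentiable stand-in for the anticipated purification (EOT-style). The encoder, purification, and decoy target below are placeholders, not the actual IMPRESS++/DiffPure/Noisy Upscaling pipelines:

```python
# Hypothetical sketch only: all components are stand-ins, not the real pipelines.
import torch
import torch.nn.functional as F

def make_feature_extractor(num_pixels: int, dim: int = 64):
    # Stand-in for the encoder whose features a Glaze-style protection shifts
    # (e.g. a latent-diffusion VAE encoder); here just a fixed random projection.
    w = torch.randn(num_pixels, dim)
    return lambda x: x.flatten() @ w

def approx_purification(x: torch.Tensor) -> torch.Tensor:
    # Differentiable stand-in for the purification the protector anticipates
    # (random noise followed by a crude "denoise" modeled as average pooling).
    noisy = x + 0.05 * torch.randn_like(x)
    return F.avg_pool2d(noisy, kernel_size=3, stride=1, padding=1)

def adaptive_protect(image: torch.Tensor, eps: float = 8 / 255,
                     steps: int = 200, lr: float = 1e-2, samples: int = 4):
    """Optimize a bounded perturbation that still shifts features *after* purification."""
    feats = make_feature_extractor(image.numel())
    target = torch.randn(64)  # placeholder "decoy style" embedding
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        # Average the loss over several draws of the randomized purification
        # (expectation over transformations), pulling purified features
        # towards the decoy target.
        loss = sum(
            torch.norm(feats(approx_purification(image + delta)) - target)
            for _ in range(samples)
        ) / samples
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                                # L-inf budget
            delta.copy_(torch.clamp(image + delta, 0, 1) - image)  # valid pixels
    return (image + delta).detach()

# Toy usage: a random 3x64x64 "artwork" with pixel values in [0, 1].
protected = adaptive_protect(torch.rand(3, 64, 64))
```

Whether something like this would actually survive the real purification pipelines, which are stronger and partly non-differentiable, is exactly the open question.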

Also, this paper only tested $\epsilon=8/255$. How about a larger perturbation budget? I know this will lower the image quality. However, since the artist keeps a copy of the perturbation, they can remove it for authorized users and withhold it from unauthorized ones, which could prevent unwanted imitation and help trace leaks. Can these purification methods remove larger perturbations?
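To make the "remove the perturbation for authorized users" idea concrete, here is a minimal sketch under the assumption that the protection is a simple additive perturbation bounded by $\epsilon=8/255$ that the artist stores; real tools like Glaze and Mist are black-box and not necessarily invertible by subtraction:

```python
# Minimal sketch of the idea above, assuming a stored additive perturbation.
import numpy as np

EPS = 8 / 255  # L-infinity budget used in the paper; a larger budget just changes EPS

def protect(image: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Apply a stored perturbation, clipped to the budget and the valid pixel range."""
    delta = np.clip(delta, -EPS, EPS)
    return np.clip(image + delta, 0.0, 1.0)

def release_to_authorized(protected: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """The artist, who kept delta, hands authorized users a (near-)clean copy."""
    return np.clip(protected - np.clip(delta, -EPS, EPS), 0.0, 1.0)

# Toy example with random data standing in for an artwork in [0, 1]:
rng = np.random.default_rng(0)
image = rng.random((256, 256, 3)).astype(np.float32)
delta = rng.uniform(-EPS, EPS, size=image.shape).astype(np.float32)

protected = protect(image, delta)
restored = release_to_authorized(protected, delta)
print(np.abs(restored - image).max())  # small, but not exactly 0 where pixels were clipped
```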

javirandor commented 1 month ago

Hi! Thank you for your comment.

In our paper, we are quite explicit that we do not believe this problem has a technical solution, and thus that a "good defense" will never exist. Quoting from our paper:

Although we evaluate specific protection tools that exist today, the limitations of style mimicry protections are inherent. Artists are necessarily at a disadvantage since they have to act first (i.e., once someone downloads protected art, the protection can no longer be changed). To be effective, protective tools face the challenging task of creating perturbations that transfer to any finetuning technique, even ones chosen adaptively in the future. A similar conclusion was drawn by Radiya-Dixit et al. (Radiya-Dixit et al., 2021), who argued that adversarial perturbations cannot protect users from facial recognition systems. We thus caution that adversarial machine learning techniques will not be able to reliably protect artists from generative style mimicry, and urge the development of alternative measures to protect artists.

Similarly, adaptive protections against our attacks will only provide a false sense of security until someone comes up with a new method that circumvents them. As we said above, since artists act first, they are necessarily at a disadvantage.

Nevertheless, the Glaze and Mist protection tools recently received significant updates (after we had concluded our user study). Yet, we find that the newest 2.0 versions do not protect against our robust mimicry attempts either (see Appendix E and F). A future version could explicitly target the methods we studied, but this would not change the fact that all previously protected art would remain vulnerable, and that future attacks could again attempt to adaptively evade the newest protections. The same holds true for attempts to design similar protections for other data modalities, such as video (Passananti et al., 2024) or audio (Gokul & Dubnov, 2024).

In any case, the burden of proof should be on the protections that claim to protect artists, not on the robustness analysis we performed.

Since existing tools do not provide artists with a way to check how vulnerable they are, these tools still provide a false sense of security for all artists. This highlights an inherent asymmetry between protection tools and mimicry methods: protections should hold for all artists alike, while a mimicry method might successfully target only specific artists.

We stick to that epsilon-ball to match the previous studies on these tools. I am not sure what you mean by "removing the perturbation for authorized users". My intuition is that these methods also remove larger perturbations. Glaze is a black-box protection and we do not know what perturbation size it uses. Our methods worked for their strongest protection, which, judging by the perturbation strength, is likely larger than $8/255$.
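Since Glaze is black-box, one rough way to estimate its effective pixel-space budget, assuming you have both the original and the protected copy of the same artwork (file names below are placeholders), is to measure the per-pixel difference directly. This only bounds an additive pixel-space view of the perturbation and says nothing about how the protection was constructed:

```python
# Rough estimate of a black-box protection's pixel-space budget; file names are placeholders.
import numpy as np
from PIL import Image

def perturbation_stats(original_path: str, protected_path: str) -> dict:
    # Assumes both files are pixel-aligned copies of the same artwork at the same resolution.
    orig = np.asarray(Image.open(original_path).convert("RGB"), dtype=np.float32) / 255.0
    prot = np.asarray(Image.open(protected_path).convert("RGB"), dtype=np.float32) / 255.0
    diff = np.abs(prot - orig)
    return {
        "linf": float(diff.max()),            # compare against 8/255 ~ 0.031
        "mean_abs": float(diff.mean()),
        "linf_in_8bit_steps": float(diff.max() * 255),
    }

# stats = perturbation_stats("artwork.png", "artwork_glazed.png")
```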

Crazygay12138 commented 1 month ago

I see. Your main point is that previously protected artworks are the most vulnerable part of the whole protection, since the old perturbation can be saved and later broken by new circumvention methods. I agree this is true for real facial images, as one's face cannot change over time.

However, the adversary imitates artworks in order to make a profit. How do you guarantee that one can implement a new circumvention method before the protected artworks become outdated? For example, an artist creates a popular cartoon character (AA) and uses new perturbations to protect AA from being imitated. By the time the adversary manages to bypass the perturbation, AA may already be outdated, and the adversary can earn nothing from imitating it.

I totally agree that current protective perturbations can be bypassed in the future. However, if a protective perturbation holds up for longer than the lifecycle of the protected artwork, then it is useful. To conclude, protective perturbations aim to raise the cost of malicious imitation, not to prevent it forever.