Robin-WZQ / T2IShield

[ECCV24] T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
https://arxiv.org/pdf/2407.04215
MIT License

Inquiry about the generalization of this method #1

Closed zhaisf closed 2 months ago

zhaisf commented 4 months ago

Thank you for your insightful work!

I would like to ask whether, besides Rickrolling and Villain Diffusion, your method is also effective against other mainstream text-to-image backdoor attacks.

For example, BadT2I and the Personalization Backdoor. Have you conducted experiments against these backdoor attacks?

Robin-WZQ commented 2 months ago

Hi @zhaisf,

Apologies for the late response. We didn't test our method on “BadT2I” and “Personalization Backdoor” because their code was not fully available before the ECCV submission deadline.

We tested BadT2I this week but did not observe evidence of the 'Assimilation Phenomenon'. Its absence might allow BadT2I to bypass our detection mechanism. We will release all the detection, localization, and mitigation code within two weeks and welcome any testing!

Thanks.

oscarchew commented 2 months ago

Interestingly, our team also found that the Assimilation Phenomenon does not occur in Textual Inversion (one of the personalization backdoors). See our latest ECCV workshop paper (https://www.arxiv.org/abs/2408.15721) for more information. 🤗

Robin-WZQ commented 2 months ago

Hi @oscarchew,

Thank you very much for informing us about your great work. We will read your paper carefully!

BTW if you will be attending ECCV in Milan, I look forward to discussing backdoor defense with you there! 🤗