thu-ml / Attack-Bard

Does this still work? #3

Closed Maimonator closed 11 months ago

Maimonator commented 11 months ago

I've been playing with it and tried some of the example images uploaded to this repo, and Bard describes them successfully. Do you know if Google has already fixed this? Thanks!

huanranchen commented 11 months ago

Hello,

I'd like to clarify the adversarial examples we uploaded. They are all of the samples we created, not only the ones that succeed against Google's Bard. To be precise, we achieved a 22% attack success rate against Bard. The images we provided are 100% of the samples, not just the 22% that fooled Bard, so many of them are expected to be described correctly.
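As a rough illustration of what the 22% figure implies when only a few images are tried, here is a minimal sketch. It assumes each uploaded image independently has a 22% chance of fooling Bard, which is a simplification for illustration and not something measured in this thread; the function name `prob_all_fail` is hypothetical.

```python
def prob_all_fail(success_rate: float, num_images: int) -> float:
    """Probability that none of `num_images` independent attempts succeeds.

    Assumption (not from the thread): each image succeeds independently
    with the same probability `success_rate`.
    """
    return (1.0 - success_rate) ** num_images


SUCCESS_RATE = 0.22  # attack success rate against Bard stated above

if __name__ == "__main__":
    for k in (1, 3, 5, 10):
        p = prob_all_fail(SUCCESS_RATE, k)
        print(f"{k} image(s): P(no successful attack) = {p:.3f}")
```

Under that assumption, trying three images leaves roughly a 47% chance that Bard describes all of them correctly, so seeing a few correct descriptions does not by itself show that the attack has been fixed.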

Given the nature of the adversarial attack, I believe defending against it will be challenging. It might not be feasible to defend against our attack completely.