Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models

Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models

# of PGD steps #4

Closed charismaticchiu closed 1 year ago

charismaticchiu commented 1 year ago

May I ask why you used 5000 PGD steps? Is there a difference in outputs for 100 or 1000 steps?

Unispac commented 1 year ago

Hi, you can refer to the loss curve figure in our paper. Basically, we find that the attack needs many steps to converge. For example, even at 5000 steps, the loss of the unconstrained attack is still trending downward. So, if you only run 100 steps, I expect the attack would be less effective.
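
For readers wondering what the step count controls, here is a minimal, hypothetical PGD sketch (not the repo's exact code): `pgd_attack`, `model`, `loss_fn`, `alpha`, and `epsilon` are placeholder names, and the objective in the paper is the likelihood of a harmful target corpus. Each additional iteration takes one more signed-gradient step on the image pixels, which is why running more steps can keep pushing the loss down.

```python
import torch

def pgd_attack(model, image, loss_fn, num_steps=5000, alpha=1/255, epsilon=None):
    # Hypothetical sketch of projected gradient descent on image pixels.
    # `model` maps an image tensor to the quantity scored by `loss_fn`;
    # `epsilon=None` corresponds to an unconstrained attack.
    adv = image.clone().detach().requires_grad_(True)
    for _ in range(num_steps):
        loss = loss_fn(model(adv))
        loss.backward()
        with torch.no_grad():
            # one signed-gradient descent step on the pixels
            adv -= alpha * adv.grad.sign()
            if epsilon is not None:
                # constrained attack: project back into the L_inf epsilon-ball
                adv.copy_(torch.min(torch.max(adv, image - epsilon), image + epsilon))
            # keep pixel values in a valid range
            adv.clamp_(0.0, 1.0)
        adv.grad.zero_()
    return adv.detach()
```

With fewer iterations (e.g. `num_steps=100`), the loop simply stops earlier on the same loss curve, so the resulting adversarial image is typically less effective.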

charismaticchiu commented 1 year ago

Thanks!