uw-nsl / SafeDecoding

Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
https://arxiv.org/abs/2402.08983
MIT License
101 stars 9 forks

code for AutoDAN, GCG, DeepInception and PAIR attacks #7

Closed chenzongxiong closed 1 month ago

chenzongxiong commented 2 months ago

Dear authors,

Could you share your implementation of the attacks you used to generate the SafeDecoding-Attackers dataset?

Thanks very much.

zhangchen-xu commented 2 months ago

We follow the official implementations and hyperparameters of each attack.

Please refer to the official repositories of these papers for implementation details: