uw-nsl SafeDecoding issues - Githubissues

uw-nsl / SafeDecoding

Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

https://arxiv.org/abs/2402.08983

MIT License

101 stars 9 forks

issues

Hiba changes

#8 hibaeloirghi opened 5 days ago (0 comments)

code for AutoDAN, GCG, DeepInception and PAIR attacks

#7 chenzongxiong closed 1 month ago (1 comment)

Doubts about the ft_datasets format

#6 Renpf2022 closed 4 months ago (4 comments)

outputs of Expert models start with some special tokens while the ones of Base models do not

#5 terarachang closed 4 months ago (5 comments)

Dataset used for finetuning the expert model

#4 SCccc21 closed 6 months ago (4 comments)

why is `output_expert` the output of the expert model?

#3 shanpoyang654 closed 8 months ago (5 comments)

About adaptive attack

#2 LetheSec closed 8 months ago (2 comments)

Code for MTBench and Just-Eval

#1 tongwu2020 closed 9 months ago (1 comment)