uw-nsl/SafeDecoding
Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
https://arxiv.org/abs/2402.08983
MIT License
71 stars, 4 forks
issues
#6 Doubts about the ft_datasets format (Renpf2022, closed, 3 days ago, 4 comments)
#5 outputs of Expert models start with some special tokens while the ones of Base models do not (terarachang, closed, 1 week ago, 5 comments)
#4 Dataset used for finetuning the expert model (SCccc21, closed, 2 months ago, 4 comments)
#3 why is `output_expert` the output of the expert model? (shanpoyang654, closed, 3 months ago, 5 comments)
#2 About adaptive attack (LetheSec, closed, 4 months ago, 2 comments)
#1 Code for MTBench and Just-Eval (tongwu2020, closed, 4 months ago, 1 comment)