tml-epfl/llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [arXiv, Apr 2024]
https://arxiv.org/abs/2404.02151
MIT License · 224 stars · 23 forks
Issues
#9 · Jailbreak artifacts cannot reproduce the attack effect · ZHIXINXIE · opened 3 weeks ago · 0 comments
#8 · Questions about adversarial suffix generation · Syyabb · closed 3 months ago · 1 comment
#7 · Potential bug · Junjie-Chu · closed 3 months ago · 1 comment
#6 · get_universal_manual_prompt template · wusuhuang · closed 4 months ago · 2 comments
#5 · Question about the tokenizer's pad_token when using llama2 as the target model · Kris-Lcq · closed 4 months ago · 2 comments
#4 · Reproducing the experimental results · bxiong1 · closed 4 months ago · 10 comments
#3 · A typo in main.py · franciscoliu · closed 7 months ago · 1 comment
#2 · Question about the system prompt used for llama-2 · rickyang1114 · closed 7 months ago · 2 comments
#1 · How to obtain the adv_init? · xszheng2020 · closed 7 months ago · 2 comments