Open leondz opened 1 month ago
Add persuasion-based attacks
Paper: How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Page: https://chats-lab.github.io/persuasive_jailbreaker/
Code: https://github.com/CHATS-lab/persuasive_jailbreaker?tab=readme-ov-file
Add persuasion-based attacks
Paper: How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Page: https://chats-lab.github.io/persuasive_jailbreaker/
Code: https://github.com/CHATS-lab/persuasive_jailbreaker?tab=readme-ov-file