NJUNLP / ReNeLLM

The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily".
MIT License
72 stars, 11 forks