UCSB-NLP-Chang / SemanticSmooth

Implementation of the paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'
MIT License
8 stars, 1 fork