leondz / garak

the LLM vulnerability scanner
Apache License 2.0

probe: LRL jailbreak #469

Closed: leondz closed this issue 8 months ago

leondz commented 9 months ago

paper: Low-Resource Languages Jailbreak GPT-4

leondz commented 8 months ago

The paper finds that Zulu (zu), Scots Gaelic (gd), Hmong (hmn), and Guarani (gn) get OK bypass rates. Inclined to leave this issue open until we get some of these (or equivalent) languages included.
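
For illustration, here is a rough sketch of how prompts for such a probe might be expanded across the four languages listed above. It is not garak's implementation and does not use garak's probe API. The names `build_lrl_prompts` and `fake_translate` are hypothetical, and the translation step is left as a caller-supplied function because no particular translation backend is named in this issue.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Language codes exactly as listed in the comment above.
LRL_LANGUAGES: Dict[str, str] = {
    "zu": "Zulu",
    "gd": "Scots Gaelic",
    "hmn": "Hmong",
    "gn": "Guarani",
}


def build_lrl_prompts(
    prompts: Iterable[str],
    translate: Callable[[str, str], str],
    languages: Iterable[str] = tuple(LRL_LANGUAGES),
) -> List[Tuple[str, str]]:
    """Return (language_code, translated_prompt) pairs.

    `translate(text, target_lang)` is an assumption: any function that
    maps English text to the target language can be plugged in here.
    """
    prompts = list(prompts)  # allow generators to be reused per language
    expanded: List[Tuple[str, str]] = []
    for code in languages:
        for prompt in prompts:
            expanded.append((code, translate(prompt, code)))
    return expanded


def fake_translate(text: str, target_lang: str) -> str:
    """Stand-in translator (hypothetical): tags the text instead of translating."""
    return f"[{target_lang}] {text}"


if __name__ == "__main__":
    for code, text in build_lrl_prompts(["placeholder test prompt"], fake_translate):
        print(code, text)
```

Keeping the translator as a parameter means the language list can change (or grow to "equivalent" languages) without touching the expansion logic.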