HKU-TASR / Imperio

[IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the victim model's prediction for arbitrary targets.
https://khchow.com/Imperio/
MIT License
41 stars, 4 forks