dongdongzhaoUP closed this issue 6 months ago
Hi~ Thank you for your question! In our subsequent experiments (https://aclanthology.org/2022.findings-emnlp.47.pdf), we also found that the attack performance of EP on RoBERTa decreases a lot (though EP works fine on DeBERTa). To strengthen the attack effectiveness of EP on RoBERTa, you may select multiple trigger words and modify their word embeddings simultaneously (as in SOS), or insert the same trigger word multiple times into the test inputs.
Hope this can help you~
Thanks for your reply! Have you found the possible reason for the decrease in ASR on RoBERTa?
We analyzed the root cause to be the tokenizer used in RoBERTa. In some cases, the RoBERTa tokenizer may concatenate the trigger word with other words (or even the preceding blank space) to form new tokens. For example (may not be exact), for the sentence "What is the cf sentiment of the following review:...", the subword produced by the tokenizer may be " cf" rather than "cf".
You may double-check this. You can also try always inserting the trigger word at the first position to see whether the ASR increases.
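To make the tokenizer issue concrete, here is a minimal sketch (not RoBERTa's actual BPE, just a toy pre-tokenizer mimicking the GPT-2/RoBERTa byte-level convention of fusing a preceding space into the following token with the "Ġ" marker). The `pretokenize` function is hypothetical; the point is that a trigger inserted mid-sentence maps to a different token (" cf" / "Ġcf") than the one whose embedding was poisoned ("cf"):

```python
def pretokenize(text):
    """Toy pre-tokenizer: split on spaces and prefix every
    non-initial word with 'Ġ', the byte-level BPE marker that
    RoBERTa/GPT-2 use to encode a leading space."""
    words = text.split(" ")
    out = [words[0]] if words[0] else []
    out += ["Ġ" + w for w in words[1:] if w]
    return out

# Mid-sentence, the trigger "cf" is seen as "Ġcf" (i.e. " cf"),
# so a poisoned embedding for the plain token "cf" is never used:
print(pretokenize("What is the cf sentiment"))
# At the first position there is no preceding space, so the
# poisoned token "cf" is hit, which is why first-position
# insertion can recover the ASR:
print(pretokenize("cf What is the sentiment"))
```

Checking the real tokenization with `RobertaTokenizer.tokenize(...)` on your own inputs would confirm whether this is what happens in your setup.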
Nice work! Could you please tell me whether the proposed method can be applied to other models such as RoBERTa? I tested it but got a low ASR.