Open fcanogab opened 4 weeks ago
Hi @fcanogab, the recipe you mention measures the attack success rate, where a high score indicates that the tested application is highly sensitive to adversarial changes, i.e. less robust. That is why a lower score (low attack success rate) gets a higher grade and a higher score (high attack success rate) gets a lower grade.
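To make the direction of the mapping concrete, here is a minimal sketch in Python. The band boundaries, the letter grades, and the function names are hypothetical and chosen only for illustration; they are not the recipe's actual grading scale or code.

```python
from typing import List, Sequence, Tuple

# Hypothetical grade bands as (upper bound of attack success rate in %, grade).
# These values are an assumption for illustration, NOT the advglue recipe's scale.
HYPOTHETICAL_BANDS: List[Tuple[float, str]] = [
    (20.0, "A"),
    (40.0, "B"),
    (60.0, "C"),
    (80.0, "D"),
    (100.0, "E"),
]


def attack_success_rate(attack_succeeded: Sequence[bool]) -> float:
    """Return the percentage of adversarial prompts that changed the model's answer."""
    if not attack_succeeded:
        raise ValueError("need at least one attack result")
    return 100.0 * sum(attack_succeeded) / len(attack_succeeded)


def grade_for(asr: float, bands: Sequence[Tuple[float, str]] = HYPOTHETICAL_BANDS) -> str:
    """Map an attack success rate to a grade: the lower the rate, the better the grade."""
    if not 0.0 <= asr <= 100.0:
        raise ValueError("attack success rate must be between 0 and 100")
    for upper_bound, grade in bands:
        if asr <= upper_bound:
            return grade
    raise ValueError("bands must cover the range up to 100")


# One successful attack out of four attempts gives a 25% attack success rate.
asr = attack_success_rate([True, False, False, False])
print(asr, grade_for(asr))
```

With these assumed bands, a system where few attacks succeed lands in the top grade, and one where most attacks succeed lands in the bottom grade, which is the inversion described above.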
Hope this clarifies!
I ran an evaluation using the advglue recipe. Its description says: "AdvGLUE is a comprehensive robustness evaluation benchmark that concentrates on assessing the adversarial robustness of language models. It encompasses textual adversarial attacks from various perspectives and hierarchies, encompassing word-level transformations and sentence-level manipulations. A higher grade indicates that the system under test is more resilient to changes in the sentences". However, the grading scale shown below seems to be wrong. I think it should be inverted.