QData / TextAttack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
MIT License
2.91k stars 388 forks

Feature Request: seq2sick as a classification based attack #387

Closed by lethaiq 3 years ago

lethaiq commented 3 years ago

I wonder if we can convert seq2sick into a text-classification-based attack, as described in this paper: https://www.aclweb.org/anthology/2020.emnlp-main.495.pdf

Since we already have seq2sick for attacking seq2seq models, I thought this conversion should be quick?

jinyongyoo commented 3 years ago

I'm a little confused by what you mean by "convert seq2sick to a text classification based attack". Could you explain a bit more?

lethaiq commented 3 years ago

@jinyongyoo The paper above says: "Seq2sick (Cheng et al., 2018) is a whitebox projected gradient method to attack seq2seq models. Here, we perform seq2sick attack on sentiment classification models by changing its loss function, ...", so I suppose we can adapt seq2sick to attack classification models?

jinyongyoo commented 3 years ago

First off, I think we only have the blackbox part of seq2sick available, so if you're interested in the whitebox projected gradient method for classification models, we would have to create a transformation for the projected gradient method (I'm not sure how similar it is to the gradient-based word swap we currently have).
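
For anyone following along, here is a rough sketch of what a single projected gradient step against a classifier could look like. It is not TextAttack code, and it is not necessarily what the paper's authors or the original seq2sick method do (seq2sick itself also uses regularization terms that are left out here). The toy embeddings, the linear classifier, the margin loss, and every name below are assumptions made up purely for illustration.

```python
# Toy vocabulary with 2-d embeddings (hypothetical values, illustration only).
EMBEDDINGS = {
    "good": (1.0, 0.2),
    "great": (1.2, 0.1),
    "okay": (0.1, 0.0),
    "bad": (-1.0, 0.3),
    "movie": (0.0, 1.0),
}

# Toy linear classifier: one weight vector per class (0 = negative, 1 = positive).
WEIGHTS = ((-1.0, 0.0), (1.0, 0.0))
DIM = 2


def logits(tokens):
    """Score each class from the mean of the token embeddings."""
    n = len(tokens)
    mean = [sum(EMBEDDINGS[t][d] for t in tokens) / n for d in range(DIM)]
    return [sum(w[d] * mean[d] for d in range(DIM)) for w in WEIGHTS]


def margin_loss(tokens, true_label):
    """Assumed classification loss: positive while the true label still wins."""
    scores = logits(tokens)
    best_other = max(s for i, s in enumerate(scores) if i != true_label)
    return scores[true_label] - best_other


def projected_gradient_step(tokens, position, true_label, step_size=2.0):
    """Move one token's embedding down the loss gradient, then snap it
    back to the nearest real word in the vocabulary (the projection)."""
    scores = logits(tokens)
    other = max(
        (i for i in range(len(scores)) if i != true_label),
        key=lambda i: scores[i],
    )
    n = len(tokens)
    # Analytic gradient of the margin w.r.t. the embedding at `position`.
    grad = [(WEIGHTS[true_label][d] - WEIGHTS[other][d]) / n for d in range(DIM)]
    start = EMBEDDINGS[tokens[position]]
    moved = [start[d] - step_size * grad[d] for d in range(DIM)]

    def squared_distance(word):
        return sum((EMBEDDINGS[word][d] - moved[d]) ** 2 for d in range(DIM))

    nearest = min(EMBEDDINGS, key=squared_distance)
    return tokens[:position] + [nearest] + tokens[position + 1:]


original = ["good", "movie"]
perturbed = projected_gradient_step(original, position=0, true_label=1)
print(perturbed, margin_loss(perturbed, true_label=1))
```

The only classifier-specific piece in this sketch is `margin_loss`; the step-then-project loop does not care what kind of model produced the gradient, which is presumably why swapping the loss is enough in principle.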

lethaiq commented 3 years ago

@jinyongyoo Honestly, I am not sure how the authors did it, so let me try contacting them to ask how they changed the loss function. Thanks anyway.