jind11 / TextFooler

A Model for Natural Language Attack on Text Classification and Inference
MIT License

Question on Experiment Details #16

Closed NoviScl closed 4 years ago

NoviScl commented 4 years ago

Hi,

I'm trying to use your method to attack another dataset where the average text length is about 400 words. I have a few questions about the details:

1) In your paper, does the query number refer to the number of forward passes the target model has to do for EACH example? E.g., on the Fake News dataset, you have to run BERT to get the predicted probabilities 4403 times on average for each test example (for importance ranking and for choosing the best replacement candidate).

2) In that case, if I have a relatively big test set, would the generation process take a very long time (I'm using BERT as the target model)? Could you give a reference point for how long it took to generate the adversarial examples on the Fake News dataset? If generation is slow, it would also make adversarial training harder, since I would have to generate adversarial examples for the training data as well.

3) Do you have an ablation on randomly choosing a candidate from the final candidate pool? In your implementation, you also use the target model to find the candidate that gives the least confidence on the correct label; I'm wondering what would happen if you removed that step and randomly picked one instead.

4) Have you considered the transferability issue? In your implementation, when you attack a model, you also use it as the target model for generating the attacks. What if you used one model as the target model to generate the adversarial examples, and then evaluated another model on those adversarial examples? Would that work equally well?

Thanks!

jind11 commented 4 years ago

Here are my answers:

  1. The query number is the total number of times a text is sent to the target model to get the probability vector back, i.e., the number of forward passes (see the first sketch below).
  2. If the average text length is long, then the attack time becomes very long. To reduce it, you can reduce the hyperparameters "synonym_num" and "sim_score_threshold": the former controls how many synonym candidates we try for each word, and the latter is the similarity threshold used to reject candidates (the second sketch below shows how both shrink the candidate pool).
  3. I have not tried randomly choosing a candidate from the pool, since it is hard to guarantee that a randomly chosen one will decrease the probability of the correct label. But to reduce the attack time, as mentioned in the last point, you can shrink the candidate pool (the second sketch also marks where such a random-pick ablation would go).
  4. My paper does have a section testing transferability, and it proved to be poor. It is not easy to find a universal adversarial example that is model-agnostic, which I think is a challenge and a good direction to follow (the last sketch below outlines such a transferability check). I hope I have answered all of your questions, but if not, please let me know.
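
To make point 1 concrete, here is a minimal sketch (not the exact code in this repo) of how queries are counted: every call that sends texts to the target model and returns probability vectors counts as that many forward passes. The `QueryCounter` name and the `predictor` interface are just placeholders.

```python
class QueryCounter:
    """Wraps any text -> probability-vector predictor and counts forward passes."""

    def __init__(self, predictor):
        self.predictor = predictor   # callable: list[str] -> list of probability vectors
        self.num_queries = 0

    def __call__(self, texts):
        self.num_queries += len(texts)   # one forward pass per text sent to the model
        return self.predictor(texts)

# Both the word-importance ranking and the evaluation of each replacement
# candidate go through this wrapper, so `num_queries` at the end of an attack
# is the per-example query number reported in the paper (e.g. ~4403 on Fake News).
```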
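For points 2 and 3, this is a rough illustration (again, not the repo's exact implementation) of how `synonym_num` and `sim_score_threshold` affect the candidate pool, and where a random-pick ablation would replace the least-confidence selection. The function name and arguments are made up for the example.

```python
import random

def choose_replacement(candidates, sim_scores, true_label, predictor,
                       synonym_num=50, sim_score_threshold=0.7,
                       random_pick=False):
    """Sketch of building the candidate pool and picking the replacement.

    candidates          : candidate sentences (one word substituted each)
    sim_scores          : semantic similarity of each candidate to the original
    true_label          : index of the correct class
    predictor           : callable mapping a list of texts to probability vectors
    synonym_num         : cap on how many synonyms are tried (fewer = fewer queries)
    sim_score_threshold : candidates below this similarity are rejected
    random_pick         : ablation -- pick a random candidate instead of the one
                          that most lowers the correct-label probability
    """
    # keep at most `synonym_num` candidates that pass the similarity threshold
    pool = [c for c, s in zip(candidates[:synonym_num], sim_scores[:synonym_num])
            if s >= sim_score_threshold]
    if not pool:
        return None

    if random_pick:
        # no guarantee this choice lowers the correct-label probability
        return random.choice(pool)

    # default: query the target model once per candidate and keep the one
    # with the lowest confidence on the correct label
    probs = predictor(pool)
    return min(zip(pool, probs), key=lambda pair: pair[1][true_label])[0]
```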
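And for point 4, a transferability check would look roughly like this: craft the adversarial examples against one model and measure how often they still fool another. `attack` and `predict` here are placeholders for whatever attack loop and inference function you use.

```python
# Placeholders: `attack(text, label, model)` returns an adversarial text crafted
# against `model` (or None if the attack fails), and each model exposes a
# `predict(text)` that returns the predicted label.

def transfer_rates(dataset, source_model, target_model, attack):
    fooled_source = fooled_target = total = 0
    for text, label in dataset:
        adv_text = attack(text, label, source_model)  # crafted against the source model
        if adv_text is None:
            continue
        total += 1
        fooled_source += int(source_model.predict(adv_text) != label)
        fooled_target += int(target_model.predict(adv_text) != label)
    if total == 0:
        return 0.0, 0.0
    # poor transferability shows up as fooled_target/total being much lower
    # than fooled_source/total
    return fooled_source / total, fooled_target / total
```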
NoviScl commented 4 years ago

Understood. Thanks a lot for the reply.