@MENG2010 Thank you for your feedback; it gave us much-needed insight, and we now understand the errors we made in gathering training data. I have two questions. The first is whether this new approach is a reasonable design (task 2, option 2): split the benign samples 80/20 for training and validation, then test against each AE set (or a subset of each, for time) separately, and compare against the UM and the baseline. This should remove the potential for over-fitting on any particular AE, and it should also let us compare the model's performance across the different AEs, which will show where the model falls short. (A rough sketch of this setup is below.)
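A minimal sketch of that evaluation design, not tied to Athena's API; the arrays `benign_x`/`benign_y`, the models, and the `ae_sets` dict are hypothetical placeholders assumed to follow the scikit-learn estimator interface:

```python
# Sketch of the proposed design: train/validate on benign data only,
# then evaluate on each AE set separately and compare against the UM and baseline.
from sklearn.model_selection import train_test_split

# benign_x, benign_y: benign samples and their labels (assumed defined elsewhere)
x_train, x_val, y_train, y_val = train_test_split(
    benign_x, benign_y, test_size=0.2, random_state=42)

model.fit(x_train, y_train)                        # train only on benign samples
print("validation acc:", model.score(x_val, y_val))

# ae_sets: {"fgsm": (x_ae, y_true), "pgd": (...), ...} -- one entry per attack type
for attack_name, (x_ae, y_true) in ae_sets.items():
    acc_defense = model.score(x_ae, y_true)         # proposed defense
    acc_um = undefended_model.score(x_ae, y_true)   # undefended model (UM)
    acc_baseline = baseline_model.score(x_ae, y_true)
    print(f"{attack_name}: defense={acc_defense:.3f} "
          f"UM={acc_um:.3f} baseline={acc_baseline:.3f}")
```

Reporting per-attack accuracy this way makes it easy to see which AE types the defense handles well and where it falls short.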
My second question is more about the reasoning behind our original design. If we had the time and computational power, would it be effective to generate many AEs for training the model, even if they are all of types already implemented in Athena? I understand how this could lead to over-fitting when only a small number of AE types are used, or when the AEs come from the same dataset used for testing, but with sufficiently many AE types, could this produce better results than training on the benign samples alone, even against AE types not used for training? That was the intent of our original approach, which was flawed and limited by time and computing power, but I am curious whether it could work if done properly. Thanks again.
Hi Miles,
To answer your first question: yes, it is a reasonable design for this assignment. Go for it.
I will answer your second question later tonight.
Sorry for the late reply to your second question.
It is a good question, thanks for asking!
Your second question is related to the adversarial training approach. I will try to answer it to the best of my knowledge.
Adversarial training can improve a model's robustness against an adversary; however, this improvement comes at the cost of accuracy on benign samples (when there is no adversary) and of additional computation during training. Several studies on adversarial training show this tradeoff between standard accuracy (no adversary) and robust accuracy (under attack).
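To make the mechanism concrete, here is a minimal PyTorch sketch of FGSM-based adversarial training; it is not the Athena implementation, and `model`, `loader`, and `epsilon` are assumed to be defined elsewhere, with inputs scaled to [0, 1]:

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for x, y in loader:
    # craft FGSM adversarial examples on the fly
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()

    # train on a mix of benign and adversarial inputs; shifting more weight
    # toward x_adv buys robustness but typically lowers clean accuracy
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
```

The mixing weights make the tradeoff explicit: the more the loss emphasizes adversarial inputs, the more the model's decision boundary moves away from the benign data distribution.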
Adversarial machine learning is a very difficult problem: the defender faces an open-ended setting in which the defense may be fed an input drawn from any distribution: (1) the independent and identically distributed (i.i.d.) data from which the natural samples were generated --- benign samples; or (2) any other distribution --- AEs generated by various attacks (attack types and attack configurations/settings). An AE can be treated as an out-of-distribution (o.o.d.) input drawn from a distribution different from the one the benign samples came from. Therefore, the defense can be presented with an effectively unlimited set of inputs (clean, or carrying any of infinitely many possible perturbations). We aim to build a defense that correctly classifies the input in most cases (this is the generalizability of the model), rather than a defense that is accurate only on the benign samples or only on particular type(s) of AEs.
Even if you could find a defense that is optimized against the BS (benign samples) and all existing adversaries at a reasonable computational cost, there are numerous new attacks on the way, and some of them will fool your defense easily. So the cost of involving more adversarial examples in the training phase does not necessarily pay off, and it does not end the arms race between the attacker and the defender.
If any team wants further clarification regarding the comments or disagrees with the grade, you can use this issue to follow up with @MENG2010 or @pooyanjamshidi.