Hello,
We report results on the checkpoint with the best PGD-20 robustness. Specifically, we save all checkpoints during training, evaluate their robustness under PGD-20 on the test set, and select the best one. This setting originates from TRADES [1] and is inherited by subsequent papers [2,3,4]. We apply it to all defenses in Table 2 for a fair comparison.
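For concreteness, here is a minimal sketch of this selection loop (assuming a PyTorch `model`, a `test_loader`, and a `device` already exist; the standard PGD hyperparameters ε = 8/255, step size 2/255, and the `checkpoints/epoch_*.pt` layout are illustrative, not the repository's actual code):

```python
import glob
import torch
import torch.nn.functional as F

def pgd20_accuracy(model, loader, device, eps=8/255, alpha=2/255, steps=20):
    """Robust accuracy under L_inf PGD-20 with a random start."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Random start inside the eps-ball, clipped to the valid image range.
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                # Signed gradient step, projected back into the eps-ball.
                x_adv = x_adv + alpha * grad.sign()
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        with torch.no_grad():
            correct += (model(x_adv).argmax(1) == y).sum().item()
            total += y.size(0)
    return correct / total

# Evaluate every saved checkpoint and keep the most robust one.
best_acc, best_path = 0.0, None
for path in sorted(glob.glob("checkpoints/epoch_*.pt")):  # hypothetical layout
    model.load_state_dict(torch.load(path, map_location=device))
    acc = pgd20_accuracy(model, test_loader, device)
    if acc > best_acc:
        best_acc, best_path = acc, path
```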
Hope this addresses your question :)
[1] Theoretically Principled Trade-off between Robustness and Accuracy. In ICML, 2019.
[2] Improving Adversarial Robustness Requires Revisiting Misclassified Examples. In ICLR, 2020.
[3] Boosting Adversarial Training with Hypersphere Embedding. In NeurIPS, 2020.
[4] Bag of Tricks for Adversarial Training. In ICLR, 2021.
Thank you for sharing the details of the selection strategy, but I still have concerns about it. I know that previous papers selected the best checkpoint on the test set, but that checkpoint may potentially overfit to the test set. As far as I know, a proper strategy is to split a validation set off from the training set and select the best checkpoint on that validation set. What I want to say is that the test set should be hidden until the final evaluation. Am I making the problem too complicated?
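Something like this is what I mean (a sketch only; the 45k/5k split, the fixed seed, and reusing the `pgd20_accuracy` helper sketched above are my assumptions, not a prescription):

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10("data", train=True, download=True,
                              transform=transforms.ToTensor())
# Hold out 5,000 of the 50,000 training images as a validation set;
# the test set stays hidden until the final evaluation.
train_set, val_set = random_split(
    full_train, [45000, 5000],
    generator=torch.Generator().manual_seed(0))  # fixed seed for reproducibility
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128, shuffle=False)

# Checkpoint selection would then run on val_loader (e.g. with the
# pgd20_accuracy helper above), touching the test set only once at the end.
```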
Hi authors,
I'm trying to reproduce the experimental results.
I followed the instructions in `README.md` and trained the model with the command `python trades_AWP/train_trades_cifar.py`.
Then, I evaluated the last epoch's weights with `autoattack`, and the robust accuracy is about 55.50~55.80%, which is worse than the 56.17% on the leaderboard. Could you explain which checkpoint was selected?
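For reference, the evaluation was roughly as follows (a sketch using the `autoattack` package's standard API; the `build_model` helper, the checkpoint filename, and ε = 8/255 are my assumptions, not necessarily the settings behind the leaderboard number):

```python
import torch
from torchvision import datasets, transforms
from autoattack import AutoAttack  # pip install git+https://github.com/fra31/auto-attack

device = torch.device("cuda")
model = build_model().to(device)  # hypothetical helper; architecture omitted
model.load_state_dict(torch.load("model-last.pt", map_location=device))  # hypothetical filename
model.eval()

test_set = datasets.CIFAR10("data", train=False, download=True,
                            transform=transforms.ToTensor())
x_test = torch.stack([x for x, _ in test_set])
y_test = torch.tensor(test_set.targets)

# Standard AutoAttack suite (APGD-CE, APGD-T, FAB-T, Square) at L_inf, eps = 8/255;
# run_standard_evaluation batches internally and reports robust accuracy.
adversary = AutoAttack(model, norm="Linf", eps=8/255, version="standard")
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=128)
```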
The following are the packages used, from conda: