yaodongyu / TRADES

TRADES (TRadeoff-inspired Adversarial DEfense via Surrogate-loss minimization)
MIT License
518 stars 123 forks

Robust Accuracy on CIFAR10 #9

Closed tartarleft closed 5 years ago

tartarleft commented 5 years ago

Hi, I ran train_trades_cifar10.py directly with '--beta 6.0' twice, but I could not reach the adversarial accuracy reported in your paper; my final result is only about 49%. I wonder whether some other detail, such as a training-set partition, is needed to reach the reported performance. Thank you!

yaodongyu commented 5 years ago

Thanks for your interest in our work.

To diagnose the issue, I have two questions about your setup:

  1. Did you run the code train_trades_cifar10.py without modifying any parameters?
  2. Which checkpoint did you use for robust accuracy evaluation?
tartarleft commented 5 years ago

Thank you for your reply. I used the default parameters in the code, '--batch-size 128 --epochs 100 --wd 2e-4 --lr 0.1 --momentum 0.9 --epsilon 0.031 --step-size 0.007', with beta set via '--beta 6.0'. The adversarial accuracy of 49% was reached at epoch 100. I also tested the checkpoints at epochs 80 and 90 and got adversarial accuracies of about 51.47% and 52.77% respectively (I set log_interval to 10). So should I keep these parameters and monitor every checkpoint to reach the best adversarial accuracy, around 56%?
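The monitoring strategy described above amounts to evaluating each saved checkpoint and keeping the one with the highest robust accuracy. A minimal sketch (not code from the repo; the numbers are the ones reported in this thread):

```python
# Illustrative sketch: map each evaluated checkpoint epoch to its measured
# adversarial (robust) accuracy, then keep the best one.
robust_acc = {80: 51.47, 90: 52.77, 100: 49.00}  # epoch -> adv. accuracy (%)

best_epoch = max(robust_acc, key=robust_acc.get)
print(best_epoch)  # → 90
```

In practice each entry would come from running the repo's PGD evaluation script on the corresponding saved checkpoint.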

yaodongyu commented 5 years ago

Adversarial training requires early stopping, just like natural training.

tartarleft commented 5 years ago

Thank you again.

yaircarmon commented 5 years ago

Thank you for clarifying that you performed early stopping when obtaining the publicly available model - I ran into a similar issue when trying to reproduce it.

Could you please explain how you chose the early stopping time? As far as I could see this is not mentioned in the code or the paper.

As an aside, early stopping is not required and not used for natural training in modern CIFAR-10 models. As far as I know, it wasn't used in prior work on adversarial training either. It is interesting and worth explicitly mentioning that TRADES requires it.

yaodongyu commented 5 years ago

Thanks for raising this.

In our experiments, we use early stopping after the first learning rate decay.
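The rule above can be sketched as a small selection loop: only checkpoints at or after the first learning-rate decay are candidates, and the one with the best robust accuracy wins. This is a hypothetical illustration, not the authors' code; the decay epoch and the accuracy history below are assumptions (check the LR schedule in train_trades_cifar10.py for the real decay point):

```python
# Hypothetical sketch of early stopping tied to the LR schedule.
FIRST_DECAY_EPOCH = 75  # assumption; verify against the repo's schedule

# (epoch, robust accuracy %) pairs -- illustrative values only
history = [(74, 47.0), (76, 53.0), (80, 51.5), (90, 52.8), (100, 49.0)]

best_acc, best_epoch = 0.0, None
for epoch, robust_acc in history:
    if epoch < FIRST_DECAY_EPOCH:
        continue  # checkpoints before the first decay are not candidates
    if robust_acc > best_acc:
        best_acc, best_epoch = robust_acc, epoch

print(best_epoch, best_acc)  # → 76 53.0
```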