Closed: kaustubhsridhar closed this issue 3 years ago.
Hi,
Glad to hear that you find our work useful!
Unfortunately, there's no single seed used for all models, and for many of them it was a random one. This is for several reasons, e.g. the code has been updated over time, and some evaluations come from the authors and we just reran them. In my experience, for standard defenses without randomization, the variance between different runs is very small, and often the same robust accuracy is found. Do you notice larger variations?
Hi,
Thank you for the quick reply. :)
I noticed a not-so-small variation with TRADES (Zhang et al., 2019): after retraining the WRN-34-10, I get 51.70% adversarial accuracy instead of the 53.08% reported on RobustBench. Part of this could be because I'm working with epsilon = 8.0/255 (≈ 0.0314) instead of 0.031, but maybe it's also because of the random seed?
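For reference, this is roughly how I pass the two budgets to the attack (a minimal sketch, assuming the public AutoAttack constructor; the placeholder model below just stands in for my retrained WRN-34-10):

```python
import torch.nn as nn
from autoattack import AutoAttack

eps_trades = 0.031        # budget used in the original TRADES evaluation
eps_mine = 8.0 / 255.0    # ~0.0314, the slightly larger budget I used

# Placeholder standing in for my retrained WRN-34-10 (hypothetical, for illustration).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()

adversary = AutoAttack(model, norm='Linf', eps=eps_mine, version='standard')
```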
Thanks
Using the slightly larger epsilon definitely has an impact, which might already close the gap. Also, I think the randomness in retraining the model might significantly influence the robustness. From what I saw, different runs of AutoAttack might have small fluctuations on the order of 0.02-0.03%.
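If it helps to quantify the evaluation-side part, one rough sketch is to rerun the standard attack with a few different seeds and compare the resulting robust accuracies. This assumes the public AutoAttack interface and the RobustBench loaders; the model id and the `threat_model` keyword are from memory (older robustbench versions take `norm=` instead), so please double-check them:

```python
import torch
from autoattack import AutoAttack
from robustbench.data import load_cifar10
from robustbench.utils import load_model

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# TRADES checkpoint from the Model Zoo; id from memory, please verify.
model = load_model(model_name='Zhang2019Theoretically',
                   dataset='cifar10', threat_model='Linf').to(device).eval()
x_test, y_test = load_cifar10(n_examples=1000)  # subset, just to see the fluctuation

for seed in (0, 1, 2):
    adversary = AutoAttack(model, norm='Linf', eps=8 / 255,
                           version='standard', seed=seed, device=device)
    x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=250)
    with torch.no_grad():
        acc = (model(x_adv.to(device)).argmax(1) == y_test.to(device)).float().mean()
    print(f'seed={seed}: robust accuracy = {acc.item():.4f}')
```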
Thanks for the numbers. Changing the epsilon does have an impact, but randomness from retraining prevents me from getting the exact numbers on RobustBench. Thanks again.
Hi, I am a big fan of AutoAttack and RobustBench. The centralization/standardization of adversarial robustness evaluation is so helpful. :)
I'm working on a new approach to adversarial robustness and am evaluating it with AutoAttack.
Unfortunately, I can't exactly replicate the values on RobustBench with a random seed of 0 or 1. Could you please share the random seed you use for the numbers on the leaderboard?
Thanks
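For context, this is roughly how I'm running the evaluation when I try seeds 0 and 1 (just a sketch, under the assumption that the seed is passed to the AutoAttack constructor; `my_model` is a placeholder for my own defended network):

```python
import torch
from autoattack import AutoAttack
from robustbench.data import load_cifar10

device = 'cuda' if torch.cuda.is_available() else 'cpu'
x_test, y_test = load_cifar10(n_examples=10000)

# Placeholder for my own defended network (hypothetical, for illustration only).
my_model = torch.nn.Sequential(torch.nn.Flatten(),
                               torch.nn.Linear(3 * 32 * 32, 10)).to(device).eval()

adversary = AutoAttack(my_model, norm='Linf', eps=8 / 255,
                       version='standard', seed=0, device=device)
adversary.run_standard_evaluation(x_test, y_test, bs=250)
```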