Some updates to make my questions/thoughts clearer.
According to the filename of your OOD data ("ti_1M_lowconfidence_unlabeled.pickle"), it seems that you are not directly using Carmon et al.'s data, which partly answers my first question. However, I'm still curious whether any OOD data can help with robustness.
Since CIFAR10 (ID) is a subset of Tiny Images (OOD), and ImageNet10 (ID) and ImageNet990 (OOD) come from the same ImageNet dataset, what I take from Table 1 is that OOD data that are fairly "close" to the ID data can indeed help with robustness.
The results in Table 2 then touch on the case where the OOD data are "far" from the ID data. However, it is not clear whether OAT still improves overall robustness in this case, since Table 2 doesn't show results against AutoAttack.
Hi, thank you for your interest in our work!
Here, we provide the results for the OAT models (using SVHN, Simpson and Fashion) against AA on CIFAR-10. In these results, we can see that the "far" OOD datasets still improve robustness against the strong adversarial attack, though the improvements are small.

| OOD | None | SVHN | Simpson | Fashion |
|---|---|---|---|---|
| Clean | 87.48 | 86.16 | 86.79 | 85.84 |
| AA | 48.29 | 49.25 | 49.24 | 48.76 |
@Saehyung-Lee Thanks for the response!
Overall, I tend to agree with your point on the intuition behind OAT's effect (on non-robust/robust features), and that's actually why I arrived at the same idea previously. However, the one thing that concerns me is my table results shown in my first comment; I will say more about this in the next point. Meanwhile, regarding Figure 1, I'm a bit confused: I can't see what exactly the training objective of OAT+RST is. OAT should maximize the output entropy on the 500K Tiny Images data, while RST minimizes that entropy by providing pseudo-labels. Could you elaborate on this?
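To make sure we're reading Figure 1 the same way, here is a minimal sketch of how I currently understand the two auxiliary-data terms (PyTorch-style; the function and variable names are mine, not from your repo):

```python
import torch
import torch.nn.functional as F

def oat_aux_loss(logits_aux: torch.Tensor) -> torch.Tensor:
    # OAT term as I understand it: push the predictions on auxiliary (OOD)
    # samples toward the uniform distribution, i.e. maximize output entropy.
    num_classes = logits_aux.size(1)
    log_probs = F.log_softmax(logits_aux, dim=1)
    uniform = torch.full_like(log_probs, 1.0 / num_classes)
    return F.kl_div(log_probs, uniform, reduction="batchmean")

def rst_aux_loss(logits_aux: torch.Tensor, pseudo_labels: torch.Tensor) -> torch.Tensor:
    # RST term: fit the pseudo-labels on the same auxiliary samples,
    # i.e. minimize output entropy around one class per sample.
    return F.cross_entropy(logits_aux, pseudo_labels)
```

Minimizing the first term pulls the auxiliary outputs toward uniform, while minimizing the second pulls them toward a single pseudo-label, so it's unclear to me how OAT+RST reconciles the two on the same batch.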
Thanks for these results against AA. I'm still curious why I observed a gradient-masking-like result in my experiment while you did not. Would you mind sharing your thoughts on this? From what I can tell, there are a few differences between the setting of your Table 1 and my table.
OK, thanks for the clarification!
Hi,
Thanks for releasing the code. I'm particularly interested in this work since I have been doing exactly the same thing recently (incorporating OOD samples into adversarial training), and I have a few questions about some details of your experiments.
1) Regarding the CIFAR experiments, are you using exactly the same 500K Tiny Images data as Carmon et al.?
The reason I'm asking is that, in my experiments with Carmon's 500K Tiny Images as the auxiliary data for CIFAR10, using the uniform distribution instead of pseudo-labels as the targets still provides performance improvements over training without such additional data. What I was doing seems to correspond to your CIFAR10 experiments and aligns well with the CIFAR10 results in Table 1 of your paper, right? However, the 500K images from Carmon et al. are carefully selected to be near- or in-distribution with respect to CIFAR, so I don't think this experiment really supports the claim that OOD data can help with robustness. In fact, it reveals something interesting: when including additional in-distribution data, even rather inaccurate targets can help improve robustness.
So to see whether OOD data can really help, I also tried using Tiny ImageNet as the auxiliary data for adversarial training (compared with the carefully selected 500K Tiny Images, Tiny ImageNet should clearly be more OOD for CIFAR). The results with ResNet18 are shown below (Adv-OOD shares exactly the same form as OAT's training objective).
Interestingly, although incorporating OOD data (Tiny ImageNet) helps with PGD accuracy, it actually yields worse robustness when evaluated against the stronger AA. I don't have a concrete explanation for this yet, but it seems there was some gradient masking in this case.
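In case it helps with reproducing the gap, here is roughly the kind of AA evaluation I mean (a sketch using the public `autoattack` package; `load_model` and `load_cifar10_test` are hypothetical placeholders for the trained model and test tensors):

```python
from autoattack import AutoAttack  # https://github.com/fra31/auto-attack

# Hypothetical placeholders: a trained classifier in eval mode and the
# CIFAR-10 test set (images as tensors in [0, 1], labels as a LongTensor).
model = load_model().eval().cuda()
x_test, y_test = load_cifar10_test()

adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
# Runs APGD-CE, APGD-T, FAB-T, and Square and reports robust accuracy;
# a large gap between PGD and AA accuracy is what made me suspect gradient masking.
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=128)
```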
2) Regarding the ImageNet10 experiments, are you picking the best-performing checkpoint or the last checkpoint when reporting the results?
The reason I'm asking is that the ImageNet10 results seem to contradict what I observed in the table above (since ImageNet990 is indeed OOD for ImageNet10, and based on my results I would not expect it to provide a significant improvement). However, as shown in [1], adversarial training suffers from overfitting, so the test robustness at the end of training can be much lower than the best case (which typically occurs right after the first learning-rate decay and can be captured by simple early stopping). I'm therefore wondering whether the reported adversarial training results are the best case.
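Concretely, by "best case" I mean checkpoint selection of the following kind (a rough sketch; `train_one_epoch` and `pgd_accuracy` are hypothetical helpers standing in for the actual training loop and a PGD evaluation on a held-out validation split):

```python
import copy

# model, optimizer, loaders, and num_epochs come from the usual training setup.
best_robust_acc = 0.0
best_state = None
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader, optimizer)      # one epoch of adversarial training
    robust_acc = pgd_accuracy(model, val_loader,          # robust accuracy on held-out data
                              eps=8 / 255, steps=10)
    if robust_acc > best_robust_acc:                       # keep the best checkpoint seen so far
        best_robust_acc = robust_acc
        best_state = copy.deepcopy(model.state_dict())

model.load_state_dict(best_state)  # report with this checkpoint rather than the last one
```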
Thanks in advance; I look forward to discussing this!
[1] Rice, Leslie, Eric Wong, and Zico Kolter. "Overfitting in adversarially robust deep learning." International Conference on Machine Learning. PMLR, 2020.