fra31 / auto-attack

Code relative to "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

Slow evaluation on a dataset with a large number of classes in TF1.x setting #18

Closed by halo8218 4 years ago

halo8218 commented 4 years ago

Thank you for releasing the evaluation code. AA evaluation is very helpful for studying adversarial robustness, especially on the CIFAR-10 benchmark. However, when I tried to evaluate my model (trained on CIFAR-100, TF 1.13), the code took far too long to run. For long stretches of the run, GPU utilization stays at 0%. How can I solve this problem?

fra31 commented 4 years ago

Hi,

which version of AA are you using? The initial one contained the untargeted version of FAB, which requires building and computing the full Jacobian matrix; this is very expensive for datasets with many classes (the original implementation of the targeted version had the same issue). We then restructured AA a bit, so that it now contains only FAB-T (the targeted version) with an improved implementation (see also the README.md for details). In our experiments on CIFAR-100 and ImageNet we run it against the top-9 classes (after excluding the correct one) predicted by the classifier for the clean point. We use the same strategy for APGD-DLR (targeted version with 9 classes). The number of classes can be set e.g. here.
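As a back-of-the-envelope sketch of why this matters (the function below is purely illustrative and not part of AA's API), the gradient cost per FAB iteration scales with the number of decision boundaries that are linearized: the untargeted version needs the full Jacobian, i.e. one backward pass per class, while FAB-T only needs one per targeted run, independent of the total number of classes.

```python
def fab_backward_passes(n_classes, n_target_classes=None):
    """Rough count of backward passes per FAB iteration (illustrative only).

    Untargeted FAB linearizes the decision boundary w.r.t. every class,
    so it needs the full Jacobian: one backward pass per class.
    FAB-T linearizes only w.r.t. its single target class per run, so with
    k targeted runs the cost is k, regardless of how many classes exist.
    """
    if n_target_classes is None:
        return n_classes          # untargeted: full Jacobian
    return n_target_classes       # targeted: one pass per target run

# CIFAR-100, untargeted: 100 passes per iteration
print(fab_backward_passes(100))                      # 100
# CIFAR-100, FAB-T against the top-9 classes: 9 passes,
# i.e. roughly the same cost as untargeted FAB on CIFAR-10
print(fab_backward_passes(100, n_target_classes=9))  # 9
```

This is why, with FAB-T and 9 target classes, the runtime no longer grows with the number of classes in the dataset.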

In this way, the runtime on CIFAR-100 should be roughly equivalent to that on CIFAR-10, since the input dimension is the same.

In general, in order to use TF models, one has to convert the PyTorch tensors to numpy arrays (on the CPU) and back for every call of the classifier. This might be one reason for the drops in GPU utilization. However, as mentioned above, this effect should be mitigated by not generating and computing the full Jacobian (the improvement in runtime should be significant even on CIFAR-10).
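A minimal sketch of that round trip (the wrapper class and its name are hypothetical, not the repo's actual `utils_tf` adapter; the numpy-only forward stands in for e.g. a TF1.x `sess.run` call):

```python
import torch


class NumpyModelWrapper:
    """Hypothetical wrapper: exposes a classifier that only accepts numpy
    arrays (e.g. a TF1.x session.run) as a torch-callable model.

    Every call moves the batch GPU -> CPU -> numpy and the logits back
    again, which is the kind of synchronization that can show up as
    drops to 0% GPU utilization during the attack.
    """

    def __init__(self, numpy_forward):
        # numpy_forward: callable taking a numpy batch, returning numpy logits
        self.numpy_forward = numpy_forward

    def __call__(self, x):
        x_np = x.detach().cpu().numpy()        # torch tensor -> numpy (CPU)
        logits_np = self.numpy_forward(x_np)   # e.g. sess.run(logits, feed_dict=...)
        return torch.from_numpy(logits_np).to(x.device)  # numpy -> torch
```

Usage would look like `model = NumpyModelWrapper(my_tf_forward)` and then passing `model` where a torch classifier is expected; the overhead is per-call, so fewer gradient computations (as with FAB-T) also mean fewer of these CPU round trips.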

Hope this helps! If not, please let me know and I'll investigate this issue further.