fra31 / auto-attack

Code relative to "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

Slow evaluation on a dataset with a large number of classes in TF1.x setting #18

Closed by halo8218 4 years ago

halo8218 commented 4 years ago

Thank you for releasing the evaluation code. AA evaluation is very helpful for studying adversarial robustness, especially on the CIFAR-10 benchmark. However, when I tried to evaluate my model (trained on CIFAR-100, TF 1.13), the code took far too long to run. For long stretches of the run, GPU utilization stays at 0%. How can I solve this problem?

fra31 commented 4 years ago

Hi,

which version of AA are you using? The initial one contained the untargeted version of FAB, which requires building and computing the full Jacobian matrix; this is very expensive for datasets with many classes (the original implementation of the targeted version had the same issue). We then restructured AA a bit, so that it now contains only FAB-T (the targeted version) with an improved implementation (see also the README.md for details). In our experiments on CIFAR-100 and ImageNet we run it against the top-9 classes (after excluding the correct one) predicted by the classifier for the clean point. We use the same strategy for APGD-DLR (targeted version with 9 classes). The number of classes can be set e.g. here.
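As a back-of-the-envelope sketch of why this matters (the function below is purely illustrative and not part of AA's API), the gradient cost per FAB iteration scales with the number of decision boundaries that are linearized: the untargeted version needs the full Jacobian, i.e. one backward pass per class, while FAB-T only needs one per targeted run, independent of the total number of classes.

```python
def fab_backward_passes(n_classes, n_target_classes=None):
    """Rough count of backward passes per FAB iteration (illustrative only).

    Untargeted FAB linearizes the decision boundary w.r.t. every class,
    so it needs the full Jacobian: one backward pass per class.
    FAB-T linearizes only w.r.t. its single target class per run, so with
    k targeted runs the cost is k, regardless of how many classes exist.
    """
    if n_target_classes is None:
        return n_classes          # untargeted: full Jacobian
    return n_target_classes       # targeted: one pass per target run

# CIFAR-100, untargeted: 100 passes per iteration
print(fab_backward_passes(100))                      # 100
# CIFAR-100, FAB-T against the top-9 classes: 9 passes,
# i.e. roughly the same cost as untargeted FAB on CIFAR-10
print(fab_backward_passes(100, n_target_classes=9))  # 9
```

This is why, with FAB-T and 9 target classes, the runtime no longer grows with the number of classes in the dataset.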

In this way, the runtime on CIFAR-100 should be roughly equivalent to that on CIFAR-10, since the input dimension is the same.

In general, in order to use TF models, one has to convert the PyTorch tensors to numpy arrays (on the CPU) and back for every call of the classifier. This might be one reason for the drops in GPU utilization. However, as mentioned above, this effect should be mitigated by not generating and computing the full Jacobian (the improvement in runtime should be significant even on CIFAR-10).
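A minimal sketch of that round trip (the wrapper class and its name are hypothetical, not the repo's actual `utils_tf` adapter; the numpy-only forward stands in for e.g. a TF1.x `sess.run` call):

```python
import torch


class NumpyModelWrapper:
    """Hypothetical wrapper: exposes a classifier that only accepts numpy
    arrays (e.g. a TF1.x session.run) as a torch-callable model.

    Every call moves the batch GPU -> CPU -> numpy and the logits back
    again, which is the kind of synchronization that can show up as
    drops to 0% GPU utilization during the attack.
    """

    def __init__(self, numpy_forward):
        # numpy_forward: callable taking a numpy batch, returning numpy logits
        self.numpy_forward = numpy_forward

    def __call__(self, x):
        x_np = x.detach().cpu().numpy()        # torch tensor -> numpy (CPU)
        logits_np = self.numpy_forward(x_np)   # e.g. sess.run(logits, feed_dict=...)
        return torch.from_numpy(logits_np).to(x.device)  # numpy -> torch
```

Usage would look like `model = NumpyModelWrapper(my_tf_forward)` and then passing `model` where a torch classifier is expected; the overhead is per-call, so fewer gradient computations (as with FAB-T) also mean fewer of these CPU round trips.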

Hope this helps! If not, please let me know and I'll investigate this issue further.