Closed: sayakpaul closed this issue 3 years ago.
Hi @sayakpaul. Yes, the number of classes used by DeepFool is by default 10, which was also used for reporting the results in the paper.
Okay. Thanks. I have a couple more questions:
Is there a way to run compute_margin_distribution() on multiple GPUs? I did use nn.DataParallel, but I believe that, because of the sequential nature of this block, nn.DataParallel might not be sufficient.

Unfortunately, we do not have an nn.DataParallel implementation, as we could not run these experiments on multiple GPUs. However, it should be relatively easy to modify the code to support this feature if you recode DeepFool as a batched implementation. You can take some inspiration from the Foolbox and ART implementations.
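To make the suggestion concrete, here is a minimal, hypothetical sketch of what a batched DeepFool could look like in PyTorch. This is not the repository's code: the function name, defaults, and stopping rule are assumptions for illustration. The key point is that the whole batch goes through a single forward/backward pass per candidate class, so a wrapper like nn.DataParallel can split the work across GPUs.

```python
import torch

def deepfool_batched(model, x, num_classes=10, overshoot=0.02, max_iter=50):
    """Hypothetical batched DeepFool sketch (not the repo's implementation).

    Perturbs all samples in the batch simultaneously instead of looping
    over images one at a time, which is what makes multi-GPU wrappers
    such as nn.DataParallel effective.
    """
    x_adv = x.clone().detach()
    with torch.no_grad():
        logits0 = model(x)
    orig = logits0.argmax(dim=1)                       # original predictions
    # restrict the search to the `num_classes` highest-scoring classes
    candidates = logits0.topk(num_classes, dim=1).indices
    done = torch.zeros(x.size(0), dtype=torch.bool, device=x.device)
    expand = (-1,) + (1,) * (x.dim() - 1)              # for broadcasting

    for _ in range(max_iter):
        x_adv = x_adv.detach().requires_grad_(True)
        logits = model(x_adv)
        done |= logits.argmax(dim=1) != orig           # sample already fooled
        if done.all():
            break

        idx = torch.arange(x.size(0), device=x.device)
        f_orig = logits[idx, orig]
        best_pert = torch.full((x.size(0),), float("inf"), device=x.device)
        best_dir = torch.zeros_like(x)
        for k in range(num_classes):
            cls = candidates[:, k]
            diff = logits[idx, cls] - f_orig           # f_k(x) - f_orig(x)
            # samples are independent, so the gradient of the batch sum
            # equals each sample's own per-sample gradient
            (grad,) = torch.autograd.grad(diff.sum(), x_adv, retain_graph=True)
            gnorm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
            pert = diff.abs() / gnorm                  # distance to boundary k
            better = (pert < best_pert) & (cls != orig)
            best_pert = torch.where(better, pert, best_pert)
            best_dir = torch.where(better.view(expand),
                                   grad / gnorm.view(expand), best_dir)

        # step just past the nearest decision boundary, only for
        # samples that have not been fooled yet
        step = (best_pert + 1e-4).view(expand) * best_dir
        x_adv = x_adv.detach() + (1 + overshoot) * step * (~done).view(expand)
    return x_adv.detach()
```

With this structure, wrapping the model in nn.DataParallel splits each batched forward and backward pass across the available GPUs, which a per-image loop cannot exploit.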
Regarding the ImageNet experiments, we do not remember the exact timings, but for the settings in the paper (1,000 samples and a few tens of subspaces), it took less than a day per network on a single Titan X. In practice, we could not observe any difference in the margin trends when we varied the number of evaluation samples, and hence we never saw the need to run the margin computation at such a large scale.
Is it 10 by default? What value was used to report the results in the paper?