bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

same code generating different outputs #246

Closed by wjadvos 5 years ago

wjadvos commented 5 years ago

When I run the same code multiple times, I get different values for the L2 norm of the adversarial perturbation (using DeepFoolAttack with the L2 norm). However, when I run the attack repeatedly inside a for loop, I get the same values every time. I looked through the source code to find where this randomness is generated, but I couldn't find it. Can anyone explain, please?

wielandbrendel commented 5 years ago

There is no randomness in the DeepFool attack. Could you post a minimum example and the exact values?

wjadvos commented 5 years ago

Oh yeah, now I see. I'm sorry, there is a mistake in my code that could be the source of the randomness. I am looking for a minimal adversarial example (one with the smallest possible L2 norm). To what degree can I assume that the norm of the perturbations returned by the attacks in Foolbox corresponds to the smallest possible perturbation that flips the classification (for example, for DeepFoolAttack)?

wielandbrendel commented 5 years ago

Basically all attacks in Foolbox are written so that each one returns the smallest adversarial perturbation that this particular attack is able to find.

wjadvos commented 5 years ago

So if I want the minimal possible adversarial attack tout court (to state it precisely: the input with the smallest L2 difference from the original input that changes the classification), I should run the different algorithms and then take the minimum?

wielandbrendel commented 5 years ago

Exactly! Check out our recent paper, which demonstrates what I personally believe a good robustness evaluation should look like: https://arxiv.org/abs/1805.09190
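For readers who want to see the "run several attacks, keep the minimum" idea spelled out, here is a minimal sketch in plain Python. It does not use the Foolbox API: the attack names, the `candidates` dictionary, and both helper functions are hypothetical, and inputs are assumed to be flat lists of floats. A failed attack is represented by `None`. Note that the result is only an upper bound on the true minimal perturbation, since no attack is guaranteed to find the closest adversarial input.

```python
import math


def l2_distance(x, x_adv):
    """Euclidean (L2) distance between two flat sequences of floats."""
    if len(x) != len(x_adv):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_adv)))


def smallest_adversarial(x, candidates):
    """Pick the candidate closest to x in L2 distance.

    `candidates` maps a (hypothetical) attack name to the adversarial
    input that attack returned, or None if the attack failed.
    Returns (attack_name, x_adv, distance), or None if every attack failed.
    """
    best = None
    for name, x_adv in candidates.items():
        if x_adv is None:
            continue  # this attack found no adversarial input
        d = l2_distance(x, x_adv)
        if best is None or d < best[2]:
            best = (name, x_adv, d)
    return best


# Toy data standing in for the outputs of three different attacks.
x = [0.0, 0.0, 0.0]
candidates = {
    "attack_a": [0.3, 0.4, 0.0],
    "attack_b": [0.1, 0.0, 0.0],
    "attack_c": None,  # assumed to have failed
}

best = smallest_adversarial(x, candidates)
print(best[0], best[2])
```

In practice each entry in `candidates` would come from running one attack on the same input and model, after confirming that the returned input really is misclassified.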

wjadvos commented 5 years ago

Thanks a lot! But I should mention: I am not looking to evaluate the robustness of the model, but rather the robustness of a single classification made by the model (which puts higher demands on how close the output of these algorithms is to the smallest adversarial perturbation).

wielandbrendel commented 5 years ago

The same reasoning holds ;-). Good luck!