Open ines21 opened 3 years ago
If I am not mistaken, this does not pertain only to CW, correct?
This should definitely be mentioned in the documentation, and it might even be worthwhile to include a heuristic check when initializing an fmodel to ensure that the final layer does not resemble softmax. I remember doing something similar (in a different context) when comparing several pretrained models to ensure that they all gave me logits.
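A rough sketch of what such a heuristic check could look like, written against plain NumPy arrays rather than any Foolbox API. The function name `looks_like_probabilities` and the tolerance are assumptions made for illustration. The idea is simply that softmax outputs are non-negative and sum to 1 along the class axis, while raw logits almost never do.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax along the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def looks_like_probabilities(outputs, atol=1e-4):
    """Heuristic (hypothetical helper): True if every row is non-negative
    and sums to roughly 1, which suggests the model ends in a softmax."""
    outputs = np.asarray(outputs, dtype=np.float64)
    if outputs.ndim != 2:
        raise ValueError("expected an array of shape (batch, classes)")
    non_negative = np.all(outputs >= 0.0)
    rows_sum_to_one = np.allclose(outputs.sum(axis=1), 1.0, atol=atol)
    return bool(non_negative and rows_sum_to_one)


# Example model outputs for a batch of two inputs and three classes.
raw_logits = np.array([[2.0, -1.0, 0.5], [0.3, 4.0, -2.5]])
probabilities = softmax(raw_logits)

logits_flagged = looks_like_probabilities(raw_logits)
probs_flagged = looks_like_probabilities(probabilities)
```

This can only be a warning, not a hard error: logits could in principle happen to be non-negative and sum to 1 on a given batch, so running the check on a few different inputs would reduce false alarms.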
I agree that the documentation should be more concrete in that regard. I suppose it makes sense to overhaul the documentation soon. We can collect these ideas in #654.
Yep @jangop, that is what I ended up implementing myself. It has been a long time since I worked on this project, but if I remember correctly, lots of other attacks were working well for softmax models. C&W was the one for which this was crucial.
Running the Carlini and Wagner attack, I was having less success than the paper reported. I noticed that the implementation in Foolbox was using the final normalised predictions instead of the unnormalised logits, which makes the attack less effective than it is supposed to be (especially against defensive distillation).
It may be up to the person running the attack to pass a logits model, but it is probably still worth mentioning in the documentation.
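To illustrate why the distinction matters, here is a small NumPy sketch of a C&W-style margin (score of one class minus the best other score) computed on logits, on softmax probabilities, and on log-probabilities. The helper names are made up for this example and it is not Foolbox's implementation. On probabilities the margin is squashed into [-1, 1], whereas the log of the probabilities differs from the logits only by a per-sample constant, so the margin is recovered.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def margin(scores, label):
    """C&W-style margin: score of `label` minus the largest other score."""
    others = np.delete(scores, label)
    return scores[label] - others.max()


logits = np.array([4.0, 1.0, -2.0])
probs = softmax(logits)

# Hypothetical workaround when only probabilities are available: take the
# log, clipping to avoid log(0). log(softmax(z)) = z - logsumexp(z), so
# differences between classes match the differences between logits.
pseudo_logits = np.log(np.clip(probs, 1e-30, None))

margin_logits = margin(logits, 0)         # difference of raw logits
margin_probs = margin(probs, 0)           # bounded, always within [-1, 1]
margin_pseudo = margin(pseudo_logits, 0)  # matches the logit margin
```

The log workaround breaks down when the softmax saturates so strongly that probabilities underflow to exactly 0, which is the situation defensive distillation tends to produce. In that case the original logits cannot be recovered and passing a true logits model is the safer option.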