This issue was partially explained by the fact that the predictions given to the criterion need to be the pre-softmax output values (logits) of the neural network, not probabilities (see #158).
The `TargetClassProbability` criterion seems to behave strangely in some cases. For example, with FashionMNIST, an attack using this criterion gives an adversarial example with a classification probability of 83%.
But then, in the exact same code, only raising the criterion's target class probability to 0.5 causes the attack to fail (while it should theoretically work for any value up to at least 0.83).
Is this expected?
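
A minimal sketch of why the pre-softmax requirement (see #158) could produce this behaviour, assuming the criterion applies a softmax to whatever predictions it receives. The `softmax` and `meets_target` helpers and the logit values are hypothetical and written here for illustration only; they are not the library's code. If the model already outputs probabilities, the criterion would end up comparing a doubly normalized value against the threshold, and for 10 classes that value can never exceed e / (e + 9), roughly 0.232, so a target of 0.5 would be unreachable.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def meets_target(probabilities, target_class, threshold):
    """Hypothetical stand-in for the criterion's check."""
    return probabilities[target_class] > threshold


# Hypothetical logits for a 10-class problem such as FashionMNIST,
# strongly favouring class 0 (the target class in this sketch).
logits = [6.0] + [0.0] * 9

# Case 1: the criterion receives logits and normalizes them once.
probs = softmax(logits)

# Case 2: the model already returned probabilities, and the criterion
# (by assumption) applies softmax a second time.
double = softmax(probs)

print("single softmax:", probs[0], meets_target(probs, 0, 0.5))
print("double softmax:", double[0], meets_target(double, 0, 0.5))
```

In this sketch the single-softmax probability clears the 0.5 threshold easily, while the double-softmax value stays below it no matter how confident the model is. That would be consistent with a low threshold succeeding and a threshold of 0.5 failing.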