Closed JoshuaCN closed 4 years ago
Hi @JoshuaCN, thank you for using ART and raising this issue! It seems there is a problem with vanishing gradients: when I compute, with the `TensorFlowV2Classifier`,
`clg = classifier.loss_gradient(x_test[k][None, ...], np.eye(1, 10, y_test[k]))`
for a test input `k` on which the attack fails, it turns out the entries of `clg` are all `0.0`. Thus, FGSM (which uses the loss gradients to compute the adversarial example) will not alter the input and fails. In fact, the test inputs for which this occurs are all classified with class probabilities very close to 1.0, leading to the (almost) vanishing gradients.
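To illustrate the suspected mechanism, here is a minimal numpy sketch (illustrative logit values, not ART or notebook code): once the model is confident enough, the losing-class probabilities underflow to exactly 0.0 in float32, and the standard softmax backward pass then returns an exactly-zero gradient everywhere.

```python
import numpy as np

def softmax_f32(z):
    """Softmax computed end-to-end in float32, like a float32 model."""
    z = np.asarray(z, dtype=np.float32)
    e = np.exp(z - z.max())
    return e / e.sum()

# A very confident prediction: the winning logit dominates by more than ~100,
# so exp(-110) underflows to exactly 0.0 in float32.
p = softmax_f32([110.0, 0.0, 0.0])
print(p)  # [1. 0. 0.]

# Standard softmax backward pass for an upstream gradient g:
#   dz_j = p_j * (g_j - <g, p>)
# With loss = -log(p_true), the upstream gradient is g = [-1/p_true, 0, 0] = [-1, 0, 0]:
g = np.array([-1.0, 0.0, 0.0], dtype=np.float32)
dz = p * (g - np.dot(g, p))
print(dz)  # [0. 0. 0.] -- exactly zero everywhere, so FGSM has no direction to move
```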
On the other hand, when computing the loss gradients via the `KerasClassifier`, they are non-zero (albeit very small, `< 1e-10`).
We will check whether those numerical differences are due to implementation differences on the ART side, or to differences in numerical precision provided by the two frameworks.
Hi @JoshuaCN Thank you very much for using ART and raising this issue! I would be very interested to reproduce and learn more about it.
Could you please let us know which Keras you used (keras or tensorflow.keras, and which version) and which release candidate of TensorFlow 2.2.0 you used?
I was not able to open the Colab link above, can you check if it is working?
Hi @beat-buesser Thank you for your reply, and sorry for my delay. I used tensorflow.keras, and the release candidate version is rc2.
Directly clicking the link leads to a blank page; instead, I copy the Colab link and open it in a new tab, which works for me.
Anyway, I have attached the file here in case the link is invalid for others.
Thank you very much! This is very helpful.
I have run experiments with your notebook and I think the difference in behaviour is caused by TensorFlow, not by ART. `KerasClassifier` uses `tensorflow.keras.backend.gradients` to calculate loss gradients, whereas `TensorFlowV2Classifier` uses `tf.GradientTape`. It seems that the softmax activation in the last layer of the model can be treated numerically differently by these two gradient-calculation methods. This agrees with currently ongoing discussions on GitHub (https://github.com/tensorflow/tensorflow/issues/32895#issuecomment-614813600 and https://github.com/tensorflow/tensorflow/issues/35585#issuecomment-615413203).
To test this, I changed the Colab notebook to run with a model predicting logits instead of probabilities, and the vanishing gradients, which previously appeared much earlier with `TensorFlowV2Classifier`, have disappeared. Both classifiers now produce similarly strong adversarial examples and similar adversarial accuracy.
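Concretely, the change amounts to something like the following sketch (the model architecture shown is a placeholder, not the notebook's actual model, and the ART import path may vary between versions): drop the final softmax activation so the model outputs raw logits, and construct the loss with `from_logits=True`.

```python
import tensorflow as tf
from art.estimators.classification import TensorFlowV2Classifier  # path may differ by ART version

# Placeholder MNIST model: note there is NO softmax on the last layer,
# so the model outputs raw logits.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),  # logits, no activation
])

# `from_logits=True` lets TF use the numerically stable fused
# softmax-cross-entropy instead of differentiating through a
# (possibly saturated) softmax output.
classifier = TensorFlowV2Classifier(
    model,
    nb_classes=10,
    input_shape=(28, 28, 1),
    clip_values=(0, 1),
    loss_object=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```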
Thank you so much! Now I'm getting expected results.
@beat-buesser Do you have an example of the fixed Colab notebook?
@ed1d1a8d , would you have a link to the colab notebook you referenced above?
Describe the bug Hi, I recently attacked an MNIST model with FGSM. When `KerasClassifier` is used, the attack works properly (acc = 16.91%); however, I noticed a high adversarial accuracy (77.6%) and lots of unchanged images when using `TensorFlowV2Classifier`. I have read the related issue #279, but it didn't help.
To Reproduce Here is the colab notebook to reproduce the results. https://colab.research.google.com/drive/1ZHEzLy3SRdaZflYOImVFOlWnH8GbWF_v
The `KerasClassifier` is created with:

```python
classifier = KerasClassifier(model, clip_values=(0, 1))
```

The `TensorFlowV2Classifier` is created with:

```python
classifier = TensorFlowV2Classifier(model, 10, input_shape, clip_values=(0, 1),
                                    loss_object=tf.losses.SparseCategoricalCrossentropy())
```

The attack is initialised and run with:

```python
FGM_params = {'eps': 0.3, 'norm': np.inf, 'batch_size': 200, 'num_random_init': 0}
adversary = FastGradientMethod(classifier, **FGM_params)
x_adv = adversary.generate(x_test)
```
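As a side note on why a failed attack shows up as unchanged images: FGSM perturbs each pixel by `eps` times the sign of the loss gradient, so an exactly-zero gradient yields zero perturbation, while even a tiny non-zero gradient yields a full `eps`-sized step. A minimal numpy sketch (with the `eps` and clip values used above):

```python
import numpy as np

def fgsm_step(x, grad, eps=0.3, clip=(0.0, 1.0)):
    """One FGSM step: move eps in the direction of the loss-gradient sign."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, *clip)

x = np.full(4, 0.5)

# Even a tiny non-zero gradient produces a full-size step per pixel:
perturbed = fgsm_step(x, np.array([1e-12, -1e-12, 1e-12, -1e-12]))
print(perturbed)  # [0.8 0.2 0.8 0.2]

# An exactly-zero gradient (the vanishing case) leaves the image untouched:
unchanged = fgsm_step(x, np.zeros(4))
print(unchanged)  # [0.5 0.5 0.5 0.5]
```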
Expected behavior I expected the attack to perform similarly across the two frameworks.
Screenshots
![KerasClassifier](https://user-images.githubusercontent.com/37404028/79119560-38027000-7dc3-11ea-8e67-046d8b5a2207.png)
System information: