Closed kotleta2007 closed 3 years ago
Hi @kotleta2007 Thank you very much for using ART!
Are you referring to the argument `epsilon` of `DeepFool`? Can you share the code section where you define the attack?
Hello!
Indeed, I was referring to the `epsilon` argument.
In the following code, I attack a simple fully connected MNIST classifier. We can see that even though the `epsilon` parameter was set to 0.2, all samples have a much higher distance from the original than that:
```python
import tensorflow as tf
import numpy as np
from art.estimators.classification import TensorFlowV2Classifier
from art.attacks.evasion import DeepFool

mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    loss='SparseCategoricalCrossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

model.fit(X_train, y_train, epochs=10)
model.evaluate(X_test, y_test)

ART_classifier = TensorFlowV2Classifier(
    model=model,
    nb_classes=10,
    input_shape=(28, 28),
    loss_object=tf.keras.losses.SparseCategoricalCrossentropy(),
    clip_values=(0, 1)
)

attack = DeepFool(classifier=ART_classifier, epsilon=0.2)
SAMPLE_SIZE = 10
X_test_adv = attack.generate(X_test[:SAMPLE_SIZE])

_, adv_accuracy = model.evaluate(X_test_adv, y_test[:SAMPLE_SIZE])
print('Accuracy on adversarial test data: {:4.2f}%'.format(adv_accuracy * 100))

for i in range(SAMPLE_SIZE):
    print("l2 distance from original: {}".format(np.linalg.norm(X_test_adv[i] - X_test[i], ord=2)))
```
Output:
```
l2 distance from original: 7.5198902445117515
l2 distance from original: 12.517949078098
l2 distance from original: 15.583265832646825
l2 distance from original: 10.309606071366971
l2 distance from original: 12.67709685435253
l2 distance from original: 14.319209476402303
l2 distance from original: 14.030292651152768
l2 distance from original: 11.192603835883371
l2 distance from original: 13.267377609462349
l2 distance from original: 9.74294408168186
```
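A side note on the measurement itself: `X_test_adv[i] - X_test[i]` is a 28×28 array, and `np.linalg.norm(..., ord=2)` applied to a 2-D array returns the spectral norm (largest singular value), not the element-wise Euclidean distance. Flattening first gives the usual L2 distance, as this small example shows:

```python
import numpy as np

# Toy "perturbation" matrix with singular values 3 and 4.
d = np.zeros((28, 28))
d[0, 0] = 3.0
d[1, 1] = 4.0

spectral = np.linalg.norm(d, ord=2)  # largest singular value of the matrix
flat_l2 = np.linalg.norm(d.ravel())  # element-wise Euclidean distance

print(spectral)  # 4.0
print(flat_l2)   # 5.0
```

Either way the reported distances are far above 0.2, so the underlying observation stands.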
Hi @kotleta2007
Thank you for the example code. I have noticed two items:
1. The argument `epsilon` of `DeepFool` does not define the maximum permitted perturbation and therefore cannot be compared directly to the L2 norm of the observed perturbation. `epsilon` corresponds to the parameter `eta` of Moosavi-Dezfooli et al., which defines how far the adversarial example should be pushed across the decision boundary, instead of having it exactly on or too close to the boundary.
2. Your model is using a `softmax` activation in its last layer. `DeepFool` expects a model to output logits to achieve its best attack performance. Therefore you should change `softmax` to `linear`.
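A minimal sketch of the first point (the perturbation vector below is hypothetical, not produced by ART): DeepFool applies its minimal boundary-crossing perturbation with a multiplicative overshoot of `1 + eta`, so `epsilon=0.2` enlarges the perturbation by 20% rather than capping its norm at 0.2:

```python
import numpy as np

eta = 0.2  # ART's `epsilon` argument, i.e. the overshoot factor

# Hypothetical minimal perturbation that just reaches the decision boundary.
r = np.array([0.5, -0.3, 0.2])

# DeepFool pushes the sample slightly past the boundary instead of
# leaving it exactly on it:
applied = (1 + eta) * r

print(np.linalg.norm(r))        # norm of the minimal perturbation
print(np.linalg.norm(applied))  # 20% larger, not bounded by 0.2
```

So a larger minimal perturbation simply yields a proportionally larger applied perturbation; `epsilon` never acts as a budget.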
I have tried generating some adversarial samples with DeepFool, setting the `eps` parameter to 0.2; however, the attack samples had a distance far greater than 0.2 from the original samples. Is it possible to use another parameter, or to modify the value of `eps`, to ensure that the l2 norm of `adversarial - original` is less than a certain threshold? Thank you in advance.