Closed shashankkotyan closed 4 years ago
Hi @shashankkotyan Thank you very much for using ART and raising this issue! To get a better understanding, could you please provide additional information about your model (structure, loss function, etc.)?
Thank you for your prompt reply @beat-buesser.
I have used vanilla ResNet architecture whose weights I load from a model weights (.h5) file. In both the experiments, Keras and TensorflowV2, I use the same .h5 file. The only difference is using Keras Module in the first experiment and TensorflowV2 Module in the second experiment.
Code Snippet:
In Keras experiment
from keras import initializers, layers, models, optimizers, regularizers
In TensorflowV2 experiment
from tensorflow.keras import initializers, layers, models, optimizers, regularizers
img_rows, img_cols, img_channels = 32,32,3
stack_n = 5
weight_decay = 0.0001
optimizer = optimizers.SGD(lr=.1, momentum=0.9, nesterov=True)
def residual_block(img_input, out_channel, increase=False):
    if increase: stride = (2,2)
    else: stride = (1,1)
    x = img_input
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=stride,padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    if increase:
        projection = layers.Conv2D(out_channel, kernel_size=(1,1), strides=(2,2), padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
        return layers.add([x, projection])
    else: return layers.add([img_input, x])
img_input = layers.Input(shape=(img_rows, img_cols, img_channels))
x = layers.Conv2D(filters=16,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
for _ in range(stack_n): x = residual_block(x, 16, False)
x = residual_block(x, 32, True)
for _ in range(1, stack_n): x = residual_block(x, 32, False)
x = residual_block(x, 64, True)
for _ in range(1, stack_n): x = residual_block(x, 64, False)
x = layers.BatchNormalization()(x)
x = layers.Activation('relu')(x)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(10, name='Output', activation='softmax', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
model = models.Model(img_input, x)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.load_weights(f"model_weights.h5")
@shashankkotyan I have used your code above to train a model with Keras 2.3.1 and, after loading it from the h5 file, have tested it with KerasClassifier and TensorFlowV2Classifier with TensorFlow 2.1. I received identical adversarial examples in both cases using FastGradientMethod. I have also checked that the loss gradients provided by the two classifiers are identical. So far I have not been able to reproduce your observation.
Can you confirm that you are using the same h5-file for both of your experiments (it could have been overwritten)? Could you provide a single example script that returns the difference that you have observed?
I recommend setting clip_values=(0,1) or clip_values=(0,255) for the ART classifiers; this makes sure that the pixels of the adversarial example stay in the valid range.
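To illustrate the clipping recommendation, here is a minimal NumPy sketch of a single FGSM-style step with and without a valid pixel range. The helper `fgsm_step` and the gradient values are hypothetical and only illustrate the idea; this is not ART's implementation.

```python
import numpy as np

def fgsm_step(x, grad, eps, clip_values=None):
    """Hypothetical helper: move each pixel by eps along the sign of the gradient."""
    x_adv = x + eps * np.sign(grad)
    if clip_values is not None:
        low, high = clip_values
        # Keep every pixel inside the valid range, as clip_values is meant to do.
        x_adv = np.clip(x_adv, low, high)
    return x_adv

x = np.array([0.05, 0.50, 0.95])     # pixels in the [0, 1] range
grad = np.array([-1.0, 2.0, 3.0])    # made-up loss gradient, for illustration only

unclipped = fgsm_step(x, grad, eps=0.3)
clipped = fgsm_step(x, grad, eps=0.3, clip_values=(0.0, 1.0))

print(unclipped)  # some values fall outside [0, 1]
print(clipped)    # all values stay inside [0, 1]
```

Without clipping, the perturbed pixels can leave the range the model was trained on, which can change the measured success rate of an attack.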
@beat-buesser I have checked the h5 file and it remains the same across my tests.
It has an accuracy of 92.7% on the CIFAR-10 test dataset, as reported by the sample script below.
I have also checked the clip values setting. Since I give a preprocessed image to the attacker, I use clip_values=(0,1).
A summary of adversarial accuracy on the first 100 samples is
| | With Clip Value | Without Clip Value |
|---|---|---|
| Keras Version | 47/100 | 59/100 |
| Tensorflow V2 Version | 9/100 | 26/100 |
If possible, can you also check the adversarial accuracy across multiple images?
Also, just to mention, I use two environments to reproduce this: one has TensorFlow 2.1.0, the other has TensorFlow 1.13.1 and Keras 2.2.3.
The h5 file I use for the tests. model_weights.zip
Code Example to Reproduce
import os, sys
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
tf.get_logger().setLevel("ERROR")
# ! Change of These Parameters !
keras_opt = True
clip_opt = False
fname = ""
if keras_opt: fname = f"{fname}KerasVersion"
else: fname = f"{fname}TensorflowVersion"
if clip_opt: fname = f"{fname} With ClipValues"
else: fname = f"{fname} Without ClipValues"
if keras_opt:
    from keras import datasets, initializers, layers, models, optimizers, regularizers
else:
    from tensorflow.keras import datasets, initializers, layers, models, optimizers, regularizers
import numpy as np
num_images = {'train': 50000, 'test': 10000}
num_classes = 10
dataset_name = 'Cifar10'
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
mean = [125.307, 122.95, 113.865]
std = [62.9932, 62.0887, 66.7048]
(raw_x_train, raw_y_train), (raw_x_test, raw_y_test) = datasets.cifar10.load_data()
raw_y_train, raw_y_test = raw_y_train[:,0], raw_y_test[:,0]
def color_preprocess(imgs):
    if imgs.ndim < 4: imgs = np.array([imgs])
    imgs = imgs.astype('float32')
    for i in range(3): imgs[:,:,:,i] = (imgs[:,:,:,i] - mean[i]) / std[i]
    return imgs
def color_postprocess(imgs):
    if imgs.ndim < 4: imgs = np.array([imgs])
    imgs = imgs.astype('float32')
    for i in range(3): imgs[:,:,:,i] = (imgs[:,:,:,i] * std[i]) + mean[i]
    return imgs.astype(int)
img_rows, img_cols, img_channels = 32,32,3
stack_n = 5
weight_decay = 0.0001
optimizer = optimizers.SGD(lr=.1, momentum=0.9, nesterov=True)
def residual_block(img_input, out_channel, increase=False):
    if increase: stride = (2,2)
    else: stride = (1,1)
    x = img_input
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=stride,padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    if increase:
        projection = layers.Conv2D(out_channel, kernel_size=(1,1), strides=(2,2), padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
        return layers.add([x, projection])
    else: return layers.add([img_input, x])
img_input = layers.Input(shape=(img_rows, img_cols, img_channels))
x = layers.Conv2D(filters=16,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
for _ in range(stack_n): x = residual_block(x, 16, False)
x = residual_block(x, 32, True)
for _ in range(1, stack_n): x = residual_block(x, 32, False)
x = residual_block(x, 64, True)
for _ in range(1, stack_n): x = residual_block(x, 64, False)
x = layers.BatchNormalization()(x)
x = layers.Activation('relu')(x)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(num_classes, name='Output', activation='softmax', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
model = models.Model(img_input, x)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.load_weights(f"model_weights.h5")
pred = np.argmax(model.predict(color_preprocess(raw_x_test)), axis=1)
# Accuracy of the model remains 0.927 across the tests which confirms that the h5 file is not overwritten.
print(f"{np.sum(pred==raw_y_test)/len(raw_y_test):.3f}")
from art import attacks, classifiers
if keras_opt:
    if clip_opt:
        classifier = classifiers.KerasClassifier(model, use_logits=False, channel_index=3, clip_values=(0,1), defences=None, preprocessing=(0, 1), input_layer=0, output_layer=0)
    else:
        classifier = classifiers.KerasClassifier(model, use_logits=False, channel_index=3, clip_values=None, defences=None, preprocessing=(0, 1), input_layer=0, output_layer=0)
else:
    if clip_opt:
        classifier = classifiers.TensorFlowV2Classifier(model, num_classes, (img_rows, img_cols, img_channels), loss_object=tf.keras.losses.CategoricalCrossentropy(), train_step=None, channel_index=3, clip_values=(0,1), defences=None, preprocessing=(0, 1))
    else:
        classifier = classifiers.TensorFlowV2Classifier(model, num_classes, (img_rows, img_cols, img_channels), loss_object=tf.keras.losses.CategoricalCrossentropy(), train_step=None, channel_index=3, clip_values=None, defences=None, preprocessing=(0, 1))
attacker = attacks.evasion.FastGradientMethod(classifier=classifier, norm=np.inf, targeted=False, eps=0.3)
def attack(x,y):
    adv_x = color_postprocess(attacker.generate(color_preprocess(x)))[0]
    prior_probs = model.predict(color_preprocess(x))[0]
    predicted_probs = model.predict(color_preprocess(adv_x))[0]
    actual_class = y # np.argmax(prior_probs)
    predicted_class = np.argmax(predicted_probs)
    success = predicted_class != actual_class
    return adv_x, success
samples = 100
adv_xs = []
succeses = []
for x, y in zip(raw_x_test[:samples], raw_y_test[:samples]):
    adv_x, success = attack(x, y)
    adv_xs += [adv_x]
    succeses += [success]
grid = np.sqrt(samples).astype(int)
original = raw_x_test[:samples]
original = original.reshape(grid, grid, img_rows, img_cols, img_channels).swapaxes(1, 2).reshape(grid*img_rows, grid*img_cols, img_channels)
adversarial = np.array(adv_xs)
adversarial =adversarial.reshape(grid, grid, img_rows, img_cols, img_channels).swapaxes(1, 2).reshape(grid*img_rows, grid*img_cols, img_channels)
indices = np.where(np.array(succeses) == True)[0]
from matplotlib import pyplot as plt
fig = plt.figure(1, figsize=(20,10), dpi=300)
(ax1, ax2) = fig.subplots(1,2)
ax1.imshow(original.astype(int))
ax2.imshow(adversarial.astype(int))
ax1.set_xticks([]); ax2.set_xticks([])
ax1.set_yticks([]); ax2.set_yticks([])
ax1.set_xlabel("Original Images")
ax2.set_xlabel(f"Adversarial Images {indices}")
fig.tight_layout()
fig.savefig(f"{fname} Adversarial Accuracy {np.sum(succeses)} out of {samples}", bbox_inches="tight", dpi=300)
Keras Version With Clip Values (Adv Acc 47/100)
Keras Version Without Clip Values (Adv Acc 59/100)
Tensorflow Version With Clip Values (Adv Acc 9/100)
Tensorflow Version Without Clip Values (Adv Acc 26/100)
Edited Issue Comment to include more specific details about the testing environments.
@shashankkotyan Thank you very much for the great example script! Sorry for the delay, but I think I have finally identified the reasons for your observations.
The current version of TensorFlowV2Classifier in ART 1.1.0 only supports SparseCategoricalCrossentropy because it calls the loss function with index labels. Unfortunately, it does not warn or inform the user. This was fixed in commit 46eeb2ff0ec4aec6d07c6b799611b36fc84768bb, which is already on branch dev_1.2.0 and will be published in ART 1.2.0 in a few weeks.
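The label-format point can be illustrated with a small NumPy sketch. The two helper functions below are simplified stand-ins written for this example, not the TensorFlow or ART implementations. They show that a sparse loss with index labels and a categorical loss with one-hot labels agree, while feeding index labels to the one-hot formula gives a different number in this sketch (TensorFlow's exact behaviour with mismatched labels may differ).

```python
import numpy as np

def categorical_crossentropy(y_onehot, probs):
    """Mean cross-entropy, expecting one-hot labels of shape (n, classes)."""
    return float(np.mean(-np.sum(y_onehot * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(y_index, probs):
    """Mean cross-entropy, expecting integer class labels of shape (n,)."""
    rows = np.arange(len(y_index))
    return float(np.mean(-np.log(probs[rows, y_index])))

# Made-up softmax outputs for two samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
y_index = np.array([0, 2])        # index labels
y_onehot = np.eye(3)[y_index]     # the same labels, one-hot encoded

loss_sparse = sparse_categorical_crossentropy(y_index, probs)
loss_onehot = categorical_crossentropy(y_onehot, probs)

# Index labels passed to the one-hot formula: NumPy broadcasts them silently.
loss_mismatched = categorical_crossentropy(y_index[:, None], probs)

print(loss_sparse, loss_onehot, loss_mismatched)
```

A mismatch of this kind produces a loss, and therefore gradients, that no longer correspond to the intended objective, which is consistent with the different attack results reported above.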
If I change the two loss function definitions in your script to use SparseCategoricalCrossentropy, I observe identical success rates with ART v1.1.0 for all combinations reported above. Small variations in the success rates, on the order of 1%, can sometimes be observed; these could be caused by numerical differences between the implementations in the external frameworks. The relevant lines are:
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
and
... , loss_object=tf.keras.losses.SparseCategoricalCrossentropy(), ...
in the lines creating the TensorFlowV2Classifier classifiers.
A few things that I have changed in your script:
- Provide the raw images to attack.generate and classifier.predict, most likely in the [0, 1] or [0, 255] range, and set clip_values to (0,1) or (0,255) respectively.
- Instead of color_preprocess, define the argument preprocessing in the classifiers with a tuple of (mean, std). ART uses these values internally to scale the gradients correctly and evaluate the model in certain attacks. mean and std can be sequences or arrays, which will be broadcast onto the input/image data.
- eps scales with the pixel range of the raw images: eps=0.1 in range [0, 1] is the same as eps=25.5 in range [0, 255].

This is your script with the modifications that I have used for my experiments. Please let me know if you can repeat the experiments; I hope it works:
import os, sys
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
tf.get_logger().setLevel("ERROR")
# ! Change of These Parameters !
keras_opt = True
clip_opt = True
clip_values = (0, 255)
fname = ""
if keras_opt: fname = f"{fname}KerasVersion"
else: fname = f"{fname}TensorflowVersion"
if clip_opt: fname = f"{fname} With ClipValues"
else: fname = f"{fname} Without ClipValues"
if keras_opt:
    from keras import datasets, initializers, layers, models, optimizers, regularizers, utils, backend, __version__
else:
    from tensorflow.keras import datasets, initializers, layers, models, optimizers, regularizers, utils, backend, __version__
import numpy as np
num_images = {'train': 50000, 'test': 10000}
num_classes = 10
dataset_name = 'Cifar10'
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
mean = [125.307, 122.95, 113.865]
std = [62.9932, 62.0887, 66.7048]
(raw_x_train, raw_y_train), (raw_x_test, raw_y_test) = datasets.cifar10.load_data()
raw_x_train = raw_x_train.astype('float32')
raw_x_test = raw_x_test.astype('float32')
img_rows, img_cols, img_channels = 32,32,3
stack_n = 5
weight_decay = 0.0001
optimizer = optimizers.SGD(lr=.1, momentum=0.9, nesterov=True)
def residual_block(img_input, out_channel, increase=False):
    if increase: stride = (2,2)
    else: stride = (1,1)
    x = img_input
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=stride,padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(out_channel,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
    if increase:
        projection = layers.Conv2D(out_channel, kernel_size=(1,1), strides=(2,2), padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
        return layers.add([x, projection])
    else: return layers.add([img_input, x])
img_input = layers.Input(shape=(img_rows, img_cols, img_channels))
x = layers.Conv2D(filters=16,kernel_size=(3,3),strides=(1,1),padding='same', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(img_input)
for _ in range(stack_n): x = residual_block(x, 16, False)
x = residual_block(x, 32, True)
for _ in range(1, stack_n): x = residual_block(x, 32, False)
x = residual_block(x, 64, True)
for _ in range(1, stack_n): x = residual_block(x, 64, False)
x = layers.BatchNormalization()(x)
x = layers.Activation('relu')(x)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(10, name='Output', activation='softmax', kernel_initializer=initializers.he_normal(), kernel_regularizer=regularizers.l2(weight_decay))(x)
model = models.Model(img_input, x)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.load_weights(f"model_weights.h5")
from art import attacks, classifiers
if keras_opt:
    if clip_opt:
        classifier = classifiers.KerasClassifier(model, use_logits=False, channel_index=3, clip_values=clip_values, defences=None, preprocessing=(mean, std), input_layer=0, output_layer=0)
    else:
        classifier = classifiers.KerasClassifier(model, use_logits=False, channel_index=3, clip_values=None, defences=None, preprocessing=(mean, std), input_layer=0, output_layer=0)
else:
    if clip_opt:
        classifier = classifiers.TensorFlowV2Classifier(model, num_classes, (img_rows, img_cols, img_channels), loss_object=tf.keras.losses.SparseCategoricalCrossentropy(), train_step=None, channel_index=3, clip_values=clip_values, defences=None, preprocessing=(mean, std))
    else:
        classifier = classifiers.TensorFlowV2Classifier(model, num_classes, (img_rows, img_cols, img_channels), loss_object=tf.keras.losses.SparseCategoricalCrossentropy(), train_step=None, channel_index=3, clip_values=None, defences=None, preprocessing=(mean, std))
pred = np.argmax(classifier.predict(raw_x_test), axis=1)
# Accuracy of the model remains 0.927 across the tests which confirms that the h5 file is not overwritten.
print(f"{np.sum(pred==raw_y_test[:,0])/len(raw_y_test):.3f}")
attacker = attacks.evasion.FastGradientMethod(classifier=classifier, norm=np.inf, targeted=False, eps=2)
def attack(x,y):
    x = np.expand_dims(x, axis=0)
    adv_x = attacker.generate(x)
    prior_probs = classifier.predict(x)[0]
    predicted_probs = classifier.predict(adv_x.astype(np.float32))[0]
    actual_class = y # np.argmax(prior_probs)
    predicted_class = np.argmax(predicted_probs)
    success = predicted_class != actual_class
    return adv_x, success
samples = 100
adv_xs = []
succeses = []
for x, y in zip(raw_x_test[:samples], raw_y_test[:samples]):
    adv_x, success = attack(x, y)
    adv_xs += [adv_x]
    succeses += [success]
print(np.sum(succeses))
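The notes on preprocessing and eps in the comment above can be checked with a short NumPy sketch. It assumes the per-channel mean and std from the script and uses random data in place of CIFAR-10 images. The helpers `standardise` and `rescale_eps` are hypothetical names introduced for this example and are not part of ART.

```python
import numpy as np

# Per-channel statistics taken from the script above.
mean = np.array([125.307, 122.95, 113.865], dtype=np.float32)
std = np.array([62.9932, 62.0887, 66.7048], dtype=np.float32)

def standardise(x, mean, std):
    """Hypothetical helper: (x - mean) / std, broadcast over the channel axis."""
    return (x - mean) / std

def rescale_eps(eps, old_range, new_range):
    """Hypothetical helper: convert a perturbation budget between pixel ranges."""
    old_width = old_range[1] - old_range[0]
    new_width = new_range[1] - new_range[0]
    return eps * new_width / old_width

# Random stand-in for a batch of raw images in the [0, 255] range.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(2, 32, 32, 3)).astype(np.float32)

# Broadcasting over the last axis matches an explicit per-channel loop.
z_broadcast = standardise(raw, mean, std)
z_loop = raw.copy()
for i in range(3):
    z_loop[:, :, :, i] = (z_loop[:, :, :, i] - mean[i]) / std[i]

eps_255 = rescale_eps(0.1, (0.0, 1.0), (0.0, 255.0))
print(np.allclose(z_broadcast, z_loop), eps_255)
```

Keeping the images in their raw range and letting the classifier apply (mean, std) means that eps and clip_values are both expressed in the same pixel units.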
@beat-buesser Thank you for your thorough explanation. I have checked your script and it is producing expected results. Thank you for your suggestions to make the script more crisp and concise.
A summary of adversarial accuracy on the first 100 samples on the modified script is
| | With Clip Value | Without Clip Value |
|---|---|---|
| Keras Version | 51/100 | 51/100 |
| Tensorflow V2 Version | 53/100 | 53/100 |
I agree there could be small variations across platforms, but since the differences were huge in the earlier script, I opened this issue.
I would recommend mentioning in the current documentation that only SparseCategoricalCrossentropy is implemented for the TensorflowV2 module until ART 1.2.0 is released, as it is not mentioned (or maybe I have missed it).
@shashankkotyan Thank you very much for confirming the results and your suggestions!
@shashankkotyan Thank you for your help! This should now be fixed with the release of ART 1.1.1.
Bug Description I notice a change in adversarial accuracy and adversarial image when I shift from KerasClassifier to TensorFlowV2Classifier.
Results for the Fast Gradient Sign Method: I attacked the first 100 images in the CIFAR-10 test dataset. The values in [*] show the number of images successfully attacked for each class.
KC: [ 6,6,4,8,4,8,12,10,9,10 ] = 77 images successfully attacked out of 100 images.
TF: [ 1,2,2,2,3,4,0,4,4,4 ] = 26 images successfully attacked out of 100 images.
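A per-class tally like the one above can be computed with a few lines of NumPy. This is a minimal sketch with made-up labels and success flags; the helper `successes_per_class` is a hypothetical name and is not taken from the original script.

```python
import numpy as np

def successes_per_class(true_labels, success_flags, num_classes=10):
    """Hypothetical helper: count successfully attacked images for each true class."""
    true_labels = np.asarray(true_labels)
    success_flags = np.asarray(success_flags, dtype=bool)
    # Keep only the labels of attacked images, then histogram them by class.
    return np.bincount(true_labels[success_flags], minlength=num_classes)

# Made-up example: seven images with their true classes and attack outcomes.
labels = np.array([3, 8, 8, 0, 6, 6, 1])
flags = np.array([True, False, True, True, True, True, False])

counts = successes_per_class(labels, flags)
print(counts, counts.sum())
```

The sum of the per-class counts is the total number of successful attacks, which is the figure compared between the two classifiers.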
To Reproduce Keras Classifier is loaded by the following line,
classifier = art.classifiers.KerasClassifier(model)
Tensorflow Classifier is loaded by the following line,
classifier = art.classifiers.TensorFlowV2Classifier(model, 10, (32, 32, 3), loss_object= tf.keras.losses.CategoricalCrossentropy())
Attack Function is initialised by the following line,
attack = art.attacks.FastGradientMethod(classifier=classifier)
Expected behaviour I expected the adversarial accuracy and adversarial images generated by the attacks to be the same, but they differ across platforms.
Screenshots Keras Adversarial Image
Tensorflow Adversarial Image
System information For Keras Test on Python 3.6.10
System information For TensorflowV2 Test on Python 3.7.4
All the tests were performed on Ubuntu 18.04.3 LTS (bionic).