bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

Implementing attacks on ImageNet samples Keras ResNet #260

Closed hechtlinger closed 5 years ago

hechtlinger commented 5 years ago

I'm having problems implementing attacks on ResNet50. The documentation example works, but when I try to run it on the ImageNet samples from foolbox.utils.samples it stops working. Unlike the example, the samples don't need the RGB -> BGR transformation, which might be what causes the problem with the preprocessing step.
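
For context, keras.applications.resnet50.preprocess_input follows the 'caffe' convention: it flips RGB -> BGR and subtracts the per-channel (BGR) means, with no scaling. A minimal sketch of that behaviour, assuming the standard Keras mean values, looks roughly like this (illustration only, not the Keras source):

import numpy as np

def caffe_style_preprocess(x):
    # RGB -> BGR, then subtract the BGR channel means; no division by a std
    x = x[..., ::-1]
    return x - np.array([103.939, 116.779, 123.68])

The preprocessing=(np.array([104, 116, 123]), 1) tuple from the documentation example only covers the mean subtraction, so it implicitly assumes the images are already in BGR order.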

Here is a short code snippet showing the problem. It happens with all the pictures, also when they are loaded independently from the ImageNet validation set, so it is most likely related to the way the pictures are fed into the model.

import foolbox
import keras
import numpy as np
from keras.applications.resnet50 import ResNet50
from keras.applications.resnet50 import preprocess_input
from keras.applications.imagenet_utils import decode_predictions

# instantiate model
keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')

# Load Images
images, labels = foolbox.utils.samples(batchsize=20)

# Confirm channel order: predict on RGB inputs vs. pre-reversed (BGR) inputs
images_preprocess = preprocess_input(images.copy())
model_predict = kmodel.predict(images_preprocess)
print('Accuracy without channel reversal: %.2f' % np.mean(model_predict.argmax(1) == labels))

images_preprocess = preprocess_input(images[...,::-1].copy())
model_predict = kmodel.predict(images_preprocess)
print('Accuracy with channel reversal: %.2f' % np.mean(model_predict.argmax(1) == labels))

# FGSM Attack
preprocessing = (np.array([104, 116, 123]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)

ix = 1
image = images[ix]
label = labels[ix]

FGSM_attack = foolbox.attacks.FGSM(fmodel)
FGSM_adversarial = FGSM_attack(image, label)

# Check Results
print('Regular Image Prediction:')
print(decode_predictions(kmodel.predict(preprocess_input(image[np.newaxis].copy())))[0])

print('Adversarial Image Prediction:')
print(decode_predictions(kmodel.predict(preprocess_input(FGSM_adversarial[np.newaxis].copy())))[0])
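
For what it's worth, a quick way to see the mismatch before attacking is to check the wrapped model's own accuracy on the raw RGB samples (a minimal sketch, assuming the foolbox 1.x API where KerasModel exposes batch_predictions):

# The wrapper only subtracts the means and does not flip RGB -> BGR,
# so accuracy on the raw RGB samples is expected to be poor here.
logits = fmodel.batch_predictions(images)
print('Wrapped model accuracy: %.2f' % np.mean(logits.argmax(1) == labels))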
hechtlinger commented 5 years ago

It seems that one way to solve the issue is to define the preprocessing transformation as the identity and feed the attack the input after preprocessing, adjusting the bounds accordingly. I'd close the issue, but it would be great if there were a more elegant solution (one possible alternative is sketched after the code below).

# Apply the Keras preprocessing up front and give foolbox an identity preprocessing
images_preprocess = preprocess_input(images.copy())
preprocessing = (np.array([0, 0, 0]), 1)

# The bounds now have to match the range of the already-preprocessed inputs
fmodel = foolbox.models.KerasModel(kmodel, bounds=(images_preprocess.min(), images_preprocess.max()), preprocessing=preprocessing)

ix = 0
image = images_preprocess[ix]
label = labels[ix]

FGSM_attack = foolbox.attacks.FGSM(fmodel)
FGSM_adversarial = FGSM_attack(image, label)
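
For reference, a possibly more self-contained alternative is to do the RGB -> BGR flip once on the input and let the foolbox wrapper handle only the mean subtraction, so the attack still runs in the original [0, 255] image space. This is a sketch under the assumptions that foolbox.utils.samples returns RGB images in [0, 255] and that the foolbox 1.x tuple-style preprocessing is available; the mean values are the ones Keras uses in 'caffe' mode:

import foolbox
import keras
import numpy as np
from keras.applications.resnet50 import ResNet50

keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')

images, labels = foolbox.utils.samples(batchsize=20)

# The wrapper subtracts the BGR channel means; the RGB -> BGR flip is done by hand below
preprocessing = (np.array([103.939, 116.779, 123.68]), 1)
fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)

ix = 1
image_bgr = images[ix, :, :, ::-1].copy()  # RGB -> BGR so the channel order matches the means
label = labels[ix]

attack = foolbox.attacks.FGSM(fmodel)
adversarial_bgr = attack(image_bgr, label)  # returns None if no adversarial example is found

# Flip back to RGB before handing the result to code that expects RGB inputs
adversarial_rgb = adversarial_bgr[:, :, ::-1] if adversarial_bgr is not None else None

This mirrors what the documentation example does, just with an explicit channel flip and the exact Keras means, and it keeps the adversarial example in plain pixel space instead of the preprocessed range.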