Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Boundary attack overflows on Xception network and stops working #436

Closed maliwin closed 4 years ago

maliwin commented 4 years ago

Hi, I'm attempting to execute a boundary attack on the Xception network, but even after simplifying my code to be almost identical to the attack_decision_based_boundary notebook, I still cannot get it to work properly.

Here's the code:

import numpy as np
import tensorflow as tf

from PIL import Image
from matplotlib import pyplot as plt
from art.classifiers import TensorFlowV2Classifier
from art.attacks.evasion import BoundaryAttack
from tensorflow.keras.applications.xception import Xception, decode_predictions

model = Xception(weights='imagenet')
art_model = TensorFlowV2Classifier(model=model, nb_classes=1000, input_shape=(299, 299, 3), clip_values=(0, 255),
                                   preprocessing=(127.5, 127.5))
attack = BoundaryAttack(classifier=art_model, targeted=False, max_iter=0, delta=0.001, epsilon=0.001)

target_image = Image.open('dragonfly.jpg')
target_image = target_image.resize((299, 299), resample=Image.LANCZOS)
target_image = np.array(target_image, dtype=np.float64)

print('class id: ' + str(np.argmax(art_model.predict(np.array([target_image])))))
print(decode_predictions(art_model.predict(np.array([target_image]))))  # correctly classified as dragonfly

iter_step = 200
x_adv = None
x_advs = []
predictions = []

for i in range(20):
    x_adv = attack.generate(x=np.array([target_image]), x_adv_init=x_adv)
    # Use the ART wrapper so the (127.5, 127.5) preprocessing is applied before prediction
    prediction = decode_predictions(art_model.predict(x_adv))
    x_advs.append(x_adv)
    predictions.append(prediction)

    print("Adversarial image at step %d." % (i * iter_step), "L2 error",
          np.linalg.norm(np.reshape(x_adv[0] - target_image, [-1])),
          "and class label %d." % np.argmax(art_model.predict(x_adv)[0]))

    plt.imshow(x_adv[0].astype(np.uint8))
    plt.show(block=False)

    if hasattr(attack, 'curr_delta') and hasattr(attack, 'curr_epsilon'):
        attack.max_iter = iter_step
        attack.delta = attack.curr_delta
        attack.epsilon = attack.curr_epsilon
    else:
        break
Link to dragonfly.jpg https://user-images.githubusercontent.com/15788686/83576103-4eaa8500-a531-11ea-9819-2683429bc961.jpg

Here is the output after several iterations:

class id: 319
[[('n02268443', 'dragonfly', 0.983584), ('n02268853', 'damselfly', 0.010473117), ('n02219486', 'ant', 0.00042001816), ('n02264363', 'lacewing', 0.000288288), ('n02231487', 'walking_stick', 0.00014885762)]]
Adversarial image at step 0. L2 error 52361.16511237516 and class label 111.
Adversarial image at step 200. L2 error 42468.074777978414 and class label 111.
...\Python38\lib\site-packages\art\attacks\evasion\boundary.py:317: RuntimeWarning: overflow encountered in multiply
  perturb *= delta * np.linalg.norm(original_sample - current_sample)
...\Python38\lib\site-packages\art\attacks\evasion\boundary.py:327: RuntimeWarning: invalid value encountered in subtract
  perturb[i] -= np.dot(np.dot(perturb[i], direction[i].T), direction[i])
...\Python38\lib\site-packages\art\attacks\evasion\boundary.py:327: RuntimeWarning: overflow encountered in subtract
  perturb[i] -= np.dot(np.dot(perturb[i], direction[i].T), direction[i])
Adversarial image at step 400. L2 error nan and class label 111.
Adversarial image at step 600. L2 error nan and class label 111.
Adversarial image at step 800. L2 error nan and class label 111.
Adversarial image at step 1000. L2 error nan and class label 111.
Adversarial image at step 1200. L2 error nan and class label 111.

Effectively, it NaNs out and doesn't recover. The image at iteration 200 is similar to the one in the notebook (half noise, half target image), but everything after that is stuck at something like 80% noise and 20% target image.
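For readers wondering why the run never recovers, here is a minimal NumPy sketch of how one overflow turns into permanent NaNs. It is not ART's code: the array values, the float32 dtype, and the variable names are assumptions chosen for illustration. It only mirrors the two steps named in the warnings above, a scaling multiply followed by subtracting a projection.

```python
import numpy as np

# Illustrative values only; not taken from ART or from the run above.
perturb = np.array([1e30, 1e30], dtype=np.float32)
direction = np.array([0.6, 0.8], dtype=np.float32)  # unit-length direction

with np.errstate(over="ignore", invalid="ignore"):
    # Step 1: the scaling multiply overflows float32 (max about 3.4e38) and yields inf.
    scaled = perturb * np.float32(1e10)
    # Step 2: subtracting the component along `direction` computes inf - inf, which is nan.
    projected = scaled - np.dot(scaled, direction) * direction
    # Step 3: a norm over an array that contains nan is itself nan.
    l2 = np.linalg.norm(projected)

print(scaled, projected, l2)
```

Once a candidate contains NaN, every later arithmetic step on it also produces NaN, which matches the "L2 error nan" lines that repeat from step 400 onward.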

The preprocessing is set to (127.5, 127.5) since from my understanding, Xception expects inputs to be in range [-1, 1]. Changing the delta and epsilon parameters in the attack didn't seem to change much.
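As a quick sanity check of that preprocessing choice, the sketch below applies the same subtract-then-divide arithmetic by hand. The helper name `normalize` is hypothetical; it only mirrors how ART documents the `preprocessing` tuple, as a (subtrahend, divisor) pair.

```python
import numpy as np


def normalize(x, subtrahend=127.5, divisor=127.5):
    """Hypothetical helper: subtract first, then divide, like preprocessing=(127.5, 127.5)."""
    return (x - subtrahend) / divisor


# The smallest, middle, and largest pixel values allowed by clip_values=(0, 255).
pixels = np.array([0.0, 127.5, 255.0])
scaled = normalize(pixels)
print(scaled)
```

The endpoints 0 and 255 map to -1 and 1, so the combination of clip_values=(0, 255) and preprocessing=(127.5, 127.5) does feed Xception inputs in [-1, 1].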

Here is what it looks like after 200 iterations: [image]

And here is what it looks like after it reaches NaN (and it stays this way): [image]

Any help or advice would be appreciated. Thank you. :)

beat-buesser commented 4 years ago

Hi @maliwin, thank you very much for using ART and for providing such a detailed description of the issue! We are very interested in finding the cause and will investigate it as soon as possible.

@minhitbk Have you seen this behaviour of BoundaryAttack before?

minhitbk commented 4 years ago

@maliwin: An updated version of Boundary attack is available in branch dev_1.3.0, which will be merged to master soon. This version fixed your issue. I ran your code and got the result:

class id: 319
[[('n02268443', 'dragonfly', 0.9835841), ('n02268853', 'damselfly', 0.0104731135), ('n02219486', 'ant', 0.00042001702), ('n02264363', 'lacewing', 0.00028828802), ('n02231487', 'walking_stick', 0.00014885736)]]
Adversarial image at step 0. L2 error 52361.318116537186 and class label 111.
Adversarial image at step 200. L2 error 12077.601012107383 and class label 111.
Adversarial image at step 400. L2 error 9279.618072131077 and class label 111.
Adversarial image at step 600. L2 error 7235.303173900557 and class label 111.
Adversarial image at step 800. L2 error 5683.6825038876505 and class label 111.

maliwin commented 4 years ago

Thank you both for the quick response. I guess this issue can be closed then. :)