Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Thermometer encoding usage #450

Closed maliwin closed 4 years ago

maliwin commented 4 years ago

Hi, I'm trying to use the thermometer encoding preprocessing defence, but I'm unsure how it is meant to be used, and I couldn't find an example or notebook that covers this particular defence.

Here's a minimal example:

import numpy as np
from PIL import Image
from tensorflow.keras.applications.xception import Xception
from art.classifiers import TensorFlowV2Classifier
from art.defences.preprocessor import ThermometerEncoding

target_image = Image.open('dragonfly.jpg')
target_image = target_image.resize((299, 299), resample=Image.LANCZOS)
target_image = np.array(target_image, dtype=np.float64)

model = Xception(weights='imagenet')
target_image = target_image / 255
defence = ThermometerEncoding(clip_values=(0, 1))
art_model = TensorFlowV2Classifier(model=model, nb_classes=1000, input_shape=(299, 299, 3), clip_values=(0, 1),
                                   preprocessing=(0.5, 0.5), preprocessing_defences=defence)
art_model.predict(np.array([target_image]))

Any image will do in place of dragonfly.jpg, so I'm not attaching it here. I would expect to still be able to use the wrapped model normally, e.g. by calling art_model.predict(np.array([target_image])), but instead I get the following warning and traceback:

WARNING:tensorflow:Model was constructed with shape (None, 299, 299, 3) for input Tensor("input_1:0", shape=(None, 299, 299, 3), dtype=float32), but it was called on an input with incompatible shape (1, 299, 299, 30).
Traceback (most recent call last):
  File "...\Python\Python38\lib\contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\base_layer_utils.py", line 443, in enter
    yield
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\network.py", line 717, in call
    return self._run_internal_graph(
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\network.py", line 888, in _run_internal_graph
    output_tensors = layer(computed_tensors, **kwargs)
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 968, in __call__
    outputs = self.call(cast_inputs, *args, **kwargs)
  File "...\Python\Python38\lib\site-packages\tensorflow\python\keras\layers\convolutional.py", line 195, in call
    self._convolution_op = nn_ops.Convolution(
  File "...\Python\Python38\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 1060, in __init__
    raise ValueError(
ValueError: number of input channels does not match corresponding dimension of filter, 30 != 3

Please note that I am using the dev version currently, which I switched to due to #436, but I have also tested it with 1.2.0 and get the same output.

maliwin commented 4 years ago

Oh, after taking a closer look, I assume the model's input shape has to have one dimension of size (color channels * num_space), i.e. the input shape here would have to be (299, 299, 3 * num_space), where num_space is the parameter of the thermometer encoder. Is this correct?

beat-buesser commented 4 years ago

@maliwin Thank you for raising this question! Yes, ThermometerEncoding is a defence that has to be applied during fit and predict, which means you have to first train your model on encoded data because the encoding changes the image shape (https://openreview.net/forum?id=S18Su--CW, section 3).
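As a rough illustration of why the shape changes, here is a minimal NumPy sketch of a thermometer-style encoding. It is not ART's implementation: the function name `thermometer_encode`, the threshold convention (`x > k / num_space`), and the channels-last layout are assumptions made for this example. It only shows that each channel expands into `num_space` binary levels, so 3 channels become 3 * num_space (30 for `num_space=10`, matching the traceback above).

```python
import numpy as np


def thermometer_encode(x, num_space=10, clip_values=(0.0, 1.0)):
    """Toy thermometer encoding for channels-last inputs (illustrative, not ART's code)."""
    lo, hi = clip_values
    # Rescale to [0, 1] so that the thresholds k / num_space apply.
    x = (np.clip(x, lo, hi) - lo) / (hi - lo)
    # Thresholds 0/num_space, 1/num_space, ..., (num_space - 1)/num_space.
    thresholds = np.arange(num_space) / num_space
    # Compare every value with every threshold: shape (..., channels, num_space).
    levels = (x[..., np.newaxis] > thresholds).astype(np.float32)
    # Fold the levels into the channel axis: shape (..., channels * num_space).
    return levels.reshape(x.shape[:-1] + (x.shape[-1] * num_space,))


batch = np.random.rand(1, 299, 299, 3)
encoded = thermometer_encode(batch, num_space=10)
print(batch.shape, "->", encoded.shape)  # (1, 299, 299, 3) -> (1, 299, 299, 30)
```

A model wrapped together with such a defence therefore needs a first layer that accepts channels * num_space input channels, and it has to be trained on the encoded data.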

maliwin commented 4 years ago

Thank you again @beat-buesser for such a quick response. I guess I should've paid more attention to what the defence actually did before opening up an issue, sorry! 😅

Consider the question answered.

beat-buesser commented 4 years ago

@maliwin No worries! Please open a new issue whenever you think there is something strange with ART; we are very interested to learn about it. In this case, for example, we can see that we should add a clearer error message explaining that the provided model is not compatible with the defence. I've made a note to fix this.

maliwin commented 4 years ago

@beat-buesser Since this is technically still on the topic of thermometer encoding usage, I'll ask here instead of elsewhere. I continued trying to implement a working thermometer encoding example. Next, I wanted to test how effective an attack is against it, in this case FGSM, since from my understanding FGSM (and other white-box attacks that rely on gradients) should perform badly against this defence.

Here is a minimal example I have set up for the purpose of my question:

import tensorflow as tf

from art.attacks.evasion import FastGradientMethod
from art.utils import to_categorical
from art.classifiers import TensorFlowV2Classifier
from art.defences.preprocessor import ThermometerEncoding

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255, x_test / 255  # (0, 1) range

num_space = 5
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), input_shape=(32, 32, 3 * num_space), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.50),
    tf.keras.layers.Dense(10)
])
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

loss_object = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# @tf.function
def train_step(model, images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

defence = ThermometerEncoding(clip_values=(0, 1), num_space=num_space)
art_model = TensorFlowV2Classifier(probability_model, nb_classes=10, input_shape=(32, 32, 3), clip_values=(0, 1),
                                   preprocessing_defences=defence, train_step=train_step,
                                   loss_object=loss_object)

art_model.fit(x_train[:300], to_categorical(y_train[:300], 10), nb_epochs=1)

attack = FastGradientMethod(art_model)
attack.generate(x_test[:1])

I've called fit with a reduced training set and for only one epoch, just for testing purposes, as it takes quite a long time to train the network on my machine (roughly 1 hour per epoch for the whole 50k training set).

This is the traceback I get:

  File ".../thermometer_minimal_example.py", line 50, in <module>
    attack.generate(x_test[:1])
  File "...\Python\Python38\lib\site-packages\art\attacks\attack.py", line 70, in replacement_function
    return fdict[func_name](self, *args, **kwargs)
  File "...\Python\Python38\lib\site-packages\art\attacks\evasion\fast_gradient.py", line 211, in generate
    adv_x = self._compute(x, x, y, mask, self.eps, self.eps, self._project, self.num_random_init > 0)
  File "...\Python\Python38\lib\site-packages\art\attacks\evasion\fast_gradient.py", line 361, in _compute
    perturbation = self._compute_perturbation(batch, batch_labels, mask_batch)
  File "...\Python\Python38\lib\site-packages\art\attacks\evasion\fast_gradient.py", line 316, in _compute_perturbation
    assert batch.shape == grad.shape
AssertionError

The batch shape is (1, 32, 32, 3), whereas the grad shape is (1, 32, 32, 15). I'm sure I'm missing something trivial/obvious again, but can't figure it out easily. 😅

beat-buesser commented 4 years ago

Hi @maliwin, I think your script is correct and I am able to reproduce the error. At first glance, I think that the implementations of ThermometerEncoding and FastGradientMethod are currently not compatible, because ThermometerEncoding.estimate_gradient returns gradients with the shape of the discretized/encoded input, whereas FastGradientMethod expects gradients for the original/unmodified input. I think we need to take a closer look to see whether we need to update ThermometerEncoding.estimate_gradient.
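To make the mismatch concrete, the following NumPy sketch shows one plausible way to map a gradient with the encoded shape back to the shape of the original input: group the `num_space` levels that belong to each channel and sum over them. This is an assumption made for illustration, not necessarily what ART does; the helper name `fold_encoded_gradient` is hypothetical, and the sketch assumes a channels-last layout in which the levels of each channel are stored contiguously along the last axis.

```python
import numpy as np


def fold_encoded_gradient(grad_encoded, num_space):
    """Sum a (..., channels * num_space) gradient down to (..., channels).

    Hypothetical helper: assumes the num_space levels of each channel
    sit next to each other along the last axis.
    """
    channels = grad_encoded.shape[-1] // num_space
    # Split the last axis into (channels, num_space) groups.
    grouped = grad_encoded.reshape(grad_encoded.shape[:-1] + (channels, num_space))
    # Collapse the levels so that one gradient value remains per input channel.
    return grouped.sum(axis=-1)


num_space = 5
batch = np.random.rand(1, 32, 32, 3)
grad_encoded = np.random.randn(1, 32, 32, 3 * num_space)

print(batch.shape == grad_encoded.shape)  # False: this is what trips the assertion
grad_input = fold_encoded_gradient(grad_encoded, num_space)
print(batch.shape == grad_input.shape)    # True
```

Because the encoding itself is piecewise constant, any such gradient is only an estimate, so the exact weighting of the levels is a design choice rather than a uniquely correct answer.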

maliwin commented 4 years ago

@beat-buesser I guess that equally affects PGD then, since it internally uses the FGM perturbation calculation. Unfortunately, that means the results from https://openreview.net/pdf?id=S18Su--CW aren't easy to reproduce here, since they use PGD and FGSM for their white-box attacks.
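For readers unfamiliar with how the two attacks relate, here is a minimal NumPy sketch of a fast-gradient-sign step and of PGD as a loop of such steps followed by a projection. It is a simplified illustration, not ART's code: `fgsm_step`, `pgd`, and the toy `grad_fn` are hypothetical names, and only the L-infinity case is shown. The shape check mirrors the assertion that fails in the traceback above, which is why a gradient of the wrong shape breaks both attacks.

```python
import numpy as np


def fgsm_step(x, grad, eps, clip_values=(0.0, 1.0)):
    """One fast-gradient-sign step: shift every value by eps along sign(grad)."""
    # The gradient must refer to the original input, value for value.
    assert x.shape == grad.shape
    return np.clip(x + eps * np.sign(grad), *clip_values)


def pgd(x, grad_fn, eps, eps_step, max_iter):
    """Iterate small gradient-sign steps, staying within eps of x (L-infinity)."""
    x_adv = x.copy()
    for _ in range(max_iter):
        x_adv = fgsm_step(x_adv, grad_fn(x_adv), eps_step)
        # Project back into the L-infinity ball of radius eps around x.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


# Toy stand-in for a loss gradient; a real attack would query the classifier.
def grad_fn(z):
    return np.ones_like(z)


x = np.full((1, 4), 0.5)
x_adv = pgd(x, grad_fn, eps=0.1, eps_step=0.05, max_iter=5)
print(np.max(np.abs(x_adv - x)) <= 0.1 + 1e-12)  # True: perturbation stays within eps
```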

mathsinn commented 4 years ago

Hi @maliwin, we have identified and fixed two bugs in the thermometer encoding; please check out PR #467. With the fix, the example that you provided seems to work. Could you please try it as well?

maliwin commented 4 years ago

Hi @mathsinn, thank you for providing a fix this quickly. I currently have 3 models trained without the fix (for different epochs). I took a look at the fix and it seems like those models will not be compatible with the new version anymore.

I will train 3 new models and compare them to the 3 old ones, and I will see if it fixes the original problem of using thermometer encoding with FGSM/PGD attacks.

maliwin commented 4 years ago

@mathsinn It does seem to work now with both FGSM and PGD.