Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

PixelDefend uses function that is not implemented #1183

Closed Ahmed-Hamoda closed 3 years ago

Ahmed-Hamoda commented 3 years ago

In pixel_defend.py, the __call__ method calls get_activations, which is defined in estimator.py. That method raises NotImplementedError, which makes PixelDefend unusable. Is there another method that returns the activations of a specified layer, or does the user need to implement it? Thanks.

beat-buesser commented 3 years ago

Hi @Ahmed-Hamoda Thank you very much for exploring ART! Could you please tell us more about how you are using these tools? Maybe in a Google Colab notebook?

Ahmed-Hamoda commented 3 years ago

Hi @beat-buesser, I'm using PixelDefend to generate purified samples from adversarial samples. Here is the script I'm using (I took the example file from the ART library and changed it slightly):

from __future__ import absolute_import, division, print_function, unicode_literals

from datetime import datetime
import os
from matplotlib import pyplot as plt

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D
import numpy as np

from art.attacks.evasion import FastGradientMethod
from art.defences.preprocessor import PixelDefend
from art.estimators.classification import TensorFlowV2Classifier
from art.utils import load_mnist

loss_object = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

def main():
    # Read MNIST dataset (x_raw contains the original images): (step 1)
    (x_raw, y_raw), (x_raw_test, y_raw_test), min_pixel_value, max_pixel_value = load_mnist()

    # Select 5000 images for training
    n_train = np.shape(x_raw)[0]
    num_selection = 5000
    random_selection_indices = np.random.choice(n_train, num_selection)
    x_raw = x_raw[random_selection_indices]
    y_raw = y_raw[random_selection_indices]

    # Select 500 images for testing
    n_test = np.shape(x_raw_test)[0]
    num_selection = 500
    random_selection_indices = np.random.choice(n_test, num_selection)
    x_raw_test = x_raw_test[random_selection_indices]
    y_raw_test = y_raw_test[random_selection_indices]

    (x_train, y_train) = (x_raw, y_raw)
    (x_test, y_test) = (x_raw_test, y_raw_test)

    # Create Model (step 2)
    class TensorFlowModel(Model):
        """
        Standard TensorFlow model for unit testing.
        """

        def __init__(self):
            super(TensorFlowModel, self).__init__()
            self.conv1 = Conv2D(filters=4, kernel_size=5, activation="relu")
            self.conv2 = Conv2D(filters=10, kernel_size=5, activation="relu")
            self.maxpool = MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding="valid", data_format=None)
            self.flatten = Flatten()
            self.dense1 = Dense(100, activation="relu")
            self.logits = Dense(10, activation="linear")

        def call(self, x):
            """
            Call function to evaluate the model.
            :param x: Input to the model
            :return: Prediction of the model
            """
            x = self.conv1(x)
            x = self.maxpool(x)
            x = self.conv2(x)
            x = self.maxpool(x)
            x = self.flatten(x)
            x = self.dense1(x)
            x = self.logits(x)
            return x

    model = TensorFlowModel()

    # Create ART Classifier (step 3)

    classifier = TensorFlowV2Classifier(
        model=model,
        loss_object=loss_object,
        train_step=train_step,
        nb_classes=10,
        input_shape=np.shape(x_train[0]),
        clip_values=(0, 1),
    )

    # Train ART classifier (step 4)
    classifier.fit(x_train, y_train, nb_epochs=10, batch_size=128)

    # 1. Generating the attacks
    # 2. Evaluating the classifier on the attacks and test set using the defence
    # 3. Predicting the accuracy and storing data

    gen_defense = PixelDefend(eps=16,
                              clip_values=(0.0, 1.0),
                              verbose=True,
                              pixel_cnn=classifier)

    FGSM = FastGradientMethod(classifier)
    FGSM_x = FGSM.generate(x=x_raw_test)
    (FGSM_x, _) = gen_defense(x=FGSM_x)

    preds = np.argmax(classifier.predict(FGSM_x), axis=1)
    acc = np.sum(preds == np.argmax(y_test, axis=1)) / len(y_test)
    print("\nFGSM accuracy: %.2f%%" % (acc * 100))

def train_step(model, images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

if __name__ == "__main__":
    main()

The issue is that when I run this script, everything is fine up until the (FGSM_x, _) = gen_defense(x=FGSM_x) line, where I get this error:

File "...\art\defences\preprocessor\pixel_defend.py", line 102, in __call__
    raise ValueError("Activations are None.")
ValueError: Activations are None.

I also tried to leave the pixel_cnn argument blank when I initialized the PixelDefend instance, but then I get this error:

File "...\art\defences\preprocessor\pixel_defend.py", line 154, in _check_params
    raise TypeError("PixelCNN model must be of type Classifier.")
TypeError: PixelCNN model must be of type Classifier.

If you look at the pixel_defend.py file that is mentioned in the error messages, you'll find this block of code:

        if self.pixel_cnn is not None:
            activations = self.pixel_cnn.get_activations(x, layer=-1, batch_size=self.batch_size)
            if activations is not None:
                probs = activations.reshape((x.shape[0], -1, 256))
            else:
                raise ValueError("Activations are None.")
        else:
            raise ValueError("No model received for `pixel_cnn`.")
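
Reading the reshape in this block, the output of the last layer of `pixel_cnn` seems to need a multiple of 256 values per sample, presumably one score per possible pixel intensity. The NumPy sketch below only illustrates that shape requirement. The array sizes are assumptions chosen for MNIST-sized images; they are not taken from ART itself.

```python
import numpy as np

batch, height, width, channels = 2, 28, 28, 1

# Assumed PixelCNN-style output: 256 scores for every pixel of every sample.
pixel_cnn_out = np.zeros((batch, height * width * channels * 256))
probs = pixel_cnn_out.reshape((batch, -1, 256))
print(probs.shape)

# A 10-class classifier output has too few values per sample for this reshape.
classifier_out = np.zeros((batch, 10))
try:
    classifier_out.reshape((batch, -1, 256))
    reshape_ok = True
except ValueError:
    reshape_ok = False
print(reshape_ok)
```

If this reading is right, the model passed as `pixel_cnn` would have to produce per-pixel outputs rather than class logits.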

If you then go to the self.pixel_cnn.get_activations() line, the get_activations method (which is found in estimator.py) is not implemented:

    @abstractmethod
    def get_activations(
        self, x: np.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False
    ) -> np.ndarray:
        """
        Return the output of a specific layer for samples `x` where `layer` is the index of the layer between 0 and
        `nb_layers - 1 or the name of the layer. The number of layers can be determined by counting the results
        returned by calling `layer_names`.

        :param x: Samples
        :param layer: Index or name of the layer.
        :param batch_size: Batch size.
        :param framework: If true, return the intermediate tensor representation of the activation.
        :return: The output of `layer`, where the first dimension is the batch size corresponding to `x`.
        """
        raise NotImplementedError

I think that is what causes the error, because the activations will always be None. If you need any more information, or if you have a fix, please let me know. Thanks.

beat-buesser commented 3 years ago

Hi @Ahmed-Hamoda, I think you are almost there. The last code segment in your previous message shows the abstract definition of get_activations, which is not the code that actually runs; TensorFlowV2Classifier provides its own implementation. The issue is that TensorFlowV2Classifier.get_activations currently supports only models of type tf.keras.Sequential. For any other model type, ART 1.6 and earlier return None, while ART 1.7 and later raise an exception with an explanation.
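
To illustrate the point about the abstract definition, here is a minimal pure-Python sketch. `ToyEstimator` and `ToyClassifier` are hypothetical stand-ins, not ART classes. The sketch only shows that the body of an abstract method is bypassed once a concrete subclass overrides it, and that returning None for an unsupported model would look like the behaviour described for ART 1.6 and earlier.

```python
from abc import ABC, abstractmethod


class ToyEstimator(ABC):
    """Hypothetical stand-in for an abstract estimator base class."""

    @abstractmethod
    def get_activations(self, x, layer):
        # Not reached when a concrete subclass overrides this method.
        raise NotImplementedError


class ToyClassifier(ToyEstimator):
    """Hypothetical concrete estimator that overrides the abstract method."""

    def __init__(self, model_is_sequential):
        self.model_is_sequential = model_is_sequential

    def get_activations(self, x, layer):
        if not self.model_is_sequential:
            # Mimics the silent None described for older versions.
            return None
        # Placeholder "activations": just double each input value.
        return [value * 2 for value in x]


unsupported = ToyClassifier(model_is_sequential=False).get_activations([1, 2], layer=-1)
supported = ToyClassifier(model_is_sequential=True).get_activations([1, 2], layer=-1)
print(unsupported, supported)
```

In this toy version the NotImplementedError in the base class is never raised; the None comes from the subclass.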

Could you please try to convert your model class TensorFlowModel from tensorflow.keras.Model to tf.keras.Sequential?
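
A possible sketch of that conversion, using the same layers as the TensorFlowModel class in the script above. The input shape (28, 28, 1) is an assumption based on MNIST images, and the second MaxPool2D instance stands in for the reuse of self.maxpool, since pooling layers hold no weights.

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPool2D

# Same layer stack as TensorFlowModel, expressed as tf.keras.Sequential.
# The input shape is an assumption (MNIST images with one channel).
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(28, 28, 1)),
        Conv2D(filters=4, kernel_size=5, activation="relu"),
        MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding="valid"),
        Conv2D(filters=10, kernel_size=5, activation="relu"),
        MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding="valid"),
        Flatten(),
        Dense(100, activation="relu"),
        Dense(10, activation="linear"),  # logits, matching from_logits=True
    ]
)

print(model.output_shape)
```

The rest of the script should be able to stay as it is, with this model passed to TensorFlowV2Classifier in place of the subclassed one.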

Ahmed-Hamoda commented 3 years ago

That fixed the error, thanks a lot!