Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Generating adversarial examples from a restored model causes example generation to run endlessly #1224

Closed · Abhishek2271 closed this 3 years ago

Abhishek2271 commented 3 years ago

Hi,

This is not exactly a feature request but a question; maybe it is already implemented. Is there any way to debug, or to see what is happening, while adversarial examples are being generated? For instance:

attack = FastGradientMethod(estimator=classifier, eps=0.2)
x_test_adv = attack.generate(x=p_test)

In the above code, which generates adversarial examples using FGSM, I restored a custom model trained to classify SVHN data and created an ART classifier from it. But when I run attack.generate, the code runs endlessly. I reduced the p_test data to contain only 1 sample, but it still did not work. Is there any way I can see what is happening?

Again, I am sorry if this is not exactly a bug or feature request. I do not know where to ask it.

beat-buesser commented 3 years ago

Hi @Abhishek2271 Thank you very much for using ART! Most of the attacks have a verbose option to show progress while generate is running, but FastGradientMethod does not, as it usually runs very fast. Is your CPU/GPU busy when you run your code? When you stop the run, what does the stack trace show?
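
For attacks that do expose the flag, a minimal sketch of both options (ProjectedGradientDescent is used here only as an example of an attack with a verbose option; the logging setup is standard Python, since ART logs through the logging module):

import logging

from art.attacks.evasion import ProjectedGradientDescent

# raise the log level so ART's own log messages become visible
logging.basicConfig(level=logging.INFO)

# verbose=True shows a per-batch progress bar for attacks that support it
attack = ProjectedGradientDescent(estimator=classifier, eps=0.2, verbose=True)
x_test_adv = attack.generate(x=p_test)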

Abhishek2271 commented 3 years ago

Hi @beat-buesser, thank you for the reply. When I checked, the CPU usage does spike up:

[screenshot: Task Manager showing the CPU usage spike]

Also, I checked with a simple classifier.predict call:

predictions = classifier.predict(p_test)

Here, classifier is an ART TensorFlowClassifier, and this also runs endlessly.

what does the stack-trace show?

I think it is blank. How do I view the stack trace? I mean that the output stays blank while the code is running.

beat-buesser commented 3 years ago

@Abhishek2271 Usually the terminal where you run the program will contain a stack trace of where the execution was at the moment you interrupted the running program; this can give an indication of which line your program is on at that time, whether it has actually started, etc.
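
If interrupting does not print anything, Python's standard faulthandler module can dump the stack of a running program periodically; a minimal sketch (generic Python, not an ART API):

import sys
import faulthandler

# dump the traceback of all threads every 30 seconds,
# so a hang reveals the line the program is stuck on
faulthandler.dump_traceback_later(30, repeat=True, file=sys.stderr)

x_test_adv = attack.generate(x=p_test)  # the call suspected of hanging

faulthandler.cancel_dump_traceback_later()  # stop the periodic dumps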

I see you are using Windows. Could you please describe how you are running the Python script? Would you be able to share the complete script? It would be interesting to see the code before the attack.

Abhishek2271 commented 3 years ago

Hi @beat-buesser,

Thank you for your interest and time on this issue. I am using a Jupyter notebook to run the Python scripts, although I also tried running them as .py files. Doing this also did not show anything in the terminal.

The script is as below:

import tensorflow as tf
print(tf.__version__)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.utils import to_categorical

from sklearn.model_selection import train_test_split

import tensorflow.keras
import tensorflow.keras.layers as layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
#from tensorflow.keras.utils.np_utils import to_categorical
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.layers import Dense, Flatten, Conv2D, AveragePooling2D

import datetime
from time import time

import scipy.io
import os
import math
import tensorflow.contrib.eager as eager

#begin to import data
eager.enable_eager_execution()
def load_data(path):
    """ Helper function for loading a MAT-File"""
    data = scipy.io.loadmat(path)
    return data['X'], data['y']
x_test, y_test = load_data(r'c:\tmp\SVHN\test_32x32.mat')
x_test = x_test.astype('float32')
#normalize the input
x_test /= 255
# flatten labels to a 1-D array (reshape returns a new array, so assign it back)
y_test = y_test.reshape(-1)
# rearrange image dims from (H, W, C, B) to (B, H, W, C)
x_test = np.transpose(x_test, (3, 0, 1, 2))
# select only 10 inputs for now
p_test = x_test[:10]
# resize input to 40x40 to match the model's input signature
p_test = tf.image.resize(p_test, [40, 40]).numpy()
#print(p_test.shape)
# SVHN encodes digit 0 as label 10; map it back to 0
y_test[y_test == 10] = 0
tf.compat.v1.disable_eager_execution()
print(y_test.shape)
# select only 10 output labels for now
q_test = y_test[:10]
print(q_test.shape)

# create the ART classifier from the restored model
from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import TensorFlowClassifier
graph = tf.Graph()
with graph.as_default():
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
        global s
        s = sess
        sess.run(tf.global_variables_initializer())
        #_ = tf.Variable(initial_value='fake_variable')
        new_saver = tf.train.import_meta_graph(r"C:\Users\sab\Downloads\AI Testing\Source\Dorefanet\tensorpack\examples\DoReFa-Net\train_log\svhn-dorefa-1,2,4\graph-0707-160817.meta")
        new_saver.restore(s, tf.train.latest_checkpoint(r"C:\Users\sab\Downloads\AI Testing\Source\Dorefanet\tensorpack\examples\DoReFa-Net\train_log\svhn-dorefa-1,2,4"))    
        logits = tf.get_collection("logits")[0]   
        #x = graph.get_tensor_by_name("conv0/W:0")
        #var_23 = [v for v in tf.global_variables() if v.name == "conv0/W:0"][0]
        tf.initialize_all_variables().run()
        #print(x.eval())
        #print(logits) 
        input_ph = tf.placeholder(tf.float32, shape=[None, 40, 40, 3])
        labels_ph = tf.placeholder(tf.int32, shape=[None, 10])
        loss = tf.reduce_mean(tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=labels_ph))
        optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
        train = tf.train.exponential_decay(
                learning_rate=1e-3,
                global_step=2,
                decay_steps=4721 * 100,
                decay_rate=0.5, staircase=True, name='learning_rate')
        min_ = 0.0
        max_ = 1.0
        classifier = TensorFlowClassifier(
            clip_values=(min_, max_),
            input_ph=input_ph,
            output=logits,
            labels_ph=labels_ph,
            train=train,
            loss=loss,
            learning=None,
            sess=sess,
            preprocessing_defences=[],
            )         
        classifier.fit(p_test, q_test, batch_size=1, nb_epochs=1)
        predictions_org = classifier.predict(p_test)

The model that I restored has the following graph:

def build_graph(self, image, label):
    fw, fa, fg = get_dorefa(BITW, BITA, BITG)

    # monkey-patch tf.get_variable to apply fw
    def binarize_weight(v):
        name = v.op.name
        # don't binarize first and last layer
        if not name.endswith('W') or 'conv0' in name or 'fc' in name:
            return v
        else:
            logger.info("Binarizing weight {}".format(v.op.name))
            return fw(v)

    def nonlin(x):
        if BITA == 32:
            return tf.nn.relu(x)
        return tf.clip_by_value(x, 0.0, 1.0)

    def activate(x):
        return fa(nonlin(x))

    image = image / 256.0

    with remap_variables(binarize_weight), \
            argscope(BatchNorm, momentum=0.9, epsilon=1e-4), \
            argscope(Conv2D, use_bias=False):
        logits = (LinearWrap(image)
                  .Conv2D('conv0', 48, 5, padding='VALID', use_bias=True)
                  .MaxPooling('pool0', 2, padding='SAME')
                  .apply(activate)
                  # 18
                  .Conv2D('conv1', 64, 3, padding='SAME')
                  .apply(fg)
                  .BatchNorm('bn1').apply(activate)

                  .Conv2D('conv2', 64, 3, padding='SAME')
                  .apply(fg)
                  .BatchNorm('bn2')
                  .MaxPooling('pool1', 2, padding='SAME')
                  .apply(activate)
                  # 9
                  .Conv2D('conv3', 128, 3, padding='VALID')
                  .apply(fg)
                  .BatchNorm('bn3').apply(activate)
                  # 7

                  .Conv2D('conv4', 128, 3, padding='SAME')
                  .apply(fg)
                  .BatchNorm('bn4').apply(activate)

                  .Conv2D('conv5', 128, 3, padding='VALID')
                  .apply(fg)
                  .BatchNorm('bn5').apply(activate)
                  # 5
                  .Dropout(rate=0.5 if self.training else 0.0)
                  .Conv2D('conv6', 512, 5, padding='VALID')
                  .apply(fg).BatchNorm('bn6')
                  .apply(nonlin)
                  .FullyConnected('fc1', 10)())
    tf.nn.softmax(logits, name='output')
    tf.add_to_collection("logits", logits)

I used the "logits" collection here as the output tensor for the ART classifier after restoring.
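
This works because TF1 collections are serialized into the meta graph: a tensor registered with tf.add_to_collection at training time can be recovered by name after import_meta_graph. A minimal, self-contained sketch of that round trip (hypothetical paths and layer names, TF 1.x API):

import tensorflow as tf

# --- save side: register the tensor in a named collection ---
x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
logits = tf.layers.dense(x, 10, name="fc")
tf.add_to_collection("logits", logits)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # writes model.meta (graph + collections) next to the checkpoint
    tf.train.Saver().save(sess, "/tmp/demo/model")

# --- restore side: collections travel with the meta graph ---
tf.reset_default_graph()
with tf.Session() as sess:
    saver = tf.train.import_meta_graph("/tmp/demo/model.meta")
    saver.restore(sess, tf.train.latest_checkpoint("/tmp/demo"))
    restored_logits = tf.get_collection("logits")[0]  # same tensor as above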

Abhishek2271 commented 3 years ago

Attached is the model I used: svhn-dorefa-32,32,32.zip. The dataset is the SVHN test set (test_32x32.mat) from http://ufldl.stanford.edu/housenumbers/.

beat-buesser commented 3 years ago

Hi @Abhishek2271 Thank you for the details, I will take a look and see if I can run it.

Btw, we have started using the new Discussions feature (the tab next to Pull Requests), and we move general questions over there until an issue arises.