bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

Why is the success rate of DeepFoolAttack on ImageNet examples so low? #68

Closed shenqixiaojiang closed 7 years ago

shenqixiaojiang commented 7 years ago

As the title says, the success rate of DeepFoolAttack on ImageNet examples is very low. I tested it with the following code:

# build the filename -> label dict
ff = open("./ILSVRC2015/ilsvrc_2012_val.txt")
mm = {}
for i in ff:
       pt = i.strip().split(' ')
       mm[pt[0]] = int(pt[1])
ff.close()

keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')
preprocessing = (numpy.array([104, 116, 123]), 1)  # (mean, std): BGR mean subtraction

fmodel = foolbox.models.KerasModel(kmodel, bounds=(0, 255), preprocessing=preprocessing)
attack  = foolbox.attacks.DeepFoolAttack(fmodel)

valpath = r'./ILSVRC2015/rgbVal/'   #the original image 
imList = os.listdir(valpath)[:40]
width = 224
adv = np.zeros((len(imList),width,width,3))  #for saving the adv images
src = np.zeros((len(imList),width,width,3))   #for saving the original images
srcLabel = np.zeros(len(imList))    #for saving the label 
for j in range(len(imList)):
      image = Image.open(valpath + imList[j])
      image = image.resize((width,width))
      image = np.asarray(image, dtype="float32")
      label = mm[imList[j]]  #get the label
      srcLabel[j] = label
      src[j] = image
      ans = attack(image[:,:,::-1], label)
      if ans is None:  # attack failed; keep the original image
          adv[j] = image
      else:
          adv[j] = ans[:,:,::-1]
      x = np.expand_dims(adv[j], axis=0)
      x = preprocess_input(x)
      preds = kmodel.predict(x)
      print('pre-label,',preds.argmax(),label)
num_classes = 1000
y_test = keras.utils.to_categorical(srcLabel, num_classes)
kmodel.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
score = kmodel.evaluate(src, y_test, verbose=0)  #get the accuracy of original images
print('Test accuracy:', score[1])
score = kmodel.evaluate(adv, y_test, verbose=0)  #get the accuracy of adv images
print('Test accuracy:', score[1])

In the end, the accuracy on the original images is no different from that on the adversarial images.

wielandbrendel commented 7 years ago

Your evaluation is wrong: you have to apply the same image preprocessing that you passed to the Foolbox model. In your example above you are not applying any preprocessing when you evaluate.
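For example (a minimal sketch; `foolbox_preprocess` is a hypothetical helper that mimics Foolbox's `(mean, std)` preprocessing tuple, using the same BGR means as in the code above):

```python
import numpy as np

def foolbox_preprocess(x, mean=np.array([104.0, 116.0, 123.0]), std=1.0):
    # Mirror Foolbox's (mean, std) preprocessing tuple: subtract the
    # per-channel mean, then divide by std, before feeding the model.
    return (x - mean) / std

# Example: a dummy BGR image with all pixels at 128 in [0, 255]
image = np.full((224, 224, 3), 128.0, dtype=np.float32)
processed = foolbox_preprocess(image)
print(processed[0, 0])  # [24. 12.  5.]
```

Only if the adversarial images go through this same transformation before `kmodel.predict` does the evaluation measure what the attack actually achieved.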

shenqixiaojiang commented 7 years ago

@wielandbrendel Now, I use the following code to get the attack examples:

adv = np.zeros((len(imList),width,width,3))  #for saving the adv images
src = np.zeros((len(imList),width,width,3))   #for saving the original images
srcLabel = np.zeros(len(imList))    #for saving the label 
for j in range(len(imList)):
      image = Image.open(valpath + imList[j])
      image = image.resize((width,width))
      image = np.asarray(image, dtype="float32")
      label = mm[imList[j]]  #get the label
      srcLabel[j] = label
      src[j] = image
      ans = attack(image[:,:,::-1], label)
      if ans is None:  # attack failed; keep the original image
          adv[j] = image
      else:
          adv[j] = ans[:,:,::-1]

np.save("srcFoolbox",src)
np.save("advFoolbox",adv)
np.save("srcLabel",srcLabel)

and use the following code to test the accuracy:

keras.backend.set_learning_phase(0)
kmodel = ResNet50(weights='imagenet')

srcLabel = np.load('srcLabel.npy')
srcdata = np.load('srcFoolbox.npy')
advdata = np.load('advFoolbox.npy')

for i in range(2):
    right = 0
    for j in range(len(srcdata)):
        if i == 0:
            image = srcdata[j]
        else:
            image = advdata[j]
        label = srcLabel[j]
        x = np.expand_dims(image, axis=0)
        x = preprocess_input(x)
        preds = kmodel.predict(x)  # prediction code taken from the Keras applications example: https://keras.io/applications/
        print('pre-label,',preds.argmax(),label)
        if preds.argmax() == label:
            right += 1
    print(right * 1.0 / len(srcdata))

With this, the accuracy on the original images is 0.675 and on the adversarial images 0.575.

wielandbrendel commented 7 years ago

Could you count how many images DeepFool thinks it found an adversarial for? I.e. please add something like

      if ans is None:
          failures += 1
          adv[j] = image
      else:
          successes += 1
          adv[j] = ans[:,:,::-1]

and report the result (failures, successes).

shenqixiaojiang commented 7 years ago

@wielandbrendel I used 40 ImageNet test images to generate attack examples; the result of (failures, successes) is (5, 35). I also tested CIFAR-10 and MNIST images, and the results there were normal.

wielandbrendel commented 7 years ago

Please check that your preprocessing is the same, i.e. that preprocess_input(x) yields the same result as fmodel._process_input(x).
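One likely mismatch worth checking (an assumption about your setup, not a confirmed diagnosis): Keras's `preprocess_input` for ResNet50 flips RGB to BGR and subtracts the exact means [103.939, 116.779, 123.68], while the Foolbox model above was given already-BGR input and the rounded means (104, 116, 123). A standalone sketch of the resulting difference:

```python
import numpy as np

def keras_caffe_preprocess(x_rgb):
    # What Keras's ResNet50 preprocess_input does: RGB -> BGR flip,
    # then subtraction of the exact ImageNet BGR means.
    x = x_rgb[..., ::-1].astype(np.float64)
    return x - np.array([103.939, 116.779, 123.68])

def foolbox_tuple_preprocess(x_bgr):
    # What the (mean, std) tuple in this thread applies to BGR input.
    return x_bgr.astype(np.float64) - np.array([104.0, 116.0, 123.0])

# Same image through both paths: a uniform gray RGB image.
rgb = np.full((2, 2, 3), 128.0)
diff = np.abs(keras_caffe_preprocess(rgb) - foolbox_tuple_preprocess(rgb[..., ::-1]))
print(diff.max())  # ~0.779, from the rounded vs. exact means
```

A sub-pixel offset like this is small, but adversarial perturbations found by DeepFool are often small too, so any systematic mismatch between the attack's input space and the evaluation's input space can flip results back.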