bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

Applying attacks, but the accuracy is still high #258

Closed Marvinmw closed 5 years ago

Marvinmw commented 5 years ago

Hi, I used some attack methods, but the val acc on the attacked data is still high. I am confused. I compared my code with the tutorial examples but cannot find any reason. I use CIFAR-10.

import numpy as np
import foolbox

# model is trained well, with 92.3% acc
# images are randomly chosen from X_test
# ground_truth holds the true labels for the selected images

def generateAdversialExample(model, images, label, attackMethod=foolbox.attacks.FGSM):
    # wrap the Keras model for Foolbox; pixel values lie in [0, 255]
    fmodel = foolbox.models.KerasModel(model, bounds=(0, 255.))
    attack = attackMethod(fmodel)
    adversial = np.zeros(images.shape)
    for i in np.arange(len(images)):
        print("Generate adversial Image {}".format(i))
        adversial[i] = attack(images[i], label[i][0])
    return adversial

fgsmx = generateAdversialExample(model, images, ground_truth, foolbox.attacks.FGSM)
dfx = generateAdversialExample(model, images, ground_truth, foolbox.attacks.DeepFoolL2Attack)

wielandbrendel commented 5 years ago

All images returned by the attack are adversarial, every one of them. If the attack cannot find an adversarial, it returns None. Hence, I am not sure what you mean by the validation accuracy still being high.
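
Since a failed attack yields None rather than an image, it may help to record which inputs failed instead of writing every result straight into a preallocated array. A minimal sketch of that bookkeeping, where `attack_fn` and `toy_attack` are hypothetical stand-ins for an attack object called as `attack(image, label)` and are not part of Foolbox:

```python
def collect_adversarials(attack_fn, images, labels):
    """Run attack_fn on each (image, label) pair and note which ones failed."""
    adversarials = []
    failed = []
    for i, (image, label) in enumerate(zip(images, labels)):
        result = attack_fn(image, label)
        if result is None:
            # the attack found no adversarial for this input
            failed.append(i)
        adversarials.append(result)
    return adversarials, failed


# Toy stand-in (hypothetical): "succeeds" only when the label is even.
def toy_attack(image, label):
    return [pixel + 1 for pixel in image] if label % 2 == 0 else None


advs, failed = collect_adversarials(toy_attack, [[0, 0], [1, 1], [2, 2]], [0, 1, 2])
print("failed indices:", failed)
```

Keeping the failed indices separate makes it possible to leave those inputs out of, or count them explicitly in, any accuracy computed afterwards.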

Marvinmw commented 5 years ago

I get the adversarial images X_adv of X, and I compute the val acc for each of them. The val acc of X_adv is almost the same as the val acc of X.

Marvinmw commented 5 years ago

Anyway, after I come home, I will try it again.

wielandbrendel commented 5 years ago

Probably the way you are testing the images is different then. All images coming out of the attack should have a flipped label if evaluated with fmodel.predictions(adversarial).
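
One way to run that check is to count how many returned adversarials change the predicted class under the same prediction function the attack used. A rough sketch, assuming `predict_fn` returns per-class scores for a single image (as `fmodel.predictions` is expected to); `count_flipped` and `toy_predict` are hypothetical names for illustration only:

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda k: scores[k])


def count_flipped(predict_fn, adversarials, labels):
    """Count adversarials whose predicted class differs from the true label."""
    flipped = 0
    found = 0
    for adv, label in zip(adversarials, labels):
        if adv is None:
            # the attack failed for this input, nothing to evaluate
            continue
        found += 1
        if argmax(predict_fn(adv)) != label:
            flipped += 1
    return flipped, found


# Toy two-class stand-in (hypothetical): class 1 when the pixel sum is positive.
def toy_predict(image):
    s = sum(image)
    return [-s, s]


adversarials = [[-1.0, -2.0], None, [3.0, 0.5]]
labels = [1, 0, 0]
flipped, found = count_flipped(toy_predict, adversarials, labels)
print("{} of {} adversarials flip the label".format(flipped, found))
```

If `flipped` is lower than `found` when evaluated through the wrapped model, the evaluation path (preprocessing, bounds, or the model instance itself) probably differs from what the attack saw.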

Marvinmw commented 5 years ago

I found the problem. It was caused by building two identical models under different names and loading the same weights into both. With both present, the result is wrong; if I delete one, the result is good. It is weird, because the other model was not actually doing anything.

wielandbrendel commented 5 years ago

Great to hear that you could solve the problem!