bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

Attack poorly on 32*32 image #284

Closed theodore3131 closed 5 years ago

theodore3131 commented 5 years ago

I used the code below to apply attacks to a pretrained TensorFlow CNN for a traffic sign recognition task. I want to classify an extra image and build adversarial attacks on the 32×32 image (the input size required by the classifier), but when I plot the adversarial example, it is completely different from the original one.

import numpy as np
import tensorflow as tf
from scipy import misc

import foolbox
from foolbox.criteria import TargetClassProbability
from foolbox.attacks import DeepFoolAttack
from foolbox.attacks import LBFGSAttack

# saver, x, logits and prediction are defined earlier when building the graph
with tf.Session() as session:
    saver.restore(session, tf.train.latest_checkpoint('.'))

    # wrap the restored graph as a foolbox model with pixel bounds (0, 255)
    model = foolbox.models.TensorFlowModel(x, logits, (0, 255))
    image = misc.imread('extra_signs/stop_sign.png')

    label = np.argmax(model.predictions(image))
    pred = session.run(prediction, feed_dict={x: np.array([image])})
    print(pred)
    attack = DeepFoolAttack(model)
    adversarial = attack(image, label=label)
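
(One thing worth double-checking: the classifier expects 32×32 inputs, so the extra image may need to be resized before the attack. A minimal sketch, assuming the era-appropriate misc.imresize is available; it is just one way to do the resize:)

from scipy import misc
import numpy as np

# resize the extra image to the 32x32 input size the classifier expects
# (sketch; misc.imresize is deprecated in newer SciPy releases)
image = misc.imread('extra_signs/stop_sign.png')
if image.shape[:2] != (32, 32):
    image = misc.imresize(image, (32, 32))
image = np.asarray(image, dtype=np.float32)  # float input within the (0, 255) bounds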

I am new to this area, so could you give me any suggestions on how to generate better adversarial examples? (From what I know, gradient-based methods should not change the original image this much.)
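
(To quantify how large the change actually is, here is a minimal sketch, assuming image and adversarial are the arrays from the code above:)

import numpy as np

# measure the size of the adversarial perturbation
perturbation = adversarial.astype(np.float64) - image.astype(np.float64)
print('L2 norm:  ', np.linalg.norm(perturbation))
print('Linf norm:', np.abs(perturbation).max())

If I remember correctly, calling the attack with unpack=False instead returns an Adversarial object whose distance attribute reports this directly.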

jonasrauber commented 5 years ago

I am closing this for now because it's quite old, but please reopen if it's still an issue. In that case, could you share the original image and the adversarial one?