I used the code below to apply attacks to a pretrained CNN TensorFlow model on a traffic sign recognition task. I want to use an extra image for classification and build adversarial attacks on the 32x32 image the classifier model requires, but when I plot the adversarial example, it looks completely different from the original.
import foolbox
from foolbox.criteria import TargetClassProbability
from foolbox.attacks import DeepFoolAttack
from foolbox.attacks import LBFGSAttack
import numpy as np
import tensorflow as tf
from scipy import misc

# x, logits, prediction and saver are defined by the pretrained classifier graph
with tf.Session() as session:
    saver.restore(session, tf.train.latest_checkpoint('.'))
    model = foolbox.models.TensorFlowModel(x, logits, (0, 255))
    # resize to the 32x32 input the classifier expects and convert to float
    image = misc.imread('extra_signs/stop_sign.png')
    image = misc.imresize(image, (32, 32)).astype(np.float32)
    label = np.argmax(model.predictions(image))
    pred = session.run(prediction, feed_dict={x: np.array([image])})
    print(pred)
    attack = DeepFoolAttack(model)
    adversarial = attack(image, label=label)
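This is roughly how I plot the results: the original, the adversarial, and their difference side by side (a minimal sketch, assuming matplotlib is available and that adversarial is a float array in [0, 255] with the same shape as image):

import matplotlib.pyplot as plt

# show original, adversarial, and the (rescaled) perturbation side by side
fig, axes = plt.subplots(1, 3, figsize=(9, 3))
axes[0].imshow(image.astype(np.uint8))
axes[0].set_title('original')
axes[1].imshow(np.clip(adversarial, 0, 255).astype(np.uint8))
axes[1].set_title('adversarial')
diff = adversarial - image
# rescale the perturbation into [0, 1] so it is visible when displayed
axes[2].imshow((diff - diff.min()) / (diff.max() - diff.min() + 1e-12))
axes[2].set_title('perturbation (rescaled)')
for ax in axes:
    ax.axis('off')
plt.show()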
I am new to this area; could you give me any suggestions on how to generate better adversarial examples? (From what I know, a gradient-based method should not change the original image this much.)
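For reference, this is how I check how large the perturbation actually is; since DeepFool is designed to find an approximately minimal perturbation, I would expect these norms to be small (a quick sketch, assuming image and adversarial are float arrays in [0, 255]):

# quantify the perturbation: DeepFool should yield small L2/Linf values
perturbation = adversarial - image
print('L2 norm:   ', np.linalg.norm(perturbation))
print('Linf norm: ', np.abs(perturbation).max())
print('mean |d|:  ', np.abs(perturbation).mean())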
I am closing this for now because it's quite old, but please reopen if it's still an issue.
In that case, could you share the original image and the adversarial?