cihangxie / NIPS2017_adv_challenge_defense

Mitigating Adversarial Effects Through Randomization

The defense success rate for CW attacks is very low. #5

Closed shudong-zhang closed 5 years ago

shudong-zhang commented 5 years ago

Hello, I used your code to test the defense against the C&W attack on an Inception-v3 model, and the defense success rate is only 60.46%. I used 5,000 images from the ImageNet training set; the accuracy of Inception-v3 on these clean images is 100%, and after the C&W attack it drops to 0.00%. I don't know whether there is a problem with my test dataset.

cihangxie commented 5 years ago

Could you list the parameter settings of your C&W attack?

BTW, why do you use images from the training set?

shudong-zhang commented 5 years ago

These are my C&W attack parameters:

cw_params = {
    "binary_search_steps": 5,
    "confidence": 0.9,
    "max_iterations": 100,
    "learning_rate": 0.1,
    "batch_size": FLAGS.batch_size,
    "initial_const": 10,
    "abort_early": True,
    "clip_min": -1.,
    "clip_max": 1.
}
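
For context, a minimal sketch of how such a parameter dict is consumed, assuming the CleverHans v2.x attack API (the wrapped model object and the TF session are placeholders, not from this repo):

import tensorflow as tf
from cleverhans.attacks import CarliniWagnerL2

# `model` is assumed to be a cleverhans.model.Model wrapper around
# Inception-v3, and `sess` an active tf.Session.
x = tf.placeholder(tf.float32, shape=[None, 299, 299, 3])
attack = CarliniWagnerL2(model, sess=sess)
x_adv = attack.generate(x, **cw_params)  # symbolic adversarial examples

Note that clip_min=-1 and clip_max=1 presume inputs preprocessed to the [-1, 1] range, as is standard for Inception models.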

To be honest, I don't know why I used the training set for testing. Would results on the training set differ much from those on the validation set?

shudong-zhang commented 5 years ago

I got similar results with 5,000 validation-set images. Can you share your C&W attack parameters with me?

cihangxie commented 5 years ago

Sure. My attack parameters are:

cw_params = {
    'binary_search_steps': 3,
    'abort_early': True,
    'max_iterations': 250,
    'learning_rate': 0.001,
    'batch_size': FLAGS.batch_size,
    'initial_const': 100,
    'nb_classes': num_classes,
    'confidence': 0,
    'clip_min': 0.0,
    'clip_max': 1.0
}
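
If it helps, a sketch of how one might measure accuracy on the resulting adversarial examples (everything except the CleverHans calls, i.e. images_np, labels_np, model, sess, is an assumption for illustration):

import numpy as np
import tensorflow as tf

# `x` and `x_adv` are built as in the snippet above; `images_np` is a
# batch of test images and `labels_np` their integer class labels.
logits = model.get_logits(x_adv)   # cleverhans Model API
preds = tf.argmax(logits, axis=1)
preds_np = sess.run(preds, feed_dict={x: images_np})
print('accuracy under C&W:', np.mean(preds_np == labels_np))

Also note the clip range here is [0, 1], so this setting assumes inputs scaled to [0, 1] rather than the [-1, 1] range used above.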

Please let me know if you can get similar results.

shudong-zhang commented 5 years ago

Thank you for sharing. With these parameters, the accuracy after randomization is 95.58%.

cihangxie commented 5 years ago

Glad to see you solved this issue. I guess the main reason is that your original script set confidence = 0.9, while my script sets confidence = 0; a higher confidence value makes C&W search for adversarial examples with a larger misclassification margin, which are stronger and harder to defend against.
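
For readers, a minimal sketch of the random resize-and-pad input transform that the paper's defense applies at inference time (TF 1.x; the 299 → rnd → 331 sizes follow the paper, and the helper name is hypothetical, not the repo's exact code):

import tensorflow as tf  # TF 1.x

def random_resize_pad(x, base=299, final=331):
    # Randomly resize the base x base input to rnd x rnd, rnd in [base, final).
    rnd = tf.random_uniform((), base, final, dtype=tf.int32)
    x = tf.image.resize_images(
        x, tf.stack([rnd, rnd]),
        method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
    # Zero-pad to final x final at a random offset.
    pad_total = final - rnd
    pad_top = tf.random_uniform((), 0, pad_total + 1, dtype=tf.int32)
    pad_left = tf.random_uniform((), 0, pad_total + 1, dtype=tf.int32)
    x = tf.image.pad_to_bounding_box(x, pad_top, pad_left, final, final)
    x.set_shape([None, final, final, 3])
    return x

Because rnd and the padding offsets are resampled on every forward pass, an attacker cannot anticipate the exact transformation its gradients will face.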

To defend against such strong adversarial attacks, I would recommend our recent work: https://github.com/facebookresearch/ImageNet-Adversarial-Training