cleverhans-lab / cleverhans

An adversarial example library for constructing attacks, building defenses, and benchmarking both
MIT License

Issues with Keras-trained model and the Carlini-Wagner L2 attack #307

Closed vinayprabhu closed 6 years ago

vinayprabhu commented 6 years ago

I have not been able to successfully generate adversarial examples on the MNIST dataset when the model is trained in Keras. Quite possibly I am doing something rather silly here.

The Jupyter notebook demonstrating this issue is here:

https://github.com/vinayprabhu/Lyapunov_defense/blob/master/MNIST_CW_Fail_Keras.ipynb

Also, the attack parameters:

cw_params = {'binary_search_steps': 15,
             'y_target': adv_ys,
             'max_iterations': 50000,
             'learning_rate': 1,
             'batch_size': source_samples * nb_classes if targeted else source_samples,
             'initial_const': 0.1,
             'verbose': False}
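For readers without the notebook at hand, here is a minimal sketch of how a targeted batch matching the `batch_size` expression above could be assembled. The values `source_samples = 10` and `nb_classes = 10`, the random stand-in images, and the use of `np.repeat`/`np.tile` are assumptions for illustration (chosen so the batch has 100 rows); the notebook may build `adv_inputs` and `adv_ys` differently.

```python
import numpy as np

# Assumed values (not taken from the notebook): 10 source images and
# 10 classes give a targeted batch of 100 rows.
source_samples = 10
nb_classes = 10
targeted = True

# Random stand-in for the first `source_samples` MNIST test images.
x_test = np.random.rand(source_samples, 28, 28, 1).astype(np.float32)

if targeted:
    # Repeat every image once per target class: img0 x10, img1 x10, ...
    adv_inputs = np.repeat(x_test, nb_classes, axis=0)
    # One-hot targets 0..nb_classes-1, tiled once per source image.
    adv_ys = np.tile(np.eye(nb_classes, dtype=np.float32), (source_samples, 1))
else:
    adv_inputs = x_test
    adv_ys = None

batch_size = source_samples * nb_classes if targeted else source_samples
```

In this layout, row `i * nb_classes + j` pairs source image `i` with target class `j`, so the label matrix has shape `(batch_size, nb_classes)`.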

goodfeli commented 6 years ago

I haven't run the ipython notebook. What specifically goes wrong?

vinayprabhu commented 6 years ago

With the MNIST architecture and the specific images used in the examples, 'Successfully generated adversarial examples on 100 of 100 instances' was a given. However, the number of successfully generated adversarial examples drops dramatically when the model is trained in Keras and imported into TF.

With the CW parameters set to:

cw = CarliniWagnerL2(model, back='tf', sess=sess)
cw_params = {'binary_search_steps': 15,
             'y_target': adv_ys,
             'max_iterations': 50000,
             'learning_rate': 1,  # Now trying with higher learning rate
             'batch_size': source_samples * nb_classes if targeted else source_samples,
             'initial_const': 0.1,
             'verbose': False}

adv = cw.generate_np(adv_inputs, **cw_params)

I get:

[INFO 2017-11-03 11:24:06,239 cleverhans] Constructing new graph for attack CarliniWagnerL2
[DEBUG 2017-11-03 11:24:06,591 cleverhans] Running CWL2 attack on instance 0 of 100
[DEBUG 2017-11-03 11:24:06,607 cleverhans] Binary search step 0 of 15
[DEBUG 2017-11-03 11:24:06,656 cleverhans] Iteration 0 of 50000: loss=8.98 l2=0 f=0.1
[DEBUG 2017-11-03 11:24:35,039 cleverhans] Iteration 5000 of 50000: loss=9.78 l2=0.00807 f=0.1
[DEBUG 2017-11-03 11:24:35,039 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:24:35,041 cleverhans] Successfully generated adversarial examples on 12 of 100 instances.
[DEBUG 2017-11-03 11:24:35,042 cleverhans] Mean successful distortion: 0.5183
[DEBUG 2017-11-03 11:24:35,043 cleverhans] Binary search step 1 of 15
[DEBUG 2017-11-03 11:24:35,048 cleverhans] Iteration 0 of 50000: loss=87.9 l2=0 f=0.1
[DEBUG 2017-11-03 11:25:04,936 cleverhans] Iteration 5000 of 50000: loss=89.2 l2=0.0126 f=0.1
[DEBUG 2017-11-03 11:25:04,938 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:25:04,940 cleverhans] Successfully generated adversarial examples on 13 of 100 instances.
[DEBUG 2017-11-03 11:25:04,941 cleverhans] Mean successful distortion: 0.7094
[DEBUG 2017-11-03 11:25:04,942 cleverhans] Binary search step 2 of 15
[DEBUG 2017-11-03 11:25:04,947 cleverhans] Iteration 0 of 50000: loss=869 l2=0 f=0.1
[DEBUG 2017-11-03 11:25:33,665 cleverhans] Iteration 5000 of 50000: loss=945 l2=1.94 f=0.1
[DEBUG 2017-11-03 11:25:33,666 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:25:33,667 cleverhans] Successfully generated adversarial examples on 21 of 100 instances.
[DEBUG 2017-11-03 11:25:33,668 cleverhans] Mean successful distortion: 1.308
[DEBUG 2017-11-03 11:25:33,669 cleverhans] Binary search step 3 of 15
[DEBUG 2017-11-03 11:25:33,674 cleverhans] Iteration 0 of 50000: loss=7.93e+03 l2=0 f=0.1
[DEBUG 2017-11-03 11:26:01,821 cleverhans] Iteration 5000 of 50000: loss=5.9e+03 l2=7.74 f=0.1
[DEBUG 2017-11-03 11:26:27,807 cleverhans] Iteration 10000 of 50000: loss=5.95e+03 l2=8.26 f=0.1
[DEBUG 2017-11-03 11:26:27,808 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:26:27,810 cleverhans] Successfully generated adversarial examples on 29 of 100 instances.
[DEBUG 2017-11-03 11:26:27,810 cleverhans] Mean successful distortion: 1.986
[DEBUG 2017-11-03 11:26:27,811 cleverhans] Binary search step 4 of 15
[DEBUG 2017-11-03 11:26:27,816 cleverhans] Iteration 0 of 50000: loss=7.13e+04 l2=0 f=0.1
[DEBUG 2017-11-03 11:26:53,880 cleverhans] Iteration 5000 of 50000: loss=3.19e+04 l2=29.7 f=0.1
[DEBUG 2017-11-03 11:27:18,578 cleverhans] Iteration 10000 of 50000: loss=3.12e+04 l2=30.9 f=0.1
[DEBUG 2017-11-03 11:27:41,984 cleverhans] Iteration 15000 of 50000: loss=2.99e+04 l2=31.6 f=0.1
[DEBUG 2017-11-03 11:28:06,907 cleverhans] Iteration 20000 of 50000: loss=2.98e+04 l2=31.5 f=0.1
[DEBUG 2017-11-03 11:28:29,756 cleverhans] Iteration 25000 of 50000: loss=2.98e+04 l2=31.5 f=0.1
[DEBUG 2017-11-03 11:28:29,757 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:28:29,758 cleverhans] Successfully generated adversarial examples on 44 of 100 instances.
[DEBUG 2017-11-03 11:28:29,759 cleverhans] Mean successful distortion: 3.098
[DEBUG 2017-11-03 11:28:29,760 cleverhans] Binary search step 5 of 15
[DEBUG 2017-11-03 11:28:29,766 cleverhans] Iteration 0 of 50000: loss=5.67e+05 l2=0 f=0.1
[DEBUG 2017-11-03 11:28:52,962 cleverhans] Iteration 5000 of 50000: loss=2.21e+05 l2=39 f=0.1
[DEBUG 2017-11-03 11:29:16,913 cleverhans] Iteration 10000 of 50000: loss=2.18e+05 l2=39.1 f=0.1
[DEBUG 2017-11-03 11:29:39,540 cleverhans] Iteration 15000 of 50000: loss=2.15e+05 l2=39.5 f=0.1
[DEBUG 2017-11-03 11:30:05,767 cleverhans] Iteration 20000 of 50000: loss=2.15e+05 l2=39.4 f=0.1
[DEBUG 2017-11-03 11:30:27,869 cleverhans] Iteration 25000 of 50000: loss=2.15e+05 l2=39.4 f=0.1
[DEBUG 2017-11-03 11:30:51,580 cleverhans] Iteration 30000 of 50000: loss=2.14e+05 l2=39.4 f=0.1
[DEBUG 2017-11-03 11:31:16,913 cleverhans] Iteration 35000 of 50000: loss=2.15e+05 l2=39.4 f=0.1
[DEBUG 2017-11-03 11:31:16,914 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,916 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,917 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,918 cleverhans] Binary search step 6 of 15
[DEBUG 2017-11-03 11:31:16,925 cleverhans] Iteration 0 of 50000: loss=5.03e+06 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,926 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,928 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,929 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,930 cleverhans] Binary search step 7 of 15
[DEBUG 2017-11-03 11:31:16,936 cleverhans] Iteration 0 of 50000: loss=4.99e+07 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,937 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,940 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,941 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,942 cleverhans] Binary search step 8 of 15
[DEBUG 2017-11-03 11:31:16,948 cleverhans] Iteration 0 of 50000: loss=4.99e+08 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,949 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,951 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,952 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,954 cleverhans] Binary search step 9 of 15
[DEBUG 2017-11-03 11:31:16,959 cleverhans] Iteration 0 of 50000: loss=4.99e+09 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,961 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,963 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,964 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,965 cleverhans] Binary search step 10 of 15
[DEBUG 2017-11-03 11:31:16,972 cleverhans] Iteration 0 of 50000: loss=4.99e+10 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,973 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,975 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,976 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,978 cleverhans] Binary search step 11 of 15
[DEBUG 2017-11-03 11:31:16,984 cleverhans] Iteration 0 of 50000: loss=4.99e+11 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,985 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,988 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:16,989 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:16,990 cleverhans] Binary search step 12 of 15
[DEBUG 2017-11-03 11:31:16,996 cleverhans] Iteration 0 of 50000: loss=4.99e+12 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:16,997 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:16,999 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:17,000 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:17,001 cleverhans] Binary search step 13 of 15
[DEBUG 2017-11-03 11:31:17,007 cleverhans] Iteration 0 of 50000: loss=4.99e+13 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:17,008 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:17,011 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:17,012 cleverhans] Mean successful distortion: 3.435
[DEBUG 2017-11-03 11:31:17,013 cleverhans] Binary search step 14 of 15
[DEBUG 2017-11-03 11:31:17,019 cleverhans] Iteration 0 of 50000: loss=4.99e+11 l2=0 f=0.1
[DEBUG 2017-11-03 11:31:17,020 cleverhans] Failed to make progress; stop early
[DEBUG 2017-11-03 11:31:17,023 cleverhans] Successfully generated adversarial examples on 50 of 100 instances.
[DEBUG 2017-11-03 11:31:17,023 cleverhans] Mean successful distortion: 3.435
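The roughly tenfold growth of the initial loss from one binary search step to the next (8.98, 87.9, 869, ...) is consistent with the attack's trade-off constant being multiplied by 10 for instances that keep failing. Below is a minimal sketch of that per-instance update, modeled on the scheme in Carlini and Wagner's reference implementation as far as I recall it. The function name `update_const`, the `upper_cap` threshold of 1e9, and the starting upper bound of 1e10 are assumptions for illustration; the CleverHans version in use here may differ in its details.

```python
def update_const(const, lower, upper, succeeded, upper_cap=1e9):
    """One binary-search update of the trade-off constant for one instance.

    Hypothetical helper: `upper_cap` marks "no successful constant found yet".
    """
    if succeeded:
        # The attack worked at this constant, so try a smaller one.
        upper = min(upper, const)
        if upper < upper_cap:
            const = (lower + upper) / 2.0
    else:
        # The attack failed, so a larger constant is needed.
        lower = max(lower, const)
        if upper < upper_cap:
            const = (lower + upper) / 2.0
        else:
            # No upper bound known yet: grow by a factor of 10.
            const *= 10.0
    return const, lower, upper


# An instance that never succeeds, starting from initial_const = 0.1.
const, lower, upper = 0.1, 0.0, 1e10
history = [const]
for _ in range(5):
    const, lower, upper = update_const(const, lower, upper, succeeded=False)
    history.append(const)
```

Under these assumptions, an instance that fails at every step sees its constant grow tenfold per step, which would explain initial losses climbing by about an order of magnitude each time while the success count stalls at 50 of 100.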

npapernot commented 6 years ago

It seems you have commented out a line that wraps the Keras model into a CleverHans model. Let us know if this is the source of the problem.

model_wrap = KerasModelWrapper(model)

vinayprabhu commented 6 years ago

Thanks for the reply! I had the KerasModelWrapper in the code before when it threw up errors. I will update this reply with the errors.

vinayprabhu commented 6 years ago

@npapernot I get the following tensor shape error when using the model wrapper:

###################################################################

[INFO 2017-11-11 22:22:09,677 cleverhans] Constructing new graph for attack CarliniWagnerL2
[DEBUG 2017-11-11 22:22:09,956 cleverhans] Running CWL2 attack on instance 0 of 100
[DEBUG 2017-11-11 22:22:09,965 cleverhans] Binary search step 0 of 1

UnknownError Traceback (most recent call last)

in ()
     21 targeted else source_samples,
     22 'initial_const': 10}
---> 23 adv = cw.generate_np(adv_inputs, **cw_params)

/notebooks/Deep_grebox_attacks/src/cleverhans/cleverhans/attacks.pyc in generate_np(self, x_val, **kwargs)
    185 feed_dict[new_kwargs[name]] = feedable[name]
    186
--> 187 return self.sess.run(x_adv, feed_dict)
    188
    189 def get_or_guess_labels(self, x, kwargs):

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
    887 try:
    888 result = self._run(None, fetches, feed_dict, options_ptr,
--> 889 run_metadata_ptr)
    890 if run_metadata:
    891 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1118 if final_fetches or final_targets or (handle and feed_dict_tensor):
   1119 results = self._do_run(handle, final_targets, final_fetches,
-> 1120 feed_dict_tensor, options, run_metadata)
   1121 else:
   1122 results = []

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1315 if handle is None:
   1316 return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1317 options, run_metadata)
   1318 else:
   1319 return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
   1334 except KeyError:
   1335 pass
-> 1336 raise type(e)(node_def, op, message)
   1337
   1338 def _extend_graph(self):

UnknownError: InvalidArgumentError: Incompatible shapes: [100,10] vs. [100,128]
[[Node: mul_15 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](sub_9, model_1/flatten_1/Reshape)]]
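The failing `Mul` node multiplies a `[100,10]` tensor with a `[100,128]` tensor coming from `model_1/flatten_1/Reshape`. One possible reading, which is an assumption drawn only from the shapes and the node name, is that a label-shaped tensor (100 rows, 10 classes) is being combined with a 128-wide intermediate output instead of 10-wide class scores. The NumPy sketch below only reproduces that kind of mismatch with stand-in arrays; it does not diagnose the actual cause.

```python
import numpy as np

# Stand-ins for the two operands (zeros, shapes only).
labels = np.zeros((100, 10), dtype=np.float32)     # one row per instance, 10 classes
features = np.zeros((100, 128), dtype=np.float32)  # assumed 128-wide intermediate output

# Element-wise multiplication needs compatible trailing dimensions;
# 10 vs. 128 cannot be broadcast, so NumPy raises ValueError.
try:
    labels * features
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# With 10-wide class scores the same operation is well defined.
logits = np.zeros((100, 10), dtype=np.float32)
product = labels * logits
```

A quick check along these lines would be to print the shape of whatever tensor the wrapped model hands to the attack as its output and confirm that its last dimension equals the number of classes.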