Closed: ricvo closed this issue 6 years ago.
Do you have a minimal working code example?
A common cause of this problem is passing probabilities where the attack expects logits. Functions used to compute probabilities, such as softmax and sigmoid, saturate strongly, so the gradient can underflow to zero.
It is also possible in theory for the cross-entropy function itself to saturate and maybe that's what you've encountered, but I haven't observed that being a problem in practice yet.
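For illustration, the usual way to avoid this in TensorFlow is to build the loss from the logits rather than from the probabilities (a sketch; y and logits here are stand-ins for your label and logit tensors):
[CODE]
# From probabilities: log(softmax(.)) saturates for confident predictions,
# and the gradient of the loss can underflow to zero.
loss_from_probs = -tf.reduce_sum(y * tf.log(tf.nn.softmax(logits)), axis=1)

# From logits: the fused op is numerically stable and keeps a usable gradient.
loss_from_logits = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)
[/CODE]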
Thanks goodfeli for the quick answer. I tried to pass logits instead of probs as you suggested, specifying the keyword 'logits' to the CallableModelWrapper. Unfortunately this did not solve the issue...
We trained a 784-200-10 network for 100 epochs on the MNIST training set. I am also attaching the weights if you want to give it a try (W0.npy, b0.npy, W1.npy, b1.npy). X_test is the MNIST test set.
[CODE]
import numpy as np
import tensorflow as tf
from cleverhans.attacks import FastGradientMethod
from cleverhans.model import CallableModelWrapper

# Load the pretrained 784-200-10 weights.
W0_np = np.load("experiments/W0.npy")
b0_np = np.load("experiments/b0.npy")
W1_np = np.load("experiments/W1.npy")
b1_np = np.load("experiments/b1.npy")

# tf.floatXX below stands for either tf.float32 or tf.float64.
W0 = tf.Variable(W0_np, dtype=tf.floatXX)
b0 = tf.Variable(b0_np, dtype=tf.floatXX)
W1 = tf.Variable(W1_np, dtype=tf.floatXX)
b1 = tf.Variable(b1_np, dtype=tf.floatXX)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

def nn(x):
    x = tf.cast(x, tf.floatXX)
    h = tf.matmul(x, W0) + b0
    h = tf.nn.relu(h)
    logits = tf.matmul(h, W1) + b1
    return logits

nn_model = CallableModelWrapper(nn, 'logits')
fgsm = FastGradientMethod(nn_model, 'tf', sess)
fgsm_params = {'eps': 0.1, 'ord': np.inf, 'clip_min': 0, 'clip_max': 1}
fgsm_adv_test = fgsm.generate_np(X_test, **fgsm_params)
[/CODE]
We generated 3 sets of adversarial examples: fgsm_ch_float32.npy, fgsm_ch_float64.npy, fgsm_tf_float64.npy. The first two are the adversarial examples generated with cleverhans using float32 and float64 respectively in the code above; the last one was generated with code written directly in TensorFlow with float64 everywhere.
We found the performance to be quite different: both sets of examples generated with cleverhans give around 48% accuracy, and around 45% of the images are not modified at all. With the examples generated directly in TensorFlow with float64, the accuracy drops to around 3% and all images are modified by FGSM.
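For reference, this is roughly how we compute those numbers (a sketch; y_test stands for the one-hot MNIST test labels, and nn, sess, X_test, fgsm_adv_test come from the snippet above):
[CODE]
# Accuracy of the network on the adversarial examples.
preds_adv = sess.run(tf.argmax(nn(fgsm_adv_test), axis=1))
accuracy = np.mean(preds_adv == np.argmax(y_test, axis=1))

# Fraction of test images the attack left completely unchanged.
unchanged = np.mean(np.all(fgsm_adv_test == X_test, axis=1))
[/CODE]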
I suspect this is because the cleverhans attacks use float32 everywhere; is there any possible workaround? We also tried to switch to float64 in attacks.py and attacks_tf.py, but not much changed; I am sure we missed something somewhere. In doing so, we also broke other attack methods, for example CW. Thanks for any insights on this. Cheers, Riccardo
If switching to float64 didn't change anything, then isn't it most likely some other problem?
Dear iamgroot42, thanks for the answer. Focusing on the FGSM algorithm, the percentage of unchanged images with cleverhans (45%) and the accuracy (50%) are very similar to what we get with the code implemented directly in TensorFlow with float32. When switching to float64 in TensorFlow the problem goes away; that is why I believe float32 is the issue.
Now I managed to solve the problem by changing float32 to float64 in attacks.py and attacks_tf.py; I also changed the tf.to_float (line 53 in attacks_tf.py) to y = tf.cast(tf.equal(preds, preds_max), dtype=tf.float64). This way I generate adversarial examples that lead to an accuracy of 5.8%, which seems better.
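For context, the label-guessing step I touched could be written with a configurable precision roughly like this (a sketch; tf_dtype stands for the chosen tf.float32 or tf.float64):
[CODE]
# Guess labels from the model's own predictions, in the requested precision.
preds_max = tf.reduce_max(preds, axis=1, keepdims=True)
y = tf.cast(tf.equal(preds, preds_max), dtype=tf_dtype)
y = y / tf.reduce_sum(y, axis=1, keepdims=True)
[/CODE]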
I wonder if there is an easy way to specify the desired precision in cleverhans. If not, this would be very easy to implement and very useful for users. Right now I need to modify the library by hand every time I change precision, which is not ideal. I think the best way of doing this would be to let the user specify the type of the x and y placeholders; do you agree?
Agreed. Letting the user specify a placeholder would be the best workaround for this.
The line you changed (53 of attacks_tf.py) isn't actually involved in this code snippet; y is provided by get_or_guess_labels in attacks.py.
I reproduced this problem, but for both float32 and float64.
BTW, if you want to make your own placeholder, or any other kind of input tensor, just use the generate method rather than generate_np.
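Roughly (an illustration, reusing the names from the snippet above):
[CODE]
# generate builds a symbolic graph on a tensor/placeholder you control:
x = tf.placeholder(tf.float64, shape=[None, 784])  # you choose the dtype here
x_adv = fgsm.generate(x, **fgsm_params)
adv_np = sess.run(x_adv, feed_dict={x: X_test})

# generate_np builds its own placeholder internally and takes numpy arrays directly:
adv_np = fgsm.generate_np(X_test, **fgsm_params)
[/CODE]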
OK, I had to edit the code in a few more places to make everything float64. With everything successfully float64, I am able to reproduce the problem at float32 and then make it go away by switching to float64.
3.7M entries of the gradient on the input get rounded to 0 for float32, none for float64
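(For reference, that count can be obtained roughly like this; the x, y placeholders and Y_test labels are illustrative names:)
[CODE]
# Count how many entries of the input gradient are exactly zero.
logits = nn(x)
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)
grad = tf.gradients(loss, x)[0]
grad_np = sess.run(grad, feed_dict={x: X_test, y: Y_test})
print((grad_np == 0).sum())
[/CODE]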
Yes, that seems to be exactly the problem I mentioned!
So, I understand from your answer that you suggest I use generate. I see that this way I can provide my own placeholder for x, and I could also modify construct_graph to take an extra argument, the type of the x placeholder; this way I would not need to build the feedable and fixed graph in my external code. But I still see some problems:
1. tf.to_float's standard behaviour is to return float32, and there are several instances of this method in the code.
2. In some generate methods there are explicit references to float32 (e.g. DeepFool, ElasticNetMethod, CarliniWagnerL2, SaliencyMapMethod, ...). I am not sure how critical float64 is for these attacks, but calling generate with a float64 placeholder might create conflicts.
I think maybe the best solution would be to pass a tf type (maybe with a check for supported types; in the beginning only float32 and float64, I guess) to Attack.__init__, where it can be stored in self.type and reused wherever needed (in all the places mentioned above).
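Roughly what I have in mind (just a sketch, not the actual cleverhans code; attribute names are illustrative):
[CODE]
class Attack(object):
    def __init__(self, model, back='tf', sess=None, dtypestr='float32'):
        # Store the requested precision once, so that generate/generate_np
        # can reuse it instead of hard-coding float32.
        assert dtypestr in ('float32', 'float64')
        self.dtypestr = dtypestr
        self.tf_dtype = tf.as_dtype(dtypestr)
        self.np_dtype = np.dtype(dtypestr)
[/CODE]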
Or how would you advise me to proceed?
I looked into this a bit more yesterday.
I didn't realize that the machinery for generate_np interfered with the machinery for generate so much. I'm pretty annoyed about that, and I think the solution might just be to remove generate_np.
Another potential solution would be to try to get TensorFlow to support a floatX feature similar to what Theano has. In the meantime we could make a cleverhans.floatX and then phase it out after TensorFlow adds it.
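Something like this, purely hypothetical (not an existing cleverhans module):
[CODE]
# cleverhans/floatX.py -- hypothetical global precision switch, in the spirit of Theano's floatX
floatX = 'float32'

def set_floatX(dtype):
    """Set the default float precision used when building attack graphs."""
    global floatX
    assert dtype in ('float32', 'float64')
    floatX = dtype
[/CODE]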
I'm going to start a conversation on cleverhans-dev
Update: it looks like generate_np doesn't actually interfere with generate, though generate_np itself doesn't support float64.
I suggest just sending small PRs like this one whenever you encounter something that doesn't support float64: https://github.com/tensorflow/cleverhans/pull/356
Ok, sure
Thanks for the feedback, I will try to send some small PRs like the one you showed in the next few days.
I can contribute to this; I will work on it and submit a pull request. I have two questions: before I modify things I want to be sure I fully understand the architecture, so that I will be able to test it properly:
Ok, I think point 1 was quite easy (correct me if there is a better way):
[CODE]
import tensorflow as tf
from cleverhans.model import CallableModelWrapper

# create_logits, input_shape, AttackClass, params, x_test and dtypestr are
# defined elsewhere; tf_dtype is tf.float32 or tf.float64.
def nn(x):
    return create_logits(x)

sess = tf.Session()
x = tf.placeholder(tf_dtype, shape=[None] + input_shape)
nn_model = CallableModelWrapper(nn, 'logits')

with sess.graph.as_default():
    attack = AttackClass(nn_model, 'tf', sess, dtypestr=dtypestr)
    x_adv = attack.generate(x, **params)

x_adv_test = sess.run(x_adv, {x: x_test})
[/CODE]
The question about point 2 still remains, though.
Regarding point 2, you would have to provide a model object from cleverhans.model.Model if you are going to call generate from the FastGradientMethod.
If you'd like to use the graph you already created, it is not recommended, but you can use the function that FastGradientMethod calls in the backend; it is here: https://github.com/tensorflow/cleverhans/blob/master/cleverhans/attacks_tf.py#L23
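For example, something along these lines (a rough sketch; the exact signature of the backend function may differ across cleverhans versions, and x, logits, sess, x_test are your own graph's names):
[CODE]
from cleverhans.attacks_tf import fgm

# x is your own placeholder, and the predictions come from your existing graph.
x_adv = fgm(x, tf.nn.softmax(logits), eps=0.1, ord=np.inf, clip_min=0., clip_max=1.)
x_adv_np = sess.run(x_adv, feed_dict={x: x_test})
[/CODE]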
I am closing this for now, feel free to reopen if #395 did not completely address your issue.
It seems that the perturbations computed by the FastGradientMethod (FGM) and BasicIterativeMethod attacks are sometimes equal to zero.
In a first preliminary test (not in cleverhans), this problem was due to the overconfidence of the network: the cross-entropy is then very flat and the gradient very small, which created an underflow problem. We solved this by switching to float64 precision in TensorFlow. I noticed cleverhans uses float32 (specified in several places in the code). I wonder if there is an easy way to specify float64 precision instead, and/or if you think there is another workaround.
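As a toy illustration of the underflow (the numbers are made up, not taken from our network):
[CODE]
import numpy as np

gap = 120.0  # logit margin of an overconfident prediction (illustrative)
print(np.exp(np.float32(-gap)))  # 0.0: underflows in float32, so the gradient saturates
print(np.exp(np.float64(-gap)))  # ~7.7e-53: still representable in float64
[/CODE]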
We are trying to produce FGSM adversarial examples for MNIST, using for example a simple fully connected architecture of 784-200-10. The accuracy of the trained network is initially around 96% on the test set. It drops to 48% on the adversarial examples generated with FGM (params: eps=0.1, ord=np.inf, clip_min=0, clip_max=1). However, we noted that 45% of the MNIST test set examples are not changed at all by the FGSM attack. Do you know if anyone else has had the same issue?