Hey, have you made sure that the model you are passing into the fast_gradient_method function outputs logits, rather than values that have already gone through an activation like 'softmax' in the last layer? Maybe that's the problem. If not, you could perhaps try passing the true labels as an array of size (5,) instead of a 2D (1,5) tensor. Hope this helps! What does the model that you're using look like?
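Something like this is what I mean by outputting logits; just a rough sketch, and the layer sizes are made up, not your actual model:

import tensorflow as tf

# Hypothetical classifier: the last Dense layer has no activation,
# so the model returns raw logits instead of softmax probabilities.
logits_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(25,)),
    tf.keras.layers.Dense(5)  # no 'softmax' here, outputs are logits
])

# Training still works if you tell the loss to expect logits.
logits_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])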
Hi, thanks for the response! Initially I was passing the input label as an array of size (5,), and I got this error:
ValueError: Shape mismatch: The shape of labels (received (5,)) should equal the shape of logits except for the last dimension (received (1, 5)).
So I decided to pass the true labels as np.reshape(sample_input_label, (1,5)) after which I got the rank mismatch error.
I'm using a fully-connected feed-forward classifier with 3 ReLU layers and one softmax layer. It has 25 inputs and 5 classes. I'm curious why it doesn't seem to accept a (1,5), (5,1), or (5,) array as true labels when one-hot encoded labels are listed as a requirement, but instead accepts a single value? Perhaps I'm missing something here.
Have you tried modifying the model so that you get the logits from the final layer as the output, instead of the values produced by the 'softmax' function? I pasted your error message into Google and a bunch of other issues popped up where the same error appeared after trying to use one of TensorFlow's loss functions... You could maybe try that and see if it works. But yes, you're right: I have this guide I made in July 2019 (https://colab.research.google.com/drive/14Xoh7P_cLVagaT1JqZgbD6WvXKvoG7F7) where I copy-pasted the docs for the function at the time, and they don't appear to have the note specifying that the 'y' parameter must be one-hot encoded. In the guide I use labels of shape (1,) and it works.
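If the softmax is its own layer, something like this might work to grab the logits; just a sketch assuming your new_model ends in a separate softmax layer:

import tensorflow as tf

# Assumes new_model ends in a separate softmax layer, i.e.
# ... -> Dense(5) -> Activation('softmax'). The input to that last
# layer is then the pre-softmax logits.
logits_model = tf.keras.Model(inputs=new_model.input,
                              outputs=new_model.layers[-1].input)

If the softmax is instead the activation argument of the last Dense layer, you'd have to rebuild that layer with activation=None and copy the weights over instead.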
Thank you so much; I'll try modifying the model. I believe the docs were recently updated, since they now show one-hot encoding for the 'y' parameter, which just seems a bit confusing to me.
Thank you once again for your suggestions; I'll implement them!
I just tried this sample_input with
logits = new_model.predict(np.reshape(sample_input, (1,25)))  # shape (1,5)
adversarial_input = fast_gradient_method(model_fn=new_model, x=np.reshape(sample_input, (1,25)), y=logits.astype('int64'), eps=eps, norm=np.inf)
but it seems I get the same rank-mismatch error here too, even though the logits have shape (1,5). With shape (5,) I still get an "expected (1,5)"-type error. I'm not quite sure why this happens.
Also, I believe CleverHans uses TensorFlow's softmax_cross_entropy_with_logits_v2. Its documentation says this about the labels:
Each vector along the class dimension should hold a valid probability distribution e.g. for the case in which labels are of shape [batch_size, num_classes], each row of labels[i] must be a valid probability distribution.
So by this logic, the code should work with shape (1,5), since I'm passing a single one-hot encoded sample here, which makes this a bit weird.
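Out of curiosity I compared TF's dense and sparse cross-entropy losses; the wording of the rank-mismatch error actually matches the sparse one, which expects integer class indices with rank one less than the logits. A quick standalone check, with shapes borrowed from my example:

import numpy as np
import tensorflow as tf

logits = tf.constant(np.random.randn(1, 5), dtype=tf.float32)  # shape (1,5)
one_hot = tf.constant([[1., 0., 0., 0., 0.]])                  # shape (1,5)
indices = tf.constant([0], dtype=tf.int64)                     # shape (1,)

# Dense labels: same rank as the logits, so the (1,5) one-hot row is fine.
tf.nn.softmax_cross_entropy_with_logits(labels=one_hot, logits=logits)

# Sparse labels: rank must be one less than the logits. Passing the
# (1,5) one-hot row here raises exactly the "Rank mismatch" error.
tf.nn.sparse_softmax_cross_entropy_with_logits(labels=indices, logits=logits)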
Sorry, yeah... I just replicated your error, although it was using the fast_gradient_method function imported from cleverhans.future.tf2.attacks. I can't seem to get it to work. To me, it seems as though one-hot encoded labels aren't actually supported...
For now, it looks like it works fine with the labels as indices.
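If you go with indices, you can collapse the one-hot row with argmax right before the call. A rough sketch reusing the names from your snippet (new_model, sample_input, sample_input_label):

import numpy as np
from cleverhans.future.tf2.attacks import fast_gradient_method

# Collapse the (1,5) one-hot row to a single class index, shape (1,).
y_indices = np.argmax(np.reshape(sample_input_label, (1, 5)), axis=1)

adversarial_input = fast_gradient_method(model_fn=new_model,
                                         x=np.reshape(sample_input, (1, 25)),
                                         y=y_indices,
                                         eps=0.3, norm=np.inf)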
Yeah, I'm using the same fast_gradient_method from the cleverhans.future.tf2.attacks package and got this error. It just seems misleading for the documentation to say that the labels should be one-hot encoded.
Right... Notice, however, that in your original message you were referencing the docs for the fast_gradient_method located in cleverhans/attacks instead of the tf2 version.
If you look at the documentation for the method that you are using from cleverhans/future/tf2/attacks here, you'll see that it doesn't explicitly say that the labels should be one-hot encoded, implying that they should be indices. I hope this helps! Just some confusion with the docs.
I see. Thank you so much for clearing it up!
The bug
When I try to give true labels of, say, shape (1,5) to fast_gradient_method, it produces an error:
ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
This is how I'm trying to implement FastGradientMethod on a sample input of shape (1,25) and one-hot encoded labels of shape (1,5).
Sample input with shape (1,25):
One-hot encoded sample input labels with shape (1,5):
[[1. 0. 0. 0. 0.]]
When I try to implement FastGradientMethod with a pre-trained model new_model using:
adversarial_input = fast_gradient_method(model_fn=new_model, x=np.reshape(sample_input, (1,25)), y=np.reshape(sample_input_label, (1,5)), eps=0.3, norm=np.inf)
I get this error:
ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
which is weird, because the true labels have shape (1,5) and the model's computed logits have shape (1,5) as well. This error goes away when I enter some random scalar value [16] in y, like y=[16]. I'm not sure why this happens. The documentation of FastGradientMethod clearly specifies that the labels should be one-hot encoded.
System configuration
- System: Google Colab
- Python version: 3.6.9
- TensorFlow version: 2.3.0