cleverhans-lab / cleverhans

An adversarial example library for constructing attacks, building defenses, and benchmarking both
MIT License

FastGradientMethod 'y' argument does not support one hot encoded labels. #1182

Closed aiskumo closed 4 years ago

aiskumo commented 4 years ago

The bug: When I pass true labels of shape (1,5) to fast_gradient_method, it produces an error: ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).

This is how I'm trying to implement FastGradientMethod on a sample input of shape (1,25) and one hot encoded labels of shape (1,5).

Sample input with shape (1,25):

[-0.78028318 -0.32011849 -0.63403654 -0.64475069  0.49296689  0.50249254
  0.5269165  -0.45808821 -0.529135    0.23667459  0.56318829 -1.14682274
 -0.29764395 -0.49727663 -0.67680625  0.20947455  0.50292429 -0.12424306
 -0.1677296  -2.0878724  -0.04599665  0.54236943  0.18448958 -0.54728301
 -0.16232216] 

One-hot encoded sample input labels with shape (1,5): [[1. 0. 0. 0. 0.]]

When I try to run FastGradientMethod with a pre-trained model new_model using:

adversarial_input = fast_gradient_method(model_fn=new_model, x=np.reshape(sample_input, (1,25)), y=np.reshape(sample_input_label, (1,5)), eps=0.3, norm=np.inf)

I get this error: ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2). This is odd, because the true labels are of shape (1,5) and the model's computed output is of shape (1,5) as well.

This error goes away when I pass some arbitrary scalar value in y, like y=[16]. I'm not sure why this happens. The documentation of FastGradientMethod clearly specifies that the labels should be one-hot encoded.
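As a side note, the rank-mismatch message matches the check TensorFlow's sparse softmax cross-entropy loss applies: it expects index labels whose rank is one less than the rank of the logits. The sketch below is a minimal numpy illustration of that rule (not CleverHans code; the shapes are taken from this report, the helper name is made up):

```python
import numpy as np

# Shapes from the report: logits (1, 5), one-hot labels (1, 5).
logits = np.zeros((1, 5))
one_hot_labels = np.array([[1., 0., 0., 0., 0.]])  # rank 2
index_labels = np.array([0])                       # rank 1

# TF's sparse softmax cross-entropy enforces:
#   rank(labels) == rank(logits) - 1
def sparse_rank_ok(labels, logits):
    return labels.ndim == logits.ndim - 1

sparse_rank_ok(one_hot_labels, logits)  # False -> the rank-mismatch error
sparse_rank_ok(index_labels, logits)    # True  -> why a scalar like y=[16] "works"
```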

System configuration

  • System: Google Colab
  • Python version: 3.6.9
  • TensorFlow version: 2.3.0

axantillon commented 4 years ago

Hey, have you made sure that the model you are passing into fast_gradient_method outputs raw logits rather than values passed through an activation function like softmax in the last layer? Maybe that's the problem. If not, you could try passing the true labels as an array of shape (5,) instead of a 2D (1,5) tensor. Hope this helps! What does the model you're using look like?

aiskumo commented 4 years ago

Hi, thanks for the response! Initially I was passing the input label as an array of shape (5,), which gave this error: ValueError: Shape mismatch: The shape of labels (received (5,)) should equal the shape of logits except for the last dimension (received (1, 5)). So I passed the true labels as np.reshape(sample_input_label, (1,5)), after which I got the rank-mismatch error instead.

I'm using a fully connected, feed-forward classifier with 3 ReLU layers and one softmax layer; it has 25 inputs and 5 classes. I'm curious why it doesn't accept a (1,5), (5,1), or (5,) array as true labels when one-hot encoded labels are listed as a requirement, yet accepts a single value. Perhaps I'm missing something here.

axantillon commented 4 years ago

Have you tried modifying the model so that the output is the logits from the final layer rather than the values produced by the softmax function? I pasted your error message into Google, and several other issues came up where the same error appeared after using one of TensorFlow's loss functions, so you could try that and see if it works. But yes, you're right: I have a guide I made in July 2019 (https://colab.research.google.com/drive/14Xoh7P_cLVagaT1JqZgbD6WvXKvoG7F7) where I copy-pasted the docs for the function at the time, and they don't have the note specifying that the 'y' parameter must be one-hot encoded. In the guide I use labels of shape (1,) and it works.
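For context, the logits-versus-softmax distinction above can be sketched in plain numpy: softmax turns raw logits into probabilities, so if the model's last layer already applies softmax, the attack receives probabilities instead of logits and the loss gradients are computed on the wrong quantity. This is a minimal illustration with made-up logit values, not the model from the thread:

```python
import numpy as np

def softmax(z):
    # Shift by the row max for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.5]])  # hypothetical raw outputs
probs = softmax(logits)

# probs rows sum to 1; raw logits generally do not.
```

In Keras terms, this usually means dropping the final softmax activation (or reading the pre-activation layer) when handing the model to an attack that expects logits.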

aiskumo commented 4 years ago

Thank you so much; I'll try modifying the model. I believe the docs were recently updated, since they now show one-hot encoding for the 'y' parameter, which seems a bit confusing to me.

Thank you once again for your suggestions, I'll implement them!

aiskumo commented 4 years ago

I just tried this sample_input with

logits = new_model.predict(np.reshape(sample_input, (1,25)))  # shape (1,5)
adversarial_input = fast_gradient_method(model_fn=new_model, x=np.reshape(sample_input, (1,25)), y=logits.astype('int64'), eps=eps, norm=np.inf)

but I get the same rank-mismatch error here too, even though the logits are of shape (1,5). With shape (5,) I still get an "expected (1,5)" type error. I'm not quite sure why this happens.

aiskumo commented 4 years ago

Also, I believe CleverHans uses TensorFlow's softmax_cross_entropy_with_logits_v2. Its documentation says of the labels:

Each vector along the class dimension should hold a valid probability distribution e.g. for the case in which labels are of shape [batch_size, num_classes], each row of labels[i] must be a valid probability distribution.

So by this logic the code should work with shape (1,5), since I'm passing a single one-hot encoded sample here, which makes this a bit strange.
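The two label encodings really do carry the same information: the dense loss (softmax_cross_entropy_with_logits_v2, one-hot labels) and the sparse loss (index labels) produce identical values when the one-hot rows select the same classes as the indices. A minimal numpy sketch with made-up logits (not the CleverHans implementation):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.5]])  # hypothetical, shape (1, 5)
one_hot = np.array([[1., 0., 0., 0., 0.]])       # shape (1, 5)
indices = np.argmax(one_hot, axis=-1)            # shape (1,)

log_probs = np.log(softmax(logits))

# Dense form: softmax_cross_entropy_with_logits_v2 style (one-hot labels).
dense_ce = -(one_hot * log_probs).sum(axis=-1)

# Sparse form: sparse_softmax_cross_entropy_with_logits style (index labels).
sparse_ce = -log_probs[np.arange(len(indices)), indices]
```

So the error is about which loss variant the attack ends up calling, not about any loss of information in the labels themselves.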

axantillon commented 4 years ago

Sorry, yeah... I just replicated your error, using the fast_gradient_method imported from cleverhans.future.tf2.attacks, and I can't get it to work either. It seems the 'y' argument doesn't actually support one-hot encoded labels...

For now, it looks like it works fine with the labels as indices.
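The index-label workaround mentioned above amounts to collapsing each one-hot row with argmax before calling the attack. A hedged sketch (the variable names mirror the thread; the commented-out call assumes the cleverhans.future.tf2.attacks import and is not executed here):

```python
import numpy as np

# One-hot label from the thread, shape (1, 5).
sample_input_label = np.array([[1., 0., 0., 0., 0.]])

# Collapse the one-hot row to a class index; shape becomes (1,).
y_indices = np.argmax(sample_input_label, axis=-1)

# The attack call would then look like (not run here):
# adversarial_input = fast_gradient_method(
#     model_fn=new_model, x=np.reshape(sample_input, (1, 25)),
#     y=y_indices, eps=0.3, norm=np.inf)
```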

aiskumo commented 4 years ago

Yeah, I'm using the same fast_gradient_method from the cleverhans.future.tf2.attacks package and got this error. It seems misleading for the documentation to state that the labels should be one-hot encoded.

axantillon commented 4 years ago

Right... Notice, however, that in your original message you were referencing the docs for the fast_gradient_method located in cleverhans/attacks rather than the tf2 version:

This error goes away when I enter some random scalar value [16] in y, like y=[16]. I'm not sure why this happens. The documentation of FastGradientMethod clearly specifies that the labels should be one hot encoded.

System configuration

  • System: Google Colab
  • Python version: 3.6.9
  • TensorFlow version: 2.3.0

If you look at the documentation for the method you are actually using, from cleverhans/future/tf2/attacks, here you'll see that it doesn't explicitly specify that the labels should be one-hot encoded, implying they should be class indices.

I hope this helps! Just some confusion with the docs.

aiskumo commented 4 years ago

I see. Thank you so much for clearing it up!