cleverhans-lab / cleverhans

An adversarial example library for constructing attacks, building defenses, and benchmarking both
MIT License

Projected gradient descent code seems not working #1051

Closed: ywu36 closed this issue 5 years ago

ywu36 commented 5 years ago

Describe the bug
In PGD (https://github.com/tensorflow/cleverhans/blob/master/cleverhans/attacks/projected_gradient_descent.py), line 82, adv_x = x + eta initializes adv_x with a random perturbation. I found that the gradient with respect to adv_x cannot be computed after this line. Here is a check:

tf.gradients(self.model.get_logits(adv_x),adv_x)

It returns [None] as the gradient.

What I'm trying to show is that adv_x = x + eta will lead to an error later in FGM.generate(adv_x, **fgm_params), where the author computes the gradient via tf.gradients(loss, adv_x). Because loss is a function of x, not of adv_x, this yields [None] as the computed gradient and causes an error downstream.

To Reproduce
Steps to reproduce the behavior:

  1. Set a breakpoint after the line: adv_x = x + eta
  2. Run: tf.gradients(self.model.get_logits(adv_x), adv_x)
  3. Observe that it returns [None]

To Reproduce (even simpler)

  1. Run: tf.gradients(x+1, x+1)
  2. Observe that it returns [None]
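The [None] result above can be mimicked without TensorFlow. The sketch below is a hypothetical, minimal reverse-mode tape (the Node class and grad function are illustrative, not TF or cleverhans APIs): a gradient can only be traced back to a node that actually appears in the graph, and each evaluation of `x + 1` builds a distinct node.

```python
class Node:
    """A toy graph node: holds a value and (parent, local_gradient) edges."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents

    def __add__(self, c):
        # Adding a constant creates a NEW node each time, just as
        # `x + 1` creates a new op in a TF1 graph.
        return Node(self.value + c, parents=((self, 1.0),))

def grad(output, wrt):
    """Return d(output)/d(wrt), or None if `wrt` is not a node
    reachable from `output` -- mirroring tf.gradients returning [None]."""
    grads = {id(output): 1.0}
    stack = [output]
    while stack:
        node = stack.pop()
        g = grads[id(node)]
        for parent, local in node.parents:
            grads[id(parent)] = grads.get(id(parent), 0.0) + g * local
            stack.append(parent)
    return grads.get(id(wrt))

x = Node(3.0)
assert grad(x + 1, x) == 1.0       # gradient w.r.t. the original variable: fine
assert grad(x + 1, x + 1) is None  # the two `x + 1` ops are distinct nodes
```

The second assertion is the analogue of `tf.gradients(x+1, x+1)` returning [None]: the differentiation target is a brand-new op, not part of the path from the output back to the variable.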

Expected behavior The gradient should not be None, otherwise FGM will fail later.

System configuration

ftramer commented 5 years ago

I don't think you can take gradients with respect to operations (adv_x, or x+1 in your simpler example, represents an operation on tensors). Taking the gradient with respect to the original variable x works, though.

ywu36 commented 5 years ago

Hi ftramer,

Thanks for the reply. That's exactly what I thought it would be.

tf.gradients(self.model.get_logits(adv_x), adv_x): not working
tf.gradients(self.model.get_logits(adv_x), x): works

Unfortunately, the author passes adv_x to the function FGM.generate(adv_x, **fgm_params), which later computes tf.gradients(loss, adv_x); but loss is a function of x, not of adv_x, so the code does not go through.

ftramer commented 5 years ago

This works because of the semantics of tf.while_loop. In each loop iteration, you're computing gradients with respect to the previous iteration of the variable x_adv. Do you actually get an error when you run the full PGD attack? All the tests pass on my side.
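To make the loop semantics concrete, here is a hedged numpy sketch of the PGD structure (the toy loss, parameter names, and values are illustrative, not cleverhans code): each iteration re-evaluates the gradient at the current iterate, so the framework only ever differentiates the loss at the point fed in that step, never "through" the random-start op itself.

```python
import numpy as np

def loss(z):
    # Toy loss to ascend: half squared norm (stands in for the model loss).
    return 0.5 * np.sum(z ** 2)

def loss_grad(z):
    # Analytic gradient of the toy loss at the current iterate.
    return z

def pgd(x, eps=0.3, eps_iter=0.05, nb_iter=10):
    rng = np.random.default_rng(0)
    # Random start inside the eps-ball, like `adv_x = x + eta`.
    adv_x = x + rng.uniform(-eps, eps, size=x.shape)
    for _ in range(nb_iter):
        g = loss_grad(adv_x)                      # gradient at the CURRENT iterate
        adv_x = adv_x + eps_iter * np.sign(g)     # FGM-style ascent step
        adv_x = np.clip(adv_x, x - eps, x + eps)  # project back into the eps-ball
    return adv_x

x = np.zeros(4)
adv = pgd(x)
assert np.all(np.abs(adv - x) <= 0.3 + 1e-9)  # stays inside the ball
assert loss(adv) > loss(x)                    # loss was pushed up
```

In the real attack the gradient comes from autodiff rather than an analytic formula, but the control flow is the same: the differentiation target at each step is the iterate produced by the previous step, which is why the loop as a whole works.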

ywu36 commented 5 years ago

Yes, here is something I'm running:

pgd = ProjectedGradientDescent(model)
pgd_params = {'clip_min': 0.,
              'clip_max': 1.}
adv_x = pgd.generate(model.input, **pgd_params)

And it reports an error. This seems like a standard way to initialize the attack.

I am not fully convinced that tf.while_loop makes it work. In the first iteration, the gradient of loss needs to be computed w.r.t. adv_x, but the node name of adv_x in the graph is "add:0", while the name of node x is "input:0". Because adv_x is not a variable in the graph, its gradient cannot be computed.

BTW, what tests are you running?

ftramer commented 5 years ago

What is model.input? Your model should be an instance of the cleverhans.Model class. This works on my end:

model = utils_keras.cnn_model()
model_wrapper = utils_keras.KerasModelWrapper(model)
pgd = ProjectedGradientDescent(model_wrapper)
adv_x = pgd.generate(model.input, **pgd_params)

ywu36 commented 5 years ago

Hi ftramer,

The model I'm using is the faceNet model used here:

https://github.com/tensorflow/cleverhans/blob/master/examples/facenet_adversarial_faces/facenet_fgsm.py The model inherits from cleverhans.Model.

npapernot commented 5 years ago

Hi @ywu36, the code you refer to does not support the model.input syntax you used in your snippet. Are you sure this is the model you are having a problem with? If you are still facing this issue, could you reopen it and share the code that defines model in your previous snippet?

Jeevi10 commented 4 years ago

I have exactly the same problem now. I cannot compute gradients for the iterative method: it gives None gradients and can't proceed further. But I was able to run exactly the same settings with FGSM and craft adversarial examples.