cleverhans-lab / cleverhans

An adversarial example library for constructing attacks, building defenses, and benchmarking both
MIT License

Fix generate_np bug #150

Closed goodfeli closed 7 years ago

goodfeli commented 7 years ago

When generate_np is called more than once with different arguments, it just replays the first graph.

I think it would be best to fix this by just removing generate_np, because some keyword arguments need to trigger generation of new graphs and others don't. This seems like too subtle a point for most Attack implementers to handle.

carlini commented 7 years ago

This would have fairly large consequences for attacks that are not easy (or perhaps even impossible) to implement fully symbolically.

For example, if you wanted to generate adversarial examples on a few thousand iterations using the attack algorithm I have, or JSMA as currently implemented, you would have to construct thousands of tf.gradients() operations, instead of caching it and just doing one.

goodfeli commented 7 years ago

You can use the while loop to avoid making a large unrolled graph: https://www.tensorflow.org/api_docs/python/tf/while_loop

Or you can wrap python code up to make a tensorflow op: https://www.tensorflow.org/api_docs/python/tf/py_func
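To illustrate the py_func route, here is a minimal TF1-style sketch (written against the tf.compat.v1 shim so it also runs on TF2 installs; `numpy_attack` is a hypothetical stand-in for a NumPy-based attack such as JSMA):

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style API via the compat shim

tf.disable_eager_execution()

def numpy_attack(x):
    # Hypothetical stand-in for a NumPy-based attack routine.
    return np.clip(x + 0.1, 0.0, 1.0).astype(np.float32)

x = tf.placeholder(tf.float32, shape=[None, 3])
# Wrap the Python function as a graph op: the graph is built once
# and can then be run on many different inputs via feed_dict.
x_adv = tf.py_func(numpy_attack, [x], tf.float32)

with tf.Session() as sess:
    out = sess.run(x_adv, feed_dict={x: np.zeros((2, 3), np.float32)})
```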

carlini commented 7 years ago

Oh, I think I was unclear what I meant by "iterations" -- I didn't mean iterations of the attack algorithm, I meant iterations of calling attack.generate(). If we have to construct a new graph each time, that would be expensive.

goodfeli commented 7 years ago

Oh, for that, it's straightforward, you just feed a different value each time.

carlini commented 7 years ago

This requires re-generating the entire graph each time though, right? You have to pay the cost of doing that every time you call generate(), when ideally you'd only construct the graph once and use it every time after that.

goodfeli commented 7 years ago

No, you generate the graph once and feed many different values through it. The tutorials already do this: https://github.com/openai/cleverhans/blob/master/tutorials/mnist_tutorial_tf.py
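The pattern from the tutorial, sketched minimally in TF1 style (the `x + 0.1` op is just a stand-in for a real attack graph such as FGSM):

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style API via the compat shim

tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[None, 2])
x_adv = x + 0.1  # stand-in for a symbolic attack graph (e.g. FGSM)

with tf.Session() as sess:
    # One graph, many batches: only the fed value changes between calls.
    a = sess.run(x_adv, feed_dict={x: np.zeros((1, 2), np.float32)})
    b = sess.run(x_adv, feed_dict={x: np.ones((1, 2), np.float32)})
```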

carlini commented 7 years ago

Yeah, but that only works because the FGS attack is a fully symbolic TF graph. Could you do this same thing with the JSMA code without having to call tf.gradients() each time you want a new adversarial example?

goodfeli commented 7 years ago

Yes, by wrapping a python function that does the JSMA code in a tf op using tf.py_func

carlini commented 7 years ago

Oh, I didn't realize that would do that, okay, now I understand what you're saying. Then yes, I agree just removing generate_np would be the easiest solution.

npapernot commented 7 years ago

The JSMA code is already being wrapped with py_func in the generate() method of SaliencyMapMethod https://github.com/openai/cleverhans/blob/master/cleverhans/attacks.py#L356

npapernot commented 7 years ago

To put this issue in context with previous design choices: in the Attacks interface PR #75, we discussed how multiple calls to generate_np should be handled here https://github.com/tensorflow/cleverhans/pull/75/files/6b6cf82fd22b63b43c9cd8af790ead7ed99b9195#diff-b52d3b38c185a956192766f2aa9284cd and here https://github.com/tensorflow/cleverhans/pull/75#issuecomment-289622743

So the current implementations of generate_np should generate a graph, stored as self._x_adv, that takes input self.x (for instance https://github.com/tensorflow/cleverhans/blob/master/cleverhans/attacks.py#L80).

If the graph needs additional inputs (like a target label placeholder) that are given different values at each call, the placeholder is stored as an attribute when parse_params is called.

This may not be ideal, so suggestions to improve the behavior of generate_np are welcome.

carlini commented 7 years ago

So I see three paths forward.

  1. Ian's recommendation: just delete the generate_np functionality. Make generate the only method, and wrap everything with a py_func if you can't do it symbolically. Since this doesn't have the performance implications I thought it would, this is a reasonable option. The only potential problem with it is that now the interface is slightly less clean for a new user who just wants to generate an attack, since they'll have to create a tf placeholder.

  2. Make generate_np call generate() every time it receives arguments. Don't save the graph. This is potentially very slow to do, but we could document it with a big warning saying something to that effect. This method would be used only for one-off adversarial example generation and not if you want to have it in a tight loop (which is probably already the case).

  3. Make generate_np call generate() every time it receives a new set of arguments. Cache the graph objects we've used so far. If the arguments are the same as a previous computation, then re-use that older graph. If not, make a new graph. This makes it as efficient as generate() while keeping the nice easy-to-use functionality. The only difficulty is that now everyone who wants to implement an attack has to handle this. There might be some nice way to abstract this into the Attack super class, though, that makes it easier to handle.
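A minimal pure-Python sketch of the caching in option (3); the class and helper names here are hypothetical, and building/running the TF graph is stubbed out:

```python
class CachingAttack:
    """Sketch of option (3): cache one graph per distinct set of kwargs.

    _build_graph stands in for the symbolic construction (tf.gradients
    etc.) a real attack would perform.
    """

    def __init__(self):
        self._graph_cache = {}
        self.graphs_built = 0  # instrumentation for this example

    def _build_graph(self, **kwargs):
        self.graphs_built += 1
        return ("graph", tuple(sorted(kwargs.items())))

    def generate_np(self, x, **kwargs):
        key = tuple(sorted(kwargs.items()))  # kwargs values must be hashable
        if key not in self._graph_cache:
            self._graph_cache[key] = self._build_graph(**kwargs)
        graph = self._graph_cache[key]
        # A real implementation would now run the cached graph on x
        # (sess.run with a feed_dict); here we just return x unchanged.
        return x

attack = CachingAttack()
attack.generate_np([0.0], eps=0.3)
attack.generate_np([1.0], eps=0.3)  # same kwargs: cached graph is reused
attack.generate_np([1.0], eps=0.1)  # new kwargs: a second graph is built
```

Keying the cache on the full kwargs tuple sidesteps the subtle distinction between arguments that change the graph and those that could merely be fed.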

Personally, I like option (3) followed by (1) and then (2). I like the generate_np() method as a nice simple interface for someone who just wants an adversarial example, and if this comes at the cost of some more implementation work, so be it.

But whatever we do, I do think that the current behavior of using old arguments is counterintuitive at best and should be changed.

goodfeli commented 7 years ago

I think everyone agrees the current behavior is a bug.

I would do 3 if I were working alone, but it seems unlikely that we'd be able to efficiently explain the situation to all future Attack subclass implementers.

Subclass implementers will probably do the right thing if their implementation really is numpy based. If their implementation is based on wrapping symbolic code, they'll probably leave out the caching mechanism or implement the caching mechanism incorrectly.

We could make a default implementation of the correct caching mechanism in the Attack superclass, but this would require imposing some hard rules (like that every value in kwargs is fed into a placeholder, and every argument that changes the graph structure must go into init). This makes the implementation of generate more annoying. I don't think that generate_np is important enough that we should redesign generate to make it feasible.

npapernot commented 7 years ago

It sounds like the best option is to deprecate generate_np and remove it in 6 months.

Should we issue a warning when the user calls generate_np multiple times, noting that it will reuse the first graph?

goodfeli commented 7 years ago

Actually, here's another thought:

generate_np is OK if it's actually implemented in NumPy. I don't think there's any known bug with the default implementation of generate that wraps generate_np. What if we keep generate_np as a method that subclasses can optionally implement if they want to make the default implementation of generate wrap their NumPy implementation? This would mean the user can't expect generate_np to work for every class, but some classes that are most naturally implemented in NumPy could still use it.

carlini commented 7 years ago

If the interface is going to exist, then I would argue it should exist for all attacks. I think the API would be more confusing if some attacks implement generate_np and some don't.

Another possible option would be to require that the method exist, but leave it up to the attack implementer whether to call generate() fresh every time (and pay the performance hit) or to do some graph caching.

npapernot commented 7 years ago

It might make sense to have two abstract classes inherit from Attack: SymbolicAttack and NonSymbolicAttack. Non-abstract attack classes would then inherit from one of these two rather than from Attack directly.

The two abstract classes would make it easier for developers/users to understand the different implementations required by each type of attack and the implications in terms of expected behavior.

The NonSymbolicAttack child classes would implement only generate_np (using NumPy), and generate would be provided by the NonSymbolicAttack class definition by wrapping generate_np.

The SymbolicAttack child classes would implement generate only.
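A pure-Python sketch of this proposed hierarchy; all class names besides Attack are from the proposal above, `ToyNumpyAttack` is hypothetical, and the tf.py_func wrapping in NonSymbolicAttack.generate is left as a stub:

```python
from abc import ABC, abstractmethod

class Attack(ABC):
    pass

class SymbolicAttack(Attack):
    """Attacks expressed as a symbolic (TF) graph implement generate only."""
    @abstractmethod
    def generate(self, x_symbolic, **kwargs):
        ...

class NonSymbolicAttack(Attack):
    """NumPy-based attacks implement generate_np; generate is derived."""
    @abstractmethod
    def generate_np(self, x, **kwargs):
        ...

    def generate(self, x_symbolic, **kwargs):
        # In real code this would wrap generate_np with tf.py_func;
        # left as a stub in this sketch.
        raise NotImplementedError

class ToyNumpyAttack(NonSymbolicAttack):
    # Hypothetical attack; a list-based stand-in for a NumPy implementation.
    def generate_np(self, x, eps=0.1, **kwargs):
        return [min(max(v + eps, 0.0), 1.0) for v in x]

adv = ToyNumpyAttack().generate_np([0.0, 0.5, 0.95], eps=0.25)
```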

carlini commented 7 years ago

If NonSymbolicAttack has a generate method (that wraps generate_np) why not have SymbolicAttack have a generate_np method (that wraps generate)?

npapernot commented 7 years ago

I was proposing this scheme with the solution of having only some classes implement generate_np in mind. If we decide to go the other route, and have all classes implement generate_np, then it does not necessarily apply anymore.

carlini commented 7 years ago

That makes sense.

goodfeli commented 7 years ago

I set up an in-person meeting with Nicolas and Nicholas to discuss this in more detail. If anyone would like to join the meeting (e.g. via hangouts) reply and let me know. We'll write back to the list with our conclusions.

carlini commented 7 years ago

I have started implementing this in my PR. The Attack class is now up to date and FGSM works with it. I will now continue with updating the other attacks.

carlini commented 7 years ago

PR #198 now implements this fix.