openphilanthropy / unrestricted-adversarial-examples

Contest Proposal and infrastructure for the Unrestricted Adversarial Examples Challenge
Apache License 2.0
327 stars 62 forks

Has this contest been abandoned? #85

Closed lemairecarl closed 2 years ago

lemairecarl commented 3 years ago

I'm running into problems while trying to install unrestricted-advex; it seems that the installation steps are not up to date. For instance, there is a requirement for tensorflow>=1.0.0, but it's not clear whether the package is only compatible with TF 1.x. It does not seem to work with TF 2.4.1.

When trying to run the following for the warmup:

eval_kit.evaluate_bird_or_bicycle_model(my_very_robust_model)

I get:

Traceback (most recent call last):
  File "/home/carl/source/advattk/op_warmup.py", line 2, in <module>
    from unrestricted_advex import eval_kit
  File "/home/carl/source/unrestricted-adversarial-examples/unrestricted-advex/unrestricted_advex/eval_kit.py", line 14, in <module>
    from unrestricted_advex.mnist_baselines import mnist_utils
  File "/home/carl/source/unrestricted-adversarial-examples/unrestricted-advex/unrestricted_advex/mnist_baselines/mnist_utils.py", line 8, in <module>
    from tensorflow.examples.tutorials.mnist import input_data
ModuleNotFoundError: No module named 'tensorflow.examples'

There hasn't been any activity for about a year, and the README mentions that an update was coming soon, but it never came. So I'm left supposing that this contest has been abandoned. Can you please confirm whether that's the case?

crsegerie commented 2 years ago

Oh, that's a shame.

I think this contest was very original and has the potential to bring a lot to the community. Is it possible to find a post mortem somewhere to understand the difficulties that led to the abandonment of the competition?

carlini commented 2 years ago

I can write something up longer later, but to tell the story briefly: Tom and Katherine went on to do bigger things than this contest (GPT-3 anyone?) and I got lazy.

Maybe less briefly on my side, because I only have information about me: as with everything, there's a cost/benefit tradeoff to doing work on any given topic. Currently the costs (for me) are high, and the benefits (as they appear to me) are low.

Costs. Running contests takes time, both to set up and to keep running. Setting things up is where things are stalled now: I have to try to draft rules that (a) make the contest good for researchers, and (b) make the contest something lawyers are willing to stand behind for awarding money. You can't just say "we're going to give $$$$ to whoever we want", that's bad. So you have to spell out who wins. What are the conditions? What if someone doesn't like them? What if someone cheats?

But even if this gets fixed, the costs then shift to running things continuously. Who keeps things up to date? Who reviews submissions and handles logistics? I only have so much time and this would eat into it quite a bit.

Benefits. All of the above costs could easily be "worth it" if the potential benefit here was high. The reason why I don't think this is the case is that, to be honest, I don't have any faith that anything remotely secure will work in the next several years. We're like 3 orders of magnitude away from robust classification on perturbation budgets alone, and then another 3 orders of magnitude away on the accuracy of classifiers. Like 90% of defenses that get published at top venues can be broken in a few hours of effort.

I believed this before, too, and I was happy to run it because I thought it would encourage more research on unrestricted attacks and defenses. But in the same way that giving people $$$ every time they put forward a purported proof that P=NP (and then $$ to the first person to find a flaw) will probably strictly increase the number of people working in this field, it's not obvious that it would get things actually solved much faster.

The other benefit to this contest, in my mind at the time, was that it would have given us something to point to and say "look, see how hard things are, we can't even classify birds vs bicycles!" But I think the ML community has by now basically come to understand that these things are brittle. We don't really need a contest to show that.

The bottom line. I still think there is immense value in this challenge existing, and people trying to solve it. I'm just not sure if the value (for me) of running a contest around the challenge is worth it. If someone else wanted to make this happen, I still think it would be a net positive for the community. More people working on this problem would be good. I think this contest has the right setup, more than any other contest. Feel free to email me if you're reading this in the future and you're that someone. I'd be happy to spend some amount of time helping someone else make this happen.

lemairecarl commented 2 years ago

Thanks for writing this, it's appreciated.