openphilanthropy / unrestricted-adversarial-examples

Contest Proposal and infrastructure for the Unrestricted Adversarial Examples Challenge
Apache License 2.0

Unrestricted Adversarial Examples Challenge

In the Unrestricted Adversarial Examples Challenge, attackers submit arbitrary adversarial inputs, and defenders are expected to assign low confidence to difficult inputs while retaining high confidence and accuracy on a clean, unambiguous test set. You can learn more about the motivation and structure of the contest in our recent paper.

This repository contains code for the warm-up to the challenge, as well as the public proposal for the contest. We are currently accepting defenses for the warm-up.


Current Status (Updated April 2020)

The latest submission, by Chongli Qin et al., claims to have solved the warm-up to the challenge. We are verifying the submission with our advisory board and preparing to launch the full-fledged version of the contest.

Leaderboard for the warm-up to the contest

We include three attacks in the warm-up to the contest: the spatial grid attack, the SPSA attack, and the boundary attack.

The top few distinct models for each dataset are shown below. You can see all submissions in the full scoreboard.

Two-Class MNIST dataset

| Defense | Submitted by | Clean data | Spatial grid attack | SPSA attack | Boundary attack | Submission Date | Open Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MadryPGD LeNet Baseline | Google Brain | 100.0% | 0% | 19.6% | 0% | Sept 14th, 2018 | Yes |
| Undefended LeNet Baseline | Google Brain | 100.0% | 0% | 0% | 0% | Sept 14th, 2018 | Yes |

All percentages above correspond to the model's accuracy at 80% coverage.
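
To make "accuracy at 80% coverage" concrete, here is a minimal sketch. It assumes the metric keeps the 80% of inputs on which the model is most confident, treats the rest as abstentions, and scores only the kept inputs. The function name `accuracy_at_coverage` and the sample data are hypothetical illustrations, not taken from the contest's evaluation code, which may differ in details such as tie-breaking or rounding.

```python
def accuracy_at_coverage(confidences, correct, coverage=0.8):
    """Accuracy over the `coverage` fraction of inputs with the highest confidence.

    Hypothetical helper: assumes the model abstains on its least confident inputs.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    # Number of inputs the model must answer; the small epsilon guards
    # against floating-point products landing just below an integer.
    num_covered = int(len(confidences) * coverage + 1e-9)
    if num_covered == 0:
        raise ValueError("not enough inputs for the requested coverage")
    # Rank inputs from most to least confident and keep the top ones.
    ranked = sorted(zip(confidences, correct), key=lambda pair: pair[0], reverse=True)
    kept = ranked[:num_covered]
    return sum(1 for _, is_correct in kept if is_correct) / num_covered


# Made-up example: ten inputs, three of them misclassified.
confidences = [0.99, 0.97, 0.95, 0.93, 0.91, 0.89, 0.87, 0.85, 0.55, 0.51]
correct = [True, True, True, False, True, True, True, True, False, False]

print(accuracy_at_coverage(confidences, correct))       # 0.875
print(accuracy_at_coverage(confidences, correct, 1.0))  # 0.7
```

In this example the two least confident inputs are dropped, so only the one confident mistake counts against the model. This illustrates why a defense benefits from assigning low confidence to inputs it is likely to get wrong.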

Bird or Bicycle dataset

| Defense | Submitted by | Clean data | Common corruptions | Spatial grid attack | SPSA attack | Boundary attack | Submission Date | Open Source |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LLR_ADV_TRAIN | Chongli Qin & Jonathan Uesato | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | Dec 14th, 2019 | Yes |
| TRADESv2 | Hongyang Zhang (CMU) & Xin Li (Lehigh Univ.) | 100.0% | 100.0% | 99.5% | 100.0% | 95.0% | Jan 17th, 2019 | No |
| Keras ResNet (trained on ImageNet) | Google Brain | 100.0% | 99.2% | 92.2% | 1.6% | 4.0% | Sept 29th, 2018 | Yes |
| Pytorch ResNet (trained on bird-or-bicycle extras) | Google Brain | 98.8% | 74.6% | 49.5% | 2.5% | 8.0% | Oct 1st, 2018 | Yes |

All percentages above correspond to the model's accuracy at 80% coverage.

Submitting a defense for the warm-up

The warm-up before the contest is currently underway and is accepting submissions. If you have additional questions, feel free to submit a new GitHub issue with the "question" tag and we will respond shortly.

The contest

The contest phase will begin after the warm-up attacks have been conclusively solved. We have published the contest proposal and are soliciting feedback from the community.

Paper

You can learn more about the motivation and structure of the contest in our recent paper:

Unrestricted Adversarial Examples
Tom B. Brown, Nicholas Carlini, Chiyuan Zhang, Catherine Olsson, Paul Christiano and Ian Goodfellow
arXiv preprint

@article{unrestricted_advex_2018,
  title = {Unrestricted Adversarial Examples},
  author = {{Brown}, T.~B. and {Carlini}, N. and {Zhang}, C. and {Olsson}, C. and
      {Christiano}, P. and {Goodfellow}, I.},
  journal = {arXiv preprint arXiv:1809.08352},
  year = {2018}
}