greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Adversarial training definition #861

Open agitter opened 6 years ago

agitter commented 6 years ago

@cgreene suggested that we improve our adversarial training definition in the imaging applications section. See Ian Goodfellow's tweet and Casey's tweet for reference.

Our definition in Table 1 may be better, but we can consider updating that as well to be consistent.

cgreene commented 6 years ago

More good thoughts on the topic: https://twitter.com/catherineols/status/984583946129166337

evancofer commented 6 years ago

@cgreene This seems like a reasonable improvement, and I favor the more precise language.

Domain-specific semantics like this are important, but unambiguous, up-to-date definitions can be hard to pin down. Is there a more permanent source than a tweet or a blog post, perhaps a review, that we can cite for this definition?

stephenra commented 6 years ago

Here's Ian's tweet:

The definition of "adversarial examples" I prefer these days is "Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake"

And the excerpt from the Imaging section:

Adversarial training examples are constructed by selecting targeted small transformations to input data that cause a model to produce very different outputs.

And the 'Adversarial training' section from Table 1:

A process by which artificial training examples are maliciously designed to fool a NN and then input as training examples to make the resulting NN robust (no relation to GANs)

I think the two main issues with the Imaging section definition are that (1) it focuses on the mechanism of construction when it could instead highlight intent (e.g. "Adversarial training examples are deliberately constructed...") and (2) it focuses on small transformations. Ian makes the point further down the Twitter thread that adversarial examples aren't necessarily small perturbations. Earlier work on adversarial examples/training, notably Ian's paper, emphasized minimizing or imposing constraints on ε, the size of the adversarial perturbation, but work since then has experimented with varying ε.
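
For concreteness, here is a minimal sketch of that ε-constrained construction in the style of the fast gradient sign method from Ian's paper; the PyTorch model, loss function, and epsilon value are just illustrative assumptions, not the paper's method:

```python
import torch

def fgsm_example(model, x, y, loss_fn, epsilon):
    """Construct an adversarial example whose perturbation is bounded by epsilon
    (fast gradient sign method; illustrative sketch only)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # Take a single step of size epsilon in the direction that increases the loss.
    return (x_adv + epsilon * x_adv.grad.sign()).detach()
```

Varying (or dropping) the epsilon bound corresponds to the later work mentioned above, where the perturbation is no longer required to be small.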

@evancofer Unfortunately, much like 'disentangled representations', I don't think there is any real unified, formal definition just yet.
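
That said, the Table 1 notion of adversarial training is at least mechanically simple to illustrate. A rough sketch of one training step, again assuming a PyTorch model/optimizer and reusing the FGSM-style construction above, might look like:

```python
import torch

def adversarial_training_step(model, x, y, loss_fn, optimizer, epsilon):
    """One training step on clean plus adversarially perturbed inputs (sketch)."""
    # Craft adversarial versions of the current batch (FGSM-style, as above).
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()

    # Train on both the original and the adversarial examples so the
    # resulting network becomes robust to the perturbations that fooled it.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```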

cgreene commented 6 years ago

I wonder if adversarial training could benefit from a broader focus. One thing we don't really have, though it comes up a little in the discussion, is a set of "general techniques/ideas to watch". Would it be worthwhile to unify these closer to the front (maybe even as an expansion of the intro), since a fair number of these techniques will probably come up in the bio-focused parts of the paper?

Adversarial examples, data augmentation, various forms of regularization all come to mind.

cgreene commented 6 years ago

I guess I'm wondering if we should have a "things you should probably think about when trying to get this stuff to actually work in bio/medicine" section, and then mention these techniques more briefly when they actually get used later.

Something roughly like the idea of the last part of this presentation: https://youtu.be/-PuchwaPdPg?t=29m50s