Decomposing improvements in accuracy

For some unpublished work, I decomposed accuracy improvements into the waterfall described below. Please let me know whether this would make sense to include.

Suppose:

Model M, trained on X_train, classifies X_test with 90% top-1 accuracy over k=[10] classes.
Attack A on M generates X_test_adv.
M classifies X_test_adv with 1% top-1 accuracy.
Model M_defended applies a defense (e.g. adversarial retraining) and achieves 30% top-1 accuracy on X_test_adv.
Model M_seed is identical to M except for a different random seed during training.
M_seed achieves nearly identical performance to M on X_test.
M_seed achieves 15% accuracy on X_test_adv due to imperfect transferability of attacks.

Decomposition:

Random Guess: A trivial model that guesses uniformly at random over k=[10] classes would achieve 1/k = [10%] accuracy. The first 9 percentage points of improvement (from 1% to 10%) are attributable to a TRIVIAL RANDOM-GUESSING MODEL. This is not a very impressive part of the improvement in accuracy.

Different training seed: M_seed achieves 15% on X_test_adv. The next 5 percentage points of improvement (from 10% to 15%) are attributable to NON-TRANSFERABILITY. This is also unimpressive.

Defense: M_defended achieves 30% on X_test_adv. Only the final 15 percentage points of improvement (from 15% to 30%) can be attributed to the defense.
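
The arithmetic of the decomposition can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not code from the unpublished work: the function name `decompose_accuracy` and its parameter names are hypothetical, and all accuracies are assumed to be fractions measured on X_test_adv.

```python
def decompose_accuracy(acc_undefended, acc_defended, acc_seed, num_classes):
    """Split the gain from acc_undefended to acc_defended into waterfall steps.

    Hypothetical helper for illustration. All accuracies are fractions in
    [0, 1] measured on the adversarial test set (X_test_adv).
    """
    random_guess = 1.0 / num_classes  # accuracy of uniform random guessing

    # Milestones in the order used by the waterfall.
    levels = [
        ("random guess", random_guess),
        ("different training seed", acc_seed),
        ("defense", acc_defended),
    ]

    steps = {}
    previous = acc_undefended
    for name, level in levels:
        # Each step is credited only with the gain over the previous milestone.
        steps[name] = level - previous
        previous = level
    return steps


# Numbers from the example above: M gets 1%, M_seed 15%, M_defended 30%, k=10.
steps = decompose_accuracy(0.01, 0.30, 0.15, 10)
for name, gain in steps.items():
    print(f"{name}: {gain * 100:+.0f} percentage points")
```

The sketch assumes the milestones are ordered as in the example (random guess ≤ M_seed ≤ M_defended). If a milestone falls below the previous one, the corresponding step comes out negative and the waterfall would need a different reading.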

