val-iisc / NuAT

Towards Efficient and Effective Adversarial Training, NeurIPS 2021
MIT License

Towards Efficient and Effective Adversarial Training

This repository contains code for the implementation of our NeurIPS 2021 paper Towards Efficient and Effective Adversarial Training. Accompanying resources can be found here: [video] [poster]

Trained model checkpoints can be found here.

Nuclear Norm Adversarial Training (NuAT)

In this work, we propose a novel Nuclear Norm regularizer to improve the adversarial robustness of Deep Networks through the use of single-step adversarial training. Training with the proposed Nuclear Norm regularizer enforces function smoothing in the vicinity of clean samples by incorporating joint batch-statistics of adversarial samples, thereby resulting in enhanced robustness.

Nuclear-Norm based Attack: In a given minibatch, we consider X to be the matrix composed of vectorized pixel values (arranged row-wise) of each image, Δ to be a matrix of the same dimension as X consisting of independently sampled Bernoulli noise, and Y to be the matrix containing the corresponding ground truth one-hot vectors. The following loss function, which utilizes the pre-softmax values, is maximized for the generation of single-step adversaries:
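
As a hedged reconstruction of this objective, the block below writes it out in LaTeX. The symbol names are assumptions made for illustration: X is the matrix of vectorized images, Δ the Bernoulli-noise matrix, Y the one-hot label matrix, f_θ the network's pre-softmax output, ℓ_CE the cross-entropy loss, ‖·‖_* the Nuclear Norm (sum of singular values), and λ a weighting coefficient. The exact form and the values of λ should be checked against the paper.

```latex
\[
\widetilde{L} \;=\; \ell_{CE}\big(f_\theta(X + \Delta),\, Y\big)
\;+\; \lambda \,\big\lVert f_\theta(X + \Delta) - f_\theta(X) \big\rVert_{*}
\]
```

Under this reading, the regularizer is computed on the whole batch-by-classes matrix of logit differences, so the perturbation of each image depends on the other images in the minibatch.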

In single-step Nuclear Norm Adversarial Training (NuAT), the following loss function is minimized during training:
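
A sketch of this training objective in LaTeX, assuming the following notation: X is the clean minibatch, X + Δ its single-step adversaries (Δ being the perturbation after the attack step), Y the one-hot labels, f_θ the pre-softmax output, ℓ_CE the cross-entropy loss, ‖·‖_* the Nuclear Norm, and λ an assumed name for the regularization weight:

```latex
\[
L \;=\; \ell_{CE}\big(f_\theta(X),\, Y\big)
\;+\; \lambda \,\big\lVert f_\theta(X + \Delta) - f_\theta(X) \big\rVert_{*}
\]
```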

The first term in the above equation corresponds to the cross-entropy loss on clean samples, and the second term corresponds to the Nuclear-Norm of the difference in pre-softmax values of clean images and their corresponding single-step adversaries.
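
The two terms described above can be illustrated with a minimal NumPy sketch. It is not the repository's training code: the function names (`nuclear_norm`, `softmax_cross_entropy`, `nuat_training_loss`) and the weight `lam` are hypothetical, the logits are passed in as plain arrays rather than produced by a network, and any normalization of the regularizer (for example by batch size) is left out because it is not specified here.

```python
import numpy as np


def nuclear_norm(matrix):
    """Nuclear norm of a 2-D array: the sum of its singular values."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())


def softmax_cross_entropy(logits, onehot):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(onehot * log_probs).sum(axis=1).mean())


def nuat_training_loss(clean_logits, adv_logits, onehot, lam):
    """Cross-entropy on clean logits plus a weighted nuclear-norm term.

    clean_logits, adv_logits: arrays of shape (batch, classes) holding
    pre-softmax values for clean images and their adversaries.
    lam: hypothetical name for the regularization weight.
    """
    ce = softmax_cross_entropy(clean_logits, onehot)
    # The norm is taken over the whole (batch, classes) difference matrix,
    # so it depends jointly on all samples in the minibatch.
    reg = nuclear_norm(adv_logits - clean_logits)
    return ce + lam * reg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 10, size=8)
    onehot = np.eye(10)[labels]
    clean = rng.normal(size=(8, 10))
    adv = clean + 0.1 * rng.normal(size=(8, 10))
    print(nuat_training_loss(clean, adv, onehot, lam=1.0))
```

When the adversarial logits equal the clean logits, the second term vanishes and the loss reduces to plain cross-entropy, which is a quick sanity check for any implementation of this objective.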

Results on CIFAR-10

We summarise the robust white-box evaluations on the CIFAR-10 dataset below. We kindly refer the reader to our paper for further details on the two-step adversarial training method (NuAT2), exponential weight averaging (NuAT-WA) and Hybrid Nuclear Norm Adversarial Training (NuAT-H).

Environment Settings

Citing this work

@inproceedings{
sriramanan2021towards,
title={Towards Efficient and Effective Adversarial Training},
author={Gaurang Sriramanan and Sravanti Addepalli and Arya Baburaj and Venkatesh Babu Radhakrishnan},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=kuK2VARZGnI}
}