This is the TensorFlow implementation of the paper Meta Dropout: Learning to Perturb Latent Features for Generalization (ICLR 2020): https://openreview.net/forum?id=BJgd81SYwr.
You can reproduce the results of Table 1 in the main paper.
A machine learning model that generalizes well should obtain low errors on unseen test examples. Thus, if we knew how to optimally perturb training examples to account for test examples, we might achieve better generalization performance. However, obtaining such perturbations is not possible in standard machine learning frameworks, as the distribution of the test data is unknown. To tackle this challenge, we propose a novel regularization method, meta-dropout, which learns to perturb the latent features of training examples for generalization in a meta-learning framework. Specifically, we meta-learn a noise generator that outputs a multiplicative noise distribution for latent features, so as to obtain low errors on test instances in an input-dependent manner. The learned noise generator can then perturb the training examples of unseen tasks at meta-test time for improved generalization. We validate our method on few-shot classification datasets, where it significantly improves the generalization performance of the base model and largely outperforms existing regularization methods such as information bottleneck, manifold mixup, and information dropout.
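For intuition, below is a minimal, hypothetical sketch of the core idea, not the repo's actual code: a small noise generator predicts input-dependent statistics of a multiplicative noise applied to latent features. The single dense layer and the softplus parameterization are illustrative assumptions.

```python
import tensorflow as tf  # written against the TF 1.x API this repo uses

def perturb_latent(h, scope='noise_gen', reuse=tf.AUTO_REUSE):
    """Hypothetical multiplicative perturbation of latent features h.

    h: [batch, dim] feature tensor. A single dense layer stands in for
    the meta-learned noise generator; softplus keeps the resulting
    multiplicative noise positive.
    """
    dim = h.get_shape().as_list()[-1]
    with tf.variable_scope(scope, reuse=reuse):
        mu = tf.layers.dense(h, dim)        # input-dependent noise statistics
    eps = tf.random_normal(tf.shape(h))     # fresh Gaussian sample per call
    z = tf.nn.softplus(mu + eps)            # positive multiplicative noise
    return h * z                            # perturbed features
```

In the actual method, the noise generator's parameters are meta-learned in the outer loop so that perturbed training examples lead to low query-set error; the `--n_test_mc_samp` flag in the commands below suggests predictions are averaged over multiple Monte Carlo samples of the noise at meta-test time.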
If you are not familiar with setting up a conda environment, please follow the instructions below:
$ conda create --name py35 python=3.5
$ conda activate py35
$ pip install --upgrade pip
$ pip install tensorflow-gpu==1.12.0
$ conda install -c anaconda cudatoolkit=9.0
$ conda install -c anaconda cudnn
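Optionally, you can sanity-check the GPU build before proceeding (this step is not part of the original instructions):

$ python -c "import tensorflow as tf; print(tf.__version__, tf.test.is_gpu_available())"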
To download the datasets:
$ pip install tqdm
$ pip install requests
$ python get_data.py --dataset omniglot
$ python get_data.py --dataset mimgnet
It will take some time to download each of the datasets.
Omniglot 1-shot experiment
# Meta-training / Meta-testing
$ python main.py --gpu_id 0 --savedir './results/metadrop/omni_1shot' --dataset 'omniglot' --mode 'meta_train' --metabatch 4 --n_steps 5 --inner_lr 0.1 --way 20 --shot 1 --query 15 --n_train_iters 60000 --meta_lr 3e-4 --n_test_mc_samp 1
$ python main.py --gpu_id 0 --savedir './results/metadrop/omni_1shot' --dataset 'omniglot' --mode 'meta_test' --metabatch 1 --n_steps 5 --inner_lr 0.1 --way 20 --shot 1 --query 15 --n_train_iters 60000 --meta_lr 3e-4 --n_test_mc_samp 30
Omniglot 5-shot experiment
# Meta-training / Meta-testing
$ python main.py --gpu_id 0 --savedir './results/metadrop/omni_5shot' --dataset 'omniglot' --mode 'meta_train' --metabatch 4 --n_steps 5 --inner_lr 0.4 --way 20 --shot 5 --query 15 --n_train_iters 60000 --meta_lr 1e-3 --n_test_mc_samp 1
$ python main.py --gpu_id 0 --savedir './results/metadrop/omni_5shot' --dataset 'omniglot' --mode 'meta_test' --metabatch 1 --n_steps 5 --inner_lr 0.4 --way 20 --shot 5 --query 15 --n_train_iters 60000 --meta_lr 1e-3 --n_test_mc_samp 30
miniImageNet 1-shot experiment
# Meta-training / Meta-testing
$ python main.py --gpu_id 0 --savedir './results/metadrop/mimgnet_1shot' --dataset 'mimgnet' --mode 'meta_train' --metabatch 4 --inner_lr 0.01 --n_steps 5 --way 5 --shot 1 --query 15 --n_train_iters 60000 --meta_lr 1e-4 --n_test_mc_samp 1
$ python main.py --gpu_id 0 --savedir './results/metadrop/mimgnet_1shot' --dataset 'mimgnet' --mode 'meta_test' --metabatch 1 --inner_lr 0.01 --n_steps 5 --way 5 --shot 1 --query 15 --n_train_iters 60000 --meta_lr 1e-4 --n_test_mc_samp 30
miniImageNet 5-shot experiment
# Meta-training / Meta-testing
$ python main.py --gpu_id 0 --savedir './results/metadrop/mimgnet_5shot' --dataset 'mimgnet' --mode 'meta_train' --metabatch 4 --inner_lr 0.01 --n_steps 5 --way 5 --shot 5 --query 15 --n_train_iters 60000 --meta_lr 1e-4 --n_test_mc_samp 1
$ python main.py --gpu_id 0 --savedir './results/metadrop/mimgnet_5shot' --dataset 'mimgnet' --mode 'meta_test' --metabatch 1 --inner_lr 0.01 --n_steps 5 --way 5 --shot 5 --query 15 --n_train_iters 60000 --meta_lr 1e-4 --n_test_mc_samp 30
Visualization requires the following additional packages:
$ pip install matplotlib scikit-learn
First, export the necessary statistics by changing `--mode` to `export`.
For example,
$ python main.py --gpu_id 0 --savedir './results/metadrop/omni_1shot' --dataset 'omniglot' --mode 'export' --metabatch 1 --n_steps 5 --inner_lr 0.1 --way 20 --shot 1 --query 15 --n_train_iters 60000 --meta_lr 1e-3 --n_test_mc_samp 30
Then, run `plot.py` with the `--savedir` argument.
For example,
$ python plot.py --savedir './results/metadrop/omni_1shot'
This will generate decision boundary plots under the `plot` directory inside `savedir`.
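For reference, here is a minimal sketch of the kind of 2-D feature embedding such a script could produce. It assumes the exported statistics are a `[n_examples, dim]` feature array saved as a NumPy file; the file name `feats.npy` is a hypothetical placeholder, not the repo's actual export format.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Hypothetical input: exported latent features of shape [n_examples, dim].
feats = np.load('feats.npy')

# Project the features to 2-D with t-SNE and scatter-plot them.
emb = TSNE(n_components=2).fit_transform(feats)
plt.scatter(emb[:, 0], emb[:, 1], s=5)
plt.savefig('features_2d.png', dpi=150)
```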
The results in the main paper (average over 1000 episodes, with a single run):

| | Omni. 1shot | Omni. 5shot | mImg. 1shot | mImg. 5shot |
|---|---|---|---|---|
| MAML | 95.23±0.17 | 98.38±0.07 | 49.58±0.65 | 64.55±0.52 |
| Meta-dropout | 96.63±0.13 | 98.73±0.06 | 51.93±0.67 | 67.42±0.52 |
The results from running this repo (average over 1000 episodes, with a single run):

| | Omni. 1shot | Omni. 5shot | mImg. 1shot | mImg. 5shot |
|---|---|---|---|---|
| MAML | 94.49±0.16 | 98.14±0.07 | 48.73±0.64 | 65.70±0.52 |
| Meta-dropout | 96.24±0.14 | 98.81±0.06 | 51.67±0.64 | 68.12±0.53 |
The figures below visualize the learned decision boundaries of MAML and meta-dropout. We can see that the perturbations from meta-dropout generate datapoints close to the decision boundaries of the classification task at test time, which can effectively improve generalization performance.
We also visualize the stochastic features at the lower layers of the convolutional neural network, which gives a rough sense of how each training example is perturbed in the latent feature space.
(Figures: decision boundary and stochastic feature visualizations for Omniglot and miniImageNet.)
Lastly, in the main paper we also performed experiments on adversarial robustness. Meta-dropout appears to improve both clean accuracy and adversarial robustness, and to improve robustness against several types of attacks (L1, L2, and Linf) at the same time.
If you find the provided code useful, please cite our work:
@inproceedings{
lee2020metadrop,
title={Meta Dropout: Learning to Perturb Latent Features for Generalization},
author={Hae Beom Lee and Taewook Nam and Eunho Yang and Sung Ju Hwang},
booktitle={ICLR},
year={2020}
}