AntreasAntoniou / HowToTrainYourMAMLPytorch

The original code for the paper "How to train your MAML" along with a replication of the original "Model Agnostic Meta Learning" (MAML) paper in Pytorch.
https://arxiv.org/abs/1810.09502

How to train your MAML in Pytorch

A replication of the paper "How to train your MAML", along with a replication of the original "Model Agnostic Meta Learning" (MAML) paper.

Introduction

Welcome to the code repository of How to train your MAML. This repository includes code for training both MAML and MAML++ models, as well as data providers and the datasets for both. By using this codebase you agree to the terms and conditions in the LICENSE file. If you choose to use the Mini-Imagenet dataset, you must abide by the terms and conditions in the ImageNet LICENSE.

Installation

The code runs on Pytorch, along with a number of smaller packages. To take care of everything at once, we recommend the conda package manager, specifically miniconda3, as it is lightweight and fast to install. If you have an existing miniconda3 installation, please start at step 3. If you want to install both conda and the required packages, please run:

  1. wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
  2. Go through the installation.
  3. Activate conda
  4. conda create -n meta_learning_pytorch_env python=3.6
  5. conda activate meta_learning_pytorch_env
  6. At this stage you need to choose which version of pytorch you need from the installation page on the official Pytorch website.
  7. Choose and install the pytorch variant of your choice using the conda commands.
  8. Then run bash install.sh

To execute an installation script simply run: bash <installation_file_name>

To activate your conda installations simply run: conda activate

Datasets

We provide the omniglot dataset directly in the datasets folder of this repo. Mini-ImageNet, however, is substantially larger than github's file size limit, so we uploaded it to gdrive using pbzip (parallel zip) compression, which is one of the fastest compression tools available at the time of writing (this really helps when zipping something as big as today's large-scale datasets). We have automated the unzipping and usage of the dataset; all you need to do is download it from our mini_imagenet gdrive folder. Once downloaded, please place it in the datasets folder in this repo. The rest will be done automagically when you run your mini-imagenet experiment.

Note: By downloading and using the mini-imagenet datasets, you accept terms and conditions found in imagenet_license.md

Overview of code:

If weights are passed to a layer or model, it uses those for inference; otherwise it uses its internal
parameters. This makes a model like MAML very easy to build: at the first step, use weights=None, and for any
subsequent step pass the new inner-loop (dynamic) weights to the network.
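
The pattern described above can be illustrated with a small, framework-free sketch. This is not the repository's implementation, which relies on Pytorch layers and autograd. `LinearLayer`, `mse_grads`, and `inner_loop` are hypothetical names, and the gradients are computed by hand for a toy one-dimensional regression. The point is only to show how `weights=None` falls back to the internal parameters while later steps receive the updated weights explicitly.

```python
class LinearLayer:
    """Toy layer computing y = w * x + b, with optional external weights."""

    def __init__(self, w=0.0, b=0.0):
        self.params = {"w": w, "b": b}  # internal (meta-learned) parameters

    def forward(self, x, weights=None):
        # Use externally supplied weights when given, else the internal ones.
        p = self.params if weights is None else weights
        return [p["w"] * xi + p["b"] for xi in x]


def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def mse_grads(preds, x, targets):
    # Analytic gradients of the mean squared error w.r.t. w and b.
    n = len(x)
    return {
        "w": sum(2 * (p - t) * xi for p, t, xi in zip(preds, targets, x)) / n,
        "b": sum(2 * (p - t) for p, t in zip(preds, targets)) / n,
    }


def inner_loop(layer, x, targets, steps=3, lr=0.1):
    weights = None  # first step: the layer falls back to its internal parameters
    for _ in range(steps):
        preds = layer.forward(x, weights)
        grads = mse_grads(preds, x, targets)
        current = layer.params if weights is None else weights
        # Build new task-specific weights without touching layer.params.
        weights = {k: current[k] - lr * grads[k] for k in current}
    return weights


layer = LinearLayer(w=1.0, b=0.0)
x, targets = [1.0, 2.0], [2.0, 4.0]
fast_weights = inner_loop(layer, x, targets)
# Loss with the internal parameters, then with the adapted weights.
print(mse(layer.forward(x), targets), mse(layer.forward(x, fast_weights), targets))
```

Because the adapted weights live outside the layer, the internal parameters stay untouched during adaptation, which is what lets an outer loop update them afterwards.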

- train_maml_system.py: A minimal script that combines the data provider with a meta learning system and sends them
 to the experiment builder to run an experiment. It also extracts the data automatically if they are not already
 available in a folder structure.

# Running an experiment

To run an experiment from the paper on Omniglot:
1. Activate your conda environment ```conda activate meta_learning_pytorch_env```
2. cd experiment_scripts
3. Find which experiment you want to run.
4. ```bash experiment_script.sh gpu_ids_separated_by_spaces```

Note: By downloading and using the mini-imagenet datasets, you accept terms and conditions found in [imagenet_license.md](https://github.com/AntreasAntoniou/HowToTrainYourMAMLPytorch/blob/master/imagenet_license.md) 

To run an experiment from the paper on Mini-Imagenet:
1. Activate your conda environment ```conda activate meta_learning_pytorch_env```
2. Download the mini_imagenet dataset from the [mini_imagenet gdrive folder](https://drive.google.com/file/d/1qQCoGoEJKUCQkk8roncWH7rhPN7aMfBr/view?usp=sharing)
3. Copy the .pbzip file into the datasets folder
4. cd experiment_scripts
5. Find which experiment you want to run.
6. ```bash experiment_script.sh gpu_ids_separated_by_spaces```

To run a custom/new experiment on any dataset:
1. Activate your conda environment ```conda activate meta_learning_pytorch_env```
2. Make sure your data is in datasets/ in a folder structure the data provider can read.
3. cd experiment_template_config
4. Find an experiment close to what you want to do and open its config file.
5. For example, let's take an omniglot experiment on maml++. Change the hyperparameters so that your
experiment takes form. Note that variables written as $<variable>$ are hyperparameters filled in automatically by the
config generation script. If you add new ones, you'll have to change the generate_configs.py file to tell it
what to fill them with.
6.
    ```json
    {
      "batch_size":16,
      "image_height":28,
      "image_width":28,
      "image_channels":1,
      "gpu_to_use":0,
      "num_dataprovider_workers":8,
      "max_models_to_save":5,
      "dataset_name":"omniglot_dataset",
      "dataset_path":"omniglot_dataset",
      "reset_stored_paths":false,
      "experiment_name":"MAML++_Omniglot_$num_classes$_way_$samples_per_class$_shot_$train_update_steps$_filter_multi_step_loss_with_max_pooling_seed_$train_seed$",
      "train_seed": $train_seed$, "val_seed": $val_seed$,
      "train_val_test_split": [0.70918052988, 0.03080714725, 0.2606284658],
      "indexes_of_folders_indicating_class": [-3, -2],
      "sets_are_pre_split": false,

      "total_epochs": 150,
      "total_iter_per_epoch":500, "continue_from_epoch": -2,

      "max_pooling": true,
      "per_step_bn_statistics": true,
      "learnable_batch_norm_momentum": false,

      "dropout_rate_value":0.0,
      "min_learning_rate":0.00001,
      "meta_learning_rate":0.001,   "total_epochs_before_pause": 150,
      "task_learning_rate":-1,
      "init_task_learning_rate":0.4,
      "first_order_to_second_order_epoch":80,

      "norm_layer":"batch_norm",
      "cnn_num_filters":64,
      "num_stages":4,
      "number_of_training_steps_per_iter":$train_update_steps$,
      "number_of_evaluation_steps_per_iter":$val_update_steps$,
      "cnn_blocks_per_stage":1,
      "num_classes_per_set":$num_classes$,
      "num_samples_per_class":$samples_per_class$,
      "num_target_samples": $target_samples_per_class$,

      "second_order": true,
      "use_multi_step_loss_optimization":true,
      "use_gdrive":false
    }
    ```
7. cd script_generation_tools
8. python generate_configs.py; python generate_scripts.py
9. Your new scripts can be found in the experiment_scripts folder, ready to be run.
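
To make the `$<variable>$` placeholders from step 5 more concrete, here is a minimal sketch of how such a template could be filled and checked. It is not the code in generate_configs.py. `fill_template` is a hypothetical helper, the template is a shortened version of the config above, and the values (5-way, 1-shot, seed 0) are chosen only for illustration.

```python
import json
import re

# Matches placeholders of the form $name$.
PLACEHOLDER = re.compile(r"\$(\w+)\$")


def fill_template(template, values):
    """Replace every $name$ placeholder with values[name] (hypothetical helper)."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"No value supplied for placeholder ${name}$")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


# Shortened template; the real config templates contain many more keys.
template = """{
  "experiment_name": "MAML++_Omniglot_$num_classes$_way_$samples_per_class$_shot_seed_$train_seed$",
  "train_seed": $train_seed$,
  "num_classes_per_set": $num_classes$,
  "num_samples_per_class": $samples_per_class$
}"""

values = {"num_classes": 5, "samples_per_class": 1, "train_seed": 0}
filled = fill_template(template, values)

# Parsing the result confirms the filled template is valid JSON.
config = json.loads(filled)
print(config["experiment_name"])
```

Raising an error on an unknown placeholder mirrors the note in step 5: a newly added variable needs a corresponding value in the generation script, otherwise the resulting config would not be valid JSON.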

Acknowledgments

Thanks to the University of Edinburgh and EPSRC research council for funding this research.