
HyperTorch

A lightweight, flexible, research-oriented package for computing hypergradients in PyTorch.

What is a hypergradient?

Given the following bi-level problem

    min_λ f(λ) = E(w(λ), λ),   where w(λ) solves the fixed-point equation w = Φ(w, λ),

we call hypergradient the gradient of f, which under suitable regularity assumptions is

    ∇f(λ) = ∇₂E(w(λ), λ) + ∂₂Φ(w(λ), λ)ᵀ (I − ∂₁Φ(w(λ), λ)ᵀ)⁻¹ ∇₁E(w(λ), λ)

where:

- E(w, λ) is the outer objective (e.g. the validation loss);
- Φ(w, λ) is a smooth map whose fixed points define the solutions of the inner problem (e.g. one gradient descent step on the training loss, or the state update of a recurrent model);
- ∇ᵢ and ∂ᵢ denote the gradient and the Jacobian with respect to the i-th argument, respectively.
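
For intuition, this formula follows from the implicit function theorem; the short derivation below is standard and not specific to this package. Differentiating the fixed-point equation w(λ) = Φ(w(λ), λ) with respect to λ gives, provided I − ∂₁Φ is invertible,

    w′(λ) = ∂₁Φ w′(λ) + ∂₂Φ   ⟹   w′(λ) = (I − ∂₁Φ)⁻¹ ∂₂Φ,

and the chain rule applied to f(λ) = E(w(λ), λ) then yields

    ∇f(λ) = ∇₂E + w′(λ)ᵀ ∇₁E = ∇₂E + ∂₂Φᵀ (I − ∂₁Φᵀ)⁻¹ ∇₁E,

with all derivatives evaluated at (w(λ), λ).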

Quickstart

hyperparameter optimization

See this notebook, where we show how to compute the hypergradient to optimize the regularization parameters of a simple logistic regression model.
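
As a rough sketch of how this looks in code (a minimal toy example, not the notebook: the data, step size, and the exact signature of hypergrad.fixed_point below are our assumptions; see the notebook for tested code):

import torch
import torch.nn.functional as F
import hypergrad as hg

# Toy data; lam holds per-feature log regularization strengths (the outer variables).
X_tr, y_tr = torch.randn(100, 5), torch.randint(0, 2, (100,)).float()
X_val, y_val = torch.randn(50, 5), torch.randint(0, 2, (50,)).float()
lam = torch.zeros(5, requires_grad=True)

def train_loss(params, hparams):
    w, lam = params[0], hparams[0]
    reg = 0.5 * (torch.exp(lam) * w * w).sum()   # per-feature L2 penalty
    return F.binary_cross_entropy_with_logits(X_tr @ w, y_tr) + reg

def gd_step(params, hparams):
    # The fixed point map Phi: one differentiable gradient descent step.
    grads = torch.autograd.grad(train_loss(params, hparams), params, create_graph=True)
    return [p - 0.1 * g for p, g in zip(params, grads)]

def val_loss(params, hparams):
    return F.binary_cross_entropy_with_logits(X_val @ params[0], y_val)

# Approximately solve the inner problem, then compute the hypergradient in lam.grad.
params = [torch.zeros(5, requires_grad=True)]
for _ in range(500):
    params = [p.detach().requires_grad_(True) for p in gd_step(params, [lam])]
hg.fixed_point(params, [lam], K=20, fp_map=gd_step, outer_loss=val_loss)
print(lam.grad)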

meta-learning

examples/iMAML.py shows an implementation of the method described in the paper Meta-learning with implicit gradients. The code uses higher to obtain stateless versions of torch nn.Modules and torchmeta for meta-dataset loading and minibatching.
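
For reference, a minimal sketch of the higher pattern used there (our toy example, not iMAML itself; the meta-learning loop and the conjugate gradient solve are omitted):

import torch
import higher

# higher gives a functional copy of the model whose fast weights stay differentiable.
model = torch.nn.Linear(5, 1)
inner_opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(10, 5), torch.randn(10, 1)

with higher.innerloop_ctx(model, inner_opt, copy_initial_weights=False) as (fmodel, diffopt):
    for _ in range(5):                          # differentiable inner loop
        inner_loss = ((fmodel(x) - y) ** 2).mean()
        diffopt.step(inner_loss)
    outer_loss = ((fmodel(x) - y) ** 2).mean()
    outer_loss.backward()                       # reaches the initial weights of model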

equilibrium-models

This notebook shows how to train a simple equilibrium network with "RNN-style" dynamics.
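
For reference, a toy example of such dynamics (our illustration, not the notebook's model):

import torch

# The equilibrium state w solves w = tanh(W @ w + U @ x), a fixed point of Phi;
# W and U play the role of the outer variables.
d, n = 8, 4
W = 0.5 * torch.randn(d, d) / d ** 0.5   # small spectral norm helps convergence
U = torch.randn(d, n)
x = torch.randn(n)

w = torch.zeros(d)
for _ in range(50):                      # inner problem: fixed-point iteration
    w = torch.tanh(W @ w + U @ x)
# w is the (approximate) equilibrium; the hypergradient methods below can then
# differentiate an outer loss on w with respect to W and U.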

MORE EXAMPLES COMING SOON

Use cases

Hypergradients are useful to perform:

- gradient-based hyperparameter optimization;
- meta-learning;
- training models with implicitly defined components, such as equilibrium networks.

Install

Requires Python 3 and PyTorch >= 1.4.

git clone git@github.com:prolearner/hypertorch.git
cd hypertorch
pip install -e .

python setup.py install would also work.

Implemented methods

The main methods for computing hypergradients are in the module hypergrad/hypergradients.py.

All methods require as input:

- a list of tensors representing the inner variables (e.g. the model weights);
- another list of tensors representing the outer variables (e.g. the hyperparameters);
- a callable differentiable outer objective;
- a callable differentiable update mapping for the inner problem (except reverse_unroll), e.g. one step of gradient descent.

Iterative differentiation methods:

These methods differentiate through the update dynamics used to solve the inner problem. This makes it possible to also optimize parameters of the inner solver, such as the learning rate and momentum.

Methods in this class are (a plain-PyTorch sketch of the idea follows the list):

- reverse_unroll: backpropagates through the entire unrolled computational graph of the inner dynamics (essentially an interface to standard backpropagation through the inner loop);
- reverse: computes the same hypergradient with a lower memory footprint, given the list of past inner iterates and the update mappings applied at each step.
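
A minimal sketch of the unrolling idea on a toy quadratic (our example, independent of this package's API):

import torch

# The inner learning rate lr is the outer variable being optimized.
lr = torch.tensor(0.1, requires_grad=True)
w0 = torch.zeros(3, requires_grad=True)
A, b = torch.randn(3, 3), torch.randn(3)

params = w0
for _ in range(10):  # unrolled inner dynamics, kept on the autograd graph
    inner_loss = ((A @ params - b) ** 2).mean()
    g, = torch.autograd.grad(inner_loss, params, create_graph=True)
    params = params - lr * g

outer_loss = (params ** 2).sum()  # any differentiable outer objective
outer_loss.backward()             # backpropagates through all ten updates
print(lr.grad)                    # d outer_loss / d lr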

Approximate Implicit Differentiation methods:

These methods approximate the hypergradient equation directly by:

- replacing the exact fixed point w(λ) with an approximate solution of the inner problem;
- computing an approximate solution v of the linear system (I − ∂₁Φᵀ)v = ∇₁E appearing in the hypergradient formula.

Methods in this class are (a toy sketch of the fixed-point scheme follows the list):

- fixed_point: approximates v with fixed-point iterations on the linear system;
- CG: approximates v with the conjugate gradient method (the Jacobian ∂₁Φ should be symmetric);
- CG_normaleq: conjugate gradient on the normal equations, usable when the Jacobian is not symmetric.
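
A minimal plain-PyTorch sketch of the fixed-point scheme on a toy problem (our example; see hypergrad/hypergradients.py for the actual implementations):

import torch

# Toy inner problem: w(lam) minimizes 0.5*||w - c||^2 + 0.5*lam*||w||^2,
# with dynamics Phi(w, lam) = w - eta*((w - c) + lam*w); fixed point c/(1 + lam).
c = torch.tensor([1.0, -2.0])
lam = torch.tensor(0.5, requires_grad=True)
eta = 0.1

def phi(w, lam):
    return w - eta * ((w - c) + lam * w)

# 1) Approximately solve the inner problem (no gradient tracking needed here).
w = torch.zeros(2)
for _ in range(200):
    w = phi(w, lam.detach())
w = w.detach().requires_grad_(True)

# 2) Fixed-point iterations on the linear system  v = d1Phi^T v + grad_w E,
#    with outer objective E(w) = 0.5*||w||^2.
E = 0.5 * (w ** 2).sum()
grad_w_E, = torch.autograd.grad(E, w)
w_next = phi(w, lam)
v = torch.zeros_like(w)
for _ in range(50):
    vjp, = torch.autograd.grad(w_next, w, grad_outputs=v, retain_graph=True)
    v = vjp + grad_w_E

# 3) Hypergradient: grad_lam E + d2Phi^T v  (grad_lam E = 0 for this E).
hypergrad, = torch.autograd.grad(w_next, lam, grad_outputs=v)
print(hypergrad)  # analytic value: -||c||^2/(1 + lam)^3 ≈ -1.4815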

If you use this code, please cite our paper:

@inproceedings{grazzi2020iteration,
  title={On the Iteration Complexity of Hypergradient Computation},
  author={Grazzi, Riccardo and Franceschi, Luca and Pontil, Massimiliano and Salzo, Saverio},
  booktitle={Thirty-seventh International Conference on Machine Learning (ICML)},
  year={2020}
}