Newton and Quasi-Newton optimization with PyTorch
https://pytorch-minimize.readthedocs.io
MIT License

PyTorch Minimize

For the most up-to-date information on pytorch-minimize, see the docs site: pytorch-minimize.readthedocs.io

Pytorch-minimize is a collection of utilities for minimizing multivariate functions in PyTorch. It is heavily inspired by SciPy's optimize module and MATLAB's Optimization Toolbox. Unlike SciPy and MATLAB, which require the user to supply derivatives or fall back on numerical approximations, pytorch-minimize uses exact first- and second-order derivatives, computed seamlessly behind the scenes with autograd. Both CPU and CUDA are supported.

Author: Reuben Feinman

At a glance:

import torch
from torchmin import minimize

def rosen(x):
    return torch.sum(100*(x[..., 1:] - x[..., :-1]**2)**2 
                     + (1 - x[..., :-1])**2)

# initial point
x0 = torch.tensor([1., 8.])

# Select from the following methods:
#  ['bfgs', 'l-bfgs', 'cg', 'newton-cg', 'newton-exact', 
#   'trust-ncg', 'trust-krylov', 'trust-exact', 'dogleg']

# BFGS
result = minimize(rosen, x0, method='bfgs')

# Newton Conjugate Gradient
result = minimize(rosen, x0, method='newton-cg')

# Newton Exact
result = minimize(rosen, x0, method='newton-exact')

Solvers: BFGS, L-BFGS, Conjugate Gradient (CG), Newton Conjugate Gradient (NCG), Newton Exact, Dogleg, Trust-Region Exact, Trust-Region NCG, Trust-Region GLTR (Krylov)

Examples: See the Rosenbrock minimization notebook for a demonstration of function minimization with a handful of different algorithms.

Install with pip:

pip install pytorch-minimize

Install from source:

git clone https://github.com/rfeinman/pytorch-minimize.git
cd pytorch-minimize
pip install -e .

Motivation

Although PyTorch offers many routines for stochastic optimization, utilities for deterministic optimization are scarce; only L-BFGS is included in the optim package, and it's modified for mini-batch training.

MATLAB and SciPy are industry standards for deterministic optimization. These libraries have a comprehensive set of routines; however, automatic differentiation is not supported.* Therefore, the user must provide explicit 1st- and 2nd-order gradients (if they are known) or use finite-difference approximations.

The motivation for pytorch-minimize is to offer a set of tools for deterministic optimization with automatic gradients and GPU acceleration.

__

*MATLAB offers minimal autograd support via the Deep Learning Toolbox, but the integration is not seamless: data must be converted to "dlarray" structures, and only a subset of functions are supported. Furthermore, derivatives must still be constructed and provided as function handles. Pytorch-minimize uses autograd to compute derivatives behind the scenes, so all you provide is an objective function.

Library

The pytorch-minimize library includes solvers for general-purpose function minimization (unconstrained & constrained), as well as for nonlinear least squares problems.

1. Unconstrained Minimizers

The following solvers are available for unconstrained minimization: BFGS, L-BFGS, Conjugate Gradient (CG), Newton Conjugate Gradient (NCG), Newton Exact, Dogleg, Trust-Region Exact, Trust-Region NCG, and Trust-Region GLTR (Krylov).

To access the unconstrained minimizer interface, use the following import statement:

from torchmin import minimize

Use the method argument to specify which of the aforementioned solvers should be applied.
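
For example (a minimal sketch; the x and fun fields of the returned result object are assumed to follow a SciPy-style result format, so check the docs site for the exact return type):

import torch
from torchmin import minimize

def f(x):
    # simple convex quadratic with minimum at x = 3
    return torch.sum((x - 3.)**2)

x0 = torch.zeros(5)

# solve with L-BFGS; any of the methods listed above may be substituted
result = minimize(f, x0, method='l-bfgs')

print(result.x)    # approximately [3., 3., 3., 3., 3.]
print(result.fun)  # approximately 0.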

2. Constrained Minimizers

A trust-region constrained algorithm is currently available for constrained minimization (see the adversarial examples tutorial below for a demonstration).

To access the constrained minimizer interface, use the following import statement:

from torchmin import minimize_constr
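
As a rough sketch of how a constrained problem might be set up (the constr dictionary with 'fun' and 'ub' entries shown here is an assumption modeled on the library's SciPy-like design; consult the docs site for the exact keyword names):

import torch
from torchmin import minimize_constr

def rosen(x):
    return torch.sum(100*(x[..., 1:] - x[..., :-1]**2)**2
                     + (1 - x[..., :-1])**2)

x0 = torch.tensor([1., 8.])

# minimize the Rosenbrock function subject to sum(x**2) <= 1
# (the 'constr' dict layout below is assumed, not authoritative)
result = minimize_constr(
    rosen, x0,
    constr=dict(fun=lambda x: x.square().sum(), ub=1.),
    max_iter=100,
)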

3. Nonlinear Least Squares

The library also includes specialized solvers for nonlinear least squares problems. These solvers revolve around the Gauss-Newton method, a modification of Newton's method tailored to the least squares setting. The least squares interface can be imported as follows:

from torchmin import least_squares

The least_squares function is modeled closely on SciPy's optimize.least_squares. Much of the SciPy code was borrowed directly (all rights reserved) and ported from NumPy to PyTorch. Rather than requiring the user to provide a Jacobian function, the new interface computes Jacobian-vector products behind the scenes with autograd. At the moment, only the Trust Region Reflective ("trf") method is implemented, and bounds are not yet supported.
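
For illustration, here is a small nonlinear least squares fit written as a sketch (it assumes an interface analogous to scipy.optimize.least_squares, where the objective returns a vector of residuals; see the docs site for the exact signature):

import torch
from torchmin import least_squares

# the 2-D Rosenbrock problem written in residual (least squares) form
def residuals(x):
    return torch.stack([10. * (x[1] - x[0]**2), 1. - x[0]])

x0 = torch.tensor([2., 2.])

# only the Trust Region Reflective ("trf") method is currently implemented
result = least_squares(residuals, x0, method='trf')
print(result.x)  # should approach [1., 1.]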

Examples

The Rosenbrock minimization tutorial demonstrates how to use pytorch-minimize to find the minimum of a scalar-valued function of multiple variables using various optimization strategies.

In addition, the SciPy benchmark provides a comparison of pytorch-minimize solvers to their analogous solvers from the scipy.optimize library. For those transitioning from SciPy, this script will help you get a feel for the design of the current library. Unlike SciPy, pytorch-minimize solvers do not require Jacobian or Hessian functions to be provided, and numerical approximations are never used.

For constrained optimization, the adversarial examples tutorial demonstrates how to use the trust-region constrained routine to generate an optimal adversarial perturbation given a constraint on the perturbation norm.

Optimizer API

As an alternative to the functional API, pytorch-minimize also includes an "optimizer" API based on the torch.optim.Optimizer class. To access the optimizer class, import as follows:

from torchmin import Minimizer
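
As a rough sketch, the Minimizer class can be used like any full-batch torch.optim optimizer that takes a closure (similar in spirit to torch.optim.LBFGS). The keyword arguments and the closure convention shown below (returning the loss without an explicit backward call, on the assumption that derivatives are computed internally with autograd) should be checked against the docs site:

import torch
import torch.nn as nn
from torchmin import Minimizer

# toy regression data
X = torch.randn(100, 3)
y = X @ torch.tensor([1., -2., 3.]) + 0.1 * torch.randn(100)

model = nn.Linear(3, 1)
optimizer = Minimizer(model.parameters(), method='l-bfgs')

def closure():
    optimizer.zero_grad()
    # mean squared error loss on the full batch
    return torch.mean((model(X).squeeze(-1) - y)**2)

optimizer.step(closure)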

Citing this work

If you use pytorch-minimize for academic research, you may cite the library as follows:

@misc{Feinman2021,
  author = {Feinman, Reuben},
  title = {Pytorch-minimize: a library for numerical optimization with autograd},
  publisher = {GitHub},
  year = {2021},
  url = {https://github.com/rfeinman/pytorch-minimize},
}