matthewfeickert opened this issue 4 years ago
While working on this, something to be on the lookout for is making sure that the problems with TensorFlow Hessians that @dantrim first reported in Issue #332 are no longer present in the v2.X series of TensorFlow.
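As a quick way to verify Hessian computations in the TensorFlow v2.X API, a minimal sketch using nested `tf.GradientTape`s on a toy quadratic (this is an illustrative standalone check, not pyhf code):

```python
import numpy as np
import tensorflow as tf

# Hessian via nested GradientTapes: the inner tape records the gradient,
# the outer tape differentiates the gradient to get second derivatives
x = tf.Variable([1.0, 2.0])
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = tf.reduce_sum(x**2)  # Hessian of sum(x_i^2) is 2 * I
    grad = inner.gradient(y, x)
hess_matrix = outer.jacobian(grad, x)
print(hess_matrix.numpy())  # should be close to 2 * identity
```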
Related: https://gist.github.com/jgomezdans/3144636
```python
#!/usr/bin/env python
"""
Some Hessian codes
"""
import numpy as np
from scipy.optimize import approx_fprime


def hessian(x0, calculate_cost_function, epsilon=1.0e-5, linear_approx=False, *args):
    """
    A numerical approximation to the Hessian matrix of the cost function at
    location x0 (hopefully, the minimum)
    """
    # Approximate the gradient (first derivative) at x0
    f1 = approx_fprime(x0, calculate_cost_function, epsilon, *args)
    # Gauss-Newton style approximation H ~ grad^T grad. Obviously much more
    # efficient, but only appropriate if the cost function is linear
    if linear_approx:
        f1 = np.atleast_2d(f1)
        return f1.T @ f1
    # Allocate space for the Hessian
    n = x0.shape[0]
    hessian = np.zeros((n, n))
    # The next loop fills in the matrix column by column
    xx = x0.copy()
    for j in range(n):
        xx0 = xx[j]  # Store old value
        xx[j] = xx0 + epsilon  # Perturb with finite difference
        # Recalculate the gradient at this perturbed point
        f2 = approx_fprime(xx, calculate_cost_function, epsilon, *args)
        hessian[:, j] = (f2 - f1) / epsilon  # Forward-difference scaling
        xx[j] = xx0  # Restore initial value of x0
    return hessian
```
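As a sanity check of this finite-difference approach, here is a self-contained version applied to a quadratic cost with a known analytic Hessian:

```python
import numpy as np
from scipy.optimize import approx_fprime

# Quadratic cost f(x) = x^T A x with symmetric A, so the Hessian is 2 * A
A = np.array([[2.0, 0.5], [0.5, 1.0]])


def cost(x):
    return x @ A @ x


def numerical_hessian(f, x0, epsilon=1.0e-5):
    """Forward-difference Hessian built from gradient evaluations."""
    n = x0.shape[0]
    f1 = approx_fprime(x0, f, epsilon)
    H = np.zeros((n, n))
    xx = x0.copy()
    for j in range(n):
        xx[j] += epsilon  # perturb one coordinate
        H[:, j] = (approx_fprime(xx, f, epsilon) - f1) / epsilon
        xx[j] -= epsilon  # restore
    return H


H = numerical_hessian(cost, np.array([0.3, -0.7]))
print(np.round(H, 3))  # should be close to 2 * A
```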
Description
In the backends with autodiff, the `AutoDiffOptimizerMixin` class that all of them use takes the `func` (reminder that `func` returns both the objective function and the gradient: https://github.com/scikit-hep/pyhf/blob/71d0c148a295e42f86300b7fe317aea56187ee34/src/pyhf/optimize/opt_pytorch.py#L50) from each backend's implementation of `setup_minimize`

https://github.com/scikit-hep/pyhf/blob/71d0c148a295e42f86300b7fe317aea56187ee34/src/pyhf/optimize/autodiff.py#L30-L32

and then gives it to `scipy.optimize.minimize`

https://github.com/scikit-hep/pyhf/blob/71d0c148a295e42f86300b7fe317aea56187ee34/src/pyhf/optimize/autodiff.py#L33-L35
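For illustration, a minimal sketch of this objective-and-gradient pattern with `scipy.optimize.minimize` (the toy `func` below is hypothetical, standing in for the one built by `setup_minimize`):

```python
import numpy as np
from scipy.optimize import minimize


def func(x):
    # Returns both the objective value and its gradient in one call,
    # matching scipy's jac=True convention
    objective = np.sum((x - 3.0) ** 2)
    gradient = 2.0 * (x - 3.0)
    return objective, gradient


# jac=True tells scipy that func returns (objective, gradient)
result = minimize(func, x0=np.zeros(2), jac=True, method="SLSQP")
print(result.x)  # should converge to [3, 3]
```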
This minimization doesn't take the Hessian into account at the moment. However, as @lukasheinrich notes in Issue #764, we can add Hessian information to the optimization by providing a callable `hessian`. This is probably a good idea, but will require some work to make sure that there is agreement with MINUIT.
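As a sketch of what providing a Hessian callable could look like, using scipy's `hess` argument with a Hessian-aware method (the toy objective here is hypothetical, not pyhf's likelihood):

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Quadratic with a cross term: minimum at x = y = 2/3
    return np.sum((x - 1.0) ** 2) + x[0] * x[1]


def gradient(x):
    g = 2.0 * (x - 1.0)
    g[0] += x[1]
    g[1] += x[0]
    return g


def hess(x):
    # Constant Hessian of the quadratic objective above
    return np.array([[2.0, 1.0], [1.0, 2.0]])


# Methods like trust-constr and Newton-CG can exploit the hess callable
result = minimize(
    objective, x0=np.zeros(2), jac=gradient, hess=hess, method="trust-constr"
)
print(result.x)  # should converge to [2/3, 2/3]
```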
This was prompted by a discussion with @dguest (as many interesting ideas are).