admb-project / admb

AD Model Builder
http://admb-project.org
Other
64 stars 19 forks source link

Add new feature to take Newton steps with Hessian information (-hess_step) #169

Closed Cole-Monnahan-NOAA closed 3 years ago

Cole-Monnahan-NOAA commented 3 years ago

Per the discussion at the issue https://github.com/admb-project/admb/issues/160 I have added this capability, documented it (function hess_step() in hmc_functions.cpp), and added a test (tests/hess_step). Below is the documentation from the function.

Experimental feature to take Newton steps using the inverse Hessian to get much closer to the optimum and reduce the maximum gradient arbitrarily close to 0.

Let x be the current MLE parameter vector. Then a single step consists of calculating x'=x-inv(Hessian)*gradient

This calculation is done in the unbounded parameter space.

This feature is initiated by calling "-hess_step N -hess_step_tol eps" to specify the maximum number of steps (N) and a minimum threshold for the maximum gradient (eps), below which is deemed sufficient and causes the function to exit early with success. The defaults are N=1 and eps=1e-12. The function will also exit early if the gradients get worse as a result of a step, printing information about which parameters. If successful, the new MLE is deemed improved and is propagated through the model to updated all output files.

The upside of this feature is it confirms that the geometry near the mode is quadratic and well represented by the Hessian. It may also slightly improve the MLE and uncertainty estimates. The downside is that the Hessian needs to be recalculated and inverted at each step so it is costly computationally.

Typical usage is to optimize model, then use this feature if convergence is suspect.

codecov[bot] commented 3 years ago

Codecov Report

Merging #169 (8888fe3) into master (d87b7d1) will increase coverage by 0.14%. The diff coverage is 82.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #169      +/-   ##
==========================================
+ Coverage   47.99%   48.13%   +0.14%     
==========================================
  Files         729      729              
  Lines       61608    61702      +94     
==========================================
+ Hits        29571    29703     +132     
+ Misses      32037    31999      -38     
Impacted Files Coverage Δ
src/nh99/admodel.h 0.00% <ø> (ø)
src/nh99/hmc_functions.cpp 85.68% <78.72%> (+0.51%) :arrow_up:
src/nh99/model7.cpp 81.74% <100.00%> (-2.68%) :arrow_down:
src/nh99/modspmin.cpp 63.38% <100.00%> (+2.61%) :arrow_up:
src/linad99/cbetacf.cpp 66.66% <0.00%> (-33.34%) :arrow_down:
src/nh99/model41.cpp 69.23% <0.00%> (-30.77%) :arrow_down:
src/linad99/d5arr3.cpp 71.42% <0.00%> (-28.58%) :arrow_down:
src/linad99/f4arr3.cpp 71.42% <0.00%> (-28.58%) :arrow_down:
src/linad99/f6arr2.cpp 71.42% <0.00%> (-28.58%) :arrow_down:
src/linad99/f7arr2.cpp 71.42% <0.00%> (-28.58%) :arrow_down:
... and 278 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d87b7d1...8888fe3. Read the comment docs.

Cole-Monnahan-NOAA commented 3 years ago

@johnoel it looks like this passed so that's good. Do you see any issues?

One thing I'd like to update is on this line:

https://github.com/Cole-Monnahan-NOAA/admb/blob/add_hess_step/src/nh99/hmc_functions.cpp#L164

It loops over the active parameters and if the gradient is worse it prints the parameter number and the gradients. I think it would be better to print the parameter names. I.e., instead of cout << jj it would be cout << parm_labels[jj]. This needs to work with vectors so console output would be something like 'a[2]' instead of just 'a' if jj happens to be the second element of 'a'.

Could you advise how to get the parameter labels in this context?