kaskr / adcomp

AD computation with Template Model Builder (TMB)
Other
175 stars 80 forks source link

Optim vs Nlimb #73

Closed MonHansen closed 4 years ago

MonHansen commented 9 years ago

Is there a difference between Optim and Nlminb? I have noticed that Nlminb sometimes can optimise when Optim can't. If this happens are the results of Nlminb trustworthy?

bbolker commented 9 years ago

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 15-01-08 05:35 AM, MonHansen wrote:

Is there a difference between Optim and Nlminb? I have noticed that Nlminb sometimes can optimise when Optim can't. If this happens are the results of Nlminb trustworthy?

--- Reply to this email directly or view it on GitHub: https://github.com/kaskr/adcomp/issues/73

General-purpose continuous optimization is just plain difficult, even when one has a great tool like TMB/adcomp. You can dig into the documentation of each function, or the code, to see exactly what algorithms they implement ...

I would certainly check my results even more carefully to see that they made sense if one optimizer failed, but this does happen very frequently for difficult models. Figuring out exactly why can be very difficult. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJUro7vAAoJEOCV5YRblxUHudsH/3CdoOB0rePQ7kZfcrj+WSTh 4wxVd2vZ049V8/OnLUa06+vb3P5Cz4PlMTnIfTDm7ntlkM3rOfJ6ldvgJ904SQ+l YvvThO46I+vvaIbU897s4JxFHhdAV5ZtND9xkLV/QNfVZVklAqNNOQVw5qxjrN/4 /VMLLOVgQ+fQfn90alhId25T+9x8nNcCMXsAL9Rqe0VEM+dXWJDQvt6NdES1QxUW +bfbQYHA8I+8TA7YbAr//UaJEppf5eEXMJ79oR76VgZH3Vplp8XrJjRNnHpfi9Lo p4/wBpLXEkQ484OYLUn5WvEBZLO8ZmfHNfjOLQvWvRSLODj49wv/deSMWjZjTxg= =A3nW -----END PGP SIGNATURE-----

John-R-Wallace-NOAA commented 9 years ago

Optimization wrapper package, optimx, and the trust optimization function

The optimx package is a wrapper function for a large number of optimization functions, read the package help for more information.

Running the TMB's 'simple' example with all.methods=T gives:

require(optimx)
optimx(obj$par, function(x) as.numeric(obj$fn(x)), obj$gr, control = list(all.methods=T))

Not improving much - will try early exit...PD hess?: FALSE

beta beta.1 logsdu logsd0 value fevals gevals niter convcode kkt1 kkt2 xtimes
BFGS 53.2331627 30.753459 2.888532 2.036139 9.120328e+02 115 42 NA 0 TRUE TRUE 1.10
CG 3.9827125 1.288637 3.550401 2.030498 9.825627e+02 302 101 NA 1 FALSE FALSE 2.06
Nelder-Mead 0.7502629 5.166633 3.603549 2.039200 9.902448e+02 183 NA NA 0 FALSE FALSE 0.84
L-BFGS-B 53.2303152 30.751314 2.888707 2.036301 9.120327e+02 44 44 NA 0 TRUE TRUE 0.43
nlm 53.2308187 30.751029 2.888713 2.036309 9.120327e+02 NA NA 50 0 TRUE TRUE 1.75
nlminb 53.2308968 30.751055 2.888714 2.036310 9.120327e+02 31 27 26 0 TRUE TRUE 0.30
spg 53.2297362 30.749904 2.888714 2.036309 9.120327e+02 1514 NA 1201 0 TRUE TRUE 14.89
ucminf 53.2308941 30.751064 2.888714 2.036309 9.120327e+02 30 30 NA 0 TRUE TRUE 0.31
Rcgmin NA NA NA NA 8.988466e+307 NA NA NA 9999 NA NA 0.26
Rvmmin 53.2308922 30.751043 2.888714 2.036309 9.120327e+02 98 42 NA 0 TRUE TRUE 1.58
newuoa 53.2309129 30.751054 2.888714 2.036309 9.120327e+02 1250 NA NA 0 TRUE TRUE 5.99
bobyqa 53.2308987 30.751106 2.888714 2.036309 9.120327e+02 1079 NA NA 0 TRUE TRUE 5.62
nmkb 53.2331585 30.754578 2.888727 2.036300 9.120327e+02 316 NA NA 0 TRUE TRUE 1.47
hjkb 0.1000000 0.100000 0.100000 0.100000 3.909499e+04 1 NA 0 9999 NA NA 0.00

There were 50 or more warnings (use warnings() to see the first 50)

The 'trust' package has a single function 'trust' that "carries out a minimization or maximization of a function using a trust region algorithm".

The trust function requires a hessian, which is not yet provided by TMB for models that contain a random component.

However this can be overcome for comparison purposes to nlimb() by using optimHess() to provide the Hessian:

obj$he <- function(x=obj$env$last.par.best[-obj$env$random], disable.tracing = TRUE, ...) { 

         if(disable.tracing) {
            tracemgc.old <- obj$env$tracemgc; obj$env$tracemgc <- FALSE; on.exit(obj$env$tracemgc <- tracemgc.old)
            inner.trace.old <- obj$env$inner.control$trace; obj$env$inner.control$trace <- FALSE; on.exit(obj$env$inner.control$trace <- inner.trace.old, add = TRUE)
            silent.old <- obj$env$silent; obj$env$silent <- TRUE; on.exit(obj$env$silent <- silent.old, add = TRUE)
         }

         optimHess(x, obj$fn, obj$gr, ...) 
      }

obj$he()
beta beta logsdu logsd0
beta 0.117005092 -0.040568074 -0.009937551
beta -0.040568074 0.116896783 -0.009645631
logsdu -0.009937551 -0.009645631 190.974821042
logsd0 0.009937537 0.009645610 16.527890613

Comparison with all tracing turned off:

obj$env$tracemgc <- FALSE obj$env$inner.control$trace <- FALSE obj$env$silent <- TRUE

system.time(optim(obj$par, obj$fn, obj$gr, control = list(parscale=obj$par*0+1e-1,trace=0), hessian = F)) #optim trace set to zero also
user system elapsed
0.63 0.04 0.68
system.time(opt <- nlminb(obj$par, obj$fn, obj$gr))
user system elapsed
0.31 0.00 0.32
system.time(opt <- nlminb(obj$par, obj$fn, obj$gr, obj$he))
user system elapsed
0.86 0.02 0.87
require(trust)
system.time(opt <- trust(function(x) { list(value = obj$fn(x), gradient = obj$gr(x), hessian = obj$he(x)) }, obj$par, 1, 10))
user system elapsed
0.86 0.00 0.86

Note that in other examples I tried the gradient argument for trust() needed to be changed to a vector:

system.time(opt <- trust(function(x) { list(value = obj$fn(x), gradient = as.vector(obj$gr(x)), hessian = obj$he(x)) }, obj$par, 1, 10))

Numerical Optimization in R: Beyond optim: http://www.jstatsoft.org/v60/i01/paper

There is additional comparison information between optim and nlminb here: http://stats.stackexchange.com/questions/9535/when-should-i-not-use-rs-nlm-function-for-mle