Closed nickresnick closed 5 years ago
I don't think we ever worked out all the kinks in the L-BFGS-B implementation :( failing time to actually fix it, i'd recommend just throwing out the L-BFGS-B state and starting from the most recent iterate. I know that's not exactly satisfying
A possible cause is that the Wolfe line search implemented is perhaps not the most robust w.r.t finite precision. That's essentially what the algorithm's authors, Wright and Nocedal imply on p. 61 of their "Numerical Optimization" (1999). They do recommend the More and Thuente line search, which is suitable for L-BFGS-B. I don't know of a Scala or Java implementation, but there is one for julia https://github.com/JuliaNLSolvers/LineSearches.jl/blob/master/src/morethuente.jl and Matlab http://www.cs.umd.edu/~oleary/software/ (cvsrch.m an cstep.m). There's also a Hager Zhang line search, which is very robust and efficient, but afaik it only works for unconstrained optimisation.
This isn't easy to reproduce, but I'm running L-BFGS-B and at this step in CubicLineSearch#interp
d2
becomes NaN becausel.dd * r.dd
>d1 * d1
. This only happens when fitting on certain datasets. Could you explain what this means in the context of the cubic interpolation? Any suggestions on how to safeguard against this?