L-BFGS line search instability ?

PatWie / CppNumericalSolvers

a lightweight header-only C++17 library of numerical optimization methods for nonlinear functions based on Eigen

MIT License

872 stars 201 forks source link

L-BFGS line search instability ? #54

Closed SebDyn closed 7 years ago

SebDyn commented 8 years ago

Hi!

I used the Fortran-code of LBFGSB by Nocedal et al frequently in the past and tried out your code.

My question is if it is possible to implement the version of Nocedal with strong Wolfe-Powell line search conditions. I found your implementation of LBFGS not very robust for most of my optimization problems in large dimensions....

PatWie commented 8 years ago

The More-Thuente should be as robust as the Nocedal version. Sure, when having time this is an option. There is also Hager-Zhang as a line search method. Implementing a line-search method is probably the hardest part when done correctly.

Currently, I do not need this library for active research projects. So I just fix stuff, because I am interested in. This does not mean that it will never implemented in this library. But definitetly not in the next weeks.

SebDyn commented 8 years ago

Ok, I found a small mistake in my implementation which improves stability of the line search a lot...

However, I compared it with Conmin for 3 different problems. I find that your LBFGS implementation converges at lower speed. I need approximately 20% more function evaluations compared with Conmin although Nocedal and Liu report the opposite. I wonder if this is due to a different line search algorithm used here. Can I change some settings to relax the line search criteria a little ?

Further I found that one can not reset the LBFGS matrix during the iteration. I wonder if the convergence rate is somehow related. I will take a closer look into that and send you a patch once I found one...

Best regards Sebastian

PatWie commented 8 years ago

You mean this coinmin? Well as I said, my initial implementations were from scratch (the papers). So they are not necessarily the state-of-the-art versions. I am pretty sure they are not and I probably have neither the time nor the experience to change this very soon. That was the reason to open-source this lib.

At least I am happy, that they converge at all. From my small experience I can say, that each optimizer shines or messes up completely with the used line-search.

There is no LBFGS matrix?! All you can do is skipping the history and restart mostly, which is done in the implementation

Do you have any references of git-repos of the Conmin-solver. Just to have something were I can compare the implementations. Is Conmin really implementing LBFGS? Did you tried other history sizes by changing m?

I used the default parameters here. They wrote, that these work in most cases.

SebDyn commented 8 years ago

Thanks for your reply! I will check out these locations in the code.

Yes, I even meant the original Fortran code of Vaanderplats. And this is quite interesting: Although it is expected to be much worse, it behaves better (in terms of number of function calls) for my test problems (geometric optimization - objective is a 4th order polynomial)

SebDyn commented 8 years ago

You are are right. Some work seems to be in the line search. I changed the tolerance gtol to 1e-1 and now LBFGS behaves at expected regarding number of used designs.

However, the settings of More-Thuente seem not to be optimal. Further, I am not sure if an algorithm ensuring the strong Wolfe-Power conditions is really required here or if a much simpler and faster algorithm for weak Wolfe-Powell conditions would be still safe.... The curvature conditions could be added conditionally to the BFGS update (if not met by a weaker line search).

PatWie commented 8 years ago

The only reason to use/implement MoreThuente here was that the previous line-searches destroyed the performance on the benchmark tests.

What do you propose to do. Armijo + post-checking of strong Wolfe condition?