Fix line search to avoid non-finite gradients

stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.

https://mc-stan.org

BSD 3-Clause "New" or "Revised" License

2.6k stars 370 forks source link

Fix line search to avoid non-finite gradients #3309

Open cpfiffer opened 3 months ago

cpfiffer commented 3 months ago

Related to #3306

Modify the WolfeLineSearch function in src/stan/optimization/bfgs_linesearch.hpp to handle non-finite gradients.

Check if the function value func_val is finite.
Check if the gradient gradx1 is finite.
If either the function value or the gradient is non-finite, restart the line search with a smaller step size.

For more details, open the Copilot Workspace session.

cpfiffer commented 3 months ago

Add test for handling non-finite gradients in WolfeLineSearch

Add linesearch_testfunc_nonfinite class to simulate non-finite gradients
Add wolfeLineSearch_nonfinite_gradient test to verify that the optimization process can handle non-finite gradients
Ensure the test checks that the line search algorithm avoids returning points with finite log density but infinite gradient

For more details, open the Copilot Workspace session.

cpfiffer commented 3 months ago

For what it's worth, this was a vague attempt at solving this problem using Copilot Workspace. It'd be very cool if this was all it took. Close this if it's garbage because I'm not that familiar with stan's internals. If the robot did a good job I'm happy to investigate this more.

bob-carpenter commented 3 months ago

The fix looks OK in that it will do the same thing for a non-finite return now as for an error code. I kicked off the integration testing.

It'd be nice if the test tested all the ways things could fail. The new test is testing a 1 return, but I didn't see how that was being triggered. The easiest is just plugging in three different functions for testing:

one that always returns NaN
one that always returns an infinite value
one that always returns non-finite gradients

All these should then return a 1 from the line search.

cpfiffer commented 3 months ago

Alright, let's give that a try. Apologies in advance, very new to the whole stan toolchain and I'll likely be pretty clumsy.

bob-carpenter commented 3 months ago

Thanks, @cpfiffer. We're happy to help, as our C++ is pretty complicated in a lot of places.

nhuurre commented 3 months ago

The test wolfeLineSearch_nonfinite_gradient fails because the functor linesearch_testfunc_nonfinite has a finite gradient, and thus does not exercise the expected error path. Actually, none of the new tests notice if I undo the fix in this PR.

The single-line fix looks correct but I'm puzzled as to why it is needed. At the previous line, func should be an instance of ModelAdaptor which already checks for non-finite gradient:

https://github.com/stan-dev/stan/blob/01f592383cf29f2483451b9d5b4718666a50b3a8/src/stan/optimization/bfgs.hpp#L351-L360