stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
2.6k stars 370 forks source link

Fix line search to avoid non-finite gradients #3309

Open cpfiffer opened 3 months ago

cpfiffer commented 3 months ago

Related to #3306

Modify the WolfeLineSearch function in src/stan/optimization/bfgs_linesearch.hpp to handle non-finite gradients.


For more details, open the Copilot Workspace session.

cpfiffer commented 3 months ago

Add test for handling non-finite gradients in WolfeLineSearch


For more details, open the Copilot Workspace session.

cpfiffer commented 3 months ago

For what it's worth, this was a vague attempt at solving this problem using Copilot Workspace. It'd be very cool if this was all it took. Close this if it's garbage because I'm not that familiar with stan's internals. If the robot did a good job I'm happy to investigate this more.

bob-carpenter commented 3 months ago

The fix looks OK in that it will do the same thing for a non-finite return now as for an error code. I kicked off the integration testing.

It'd be nice if the test tested all the ways things could fail. The new test is testing a 1 return, but I didn't see how that was being triggered. The easiest is just plugging in three different functions for testing:

  1. one that always returns NaN
  2. one that always returns an infinite value
  3. one that always returns non-finite gradients

All these should then return a 1 from the line search.

cpfiffer commented 3 months ago

Alright, let's give that a try. Apologies in advance, very new to the whole stan toolchain and I'll likely be pretty clumsy.

bob-carpenter commented 3 months ago

Thanks, @cpfiffer. We're happy to help, as our C++ is pretty complicated in a lot of places.

nhuurre commented 3 months ago

The test wolfeLineSearch_nonfinite_gradient fails because the functor linesearch_testfunc_nonfinite has a finite gradient, and thus does not exercise the expected error path. Actually, none of the new tests notice if I undo the fix in this PR.

The single-line fix looks correct but I'm puzzled as to why it is needed. At the previous line, func should be an instance of ModelAdaptor which already checks for non-finite gradient:

https://github.com/stan-dev/stan/blob/01f592383cf29f2483451b9d5b4718666a50b3a8/src/stan/optimization/bfgs.hpp#L351-L360