stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
752 stars 188 forks source link

Solve algebraic equations with Newton solver #1115

Closed charlesm93 closed 5 years ago

charlesm93 commented 5 years ago

Accidently created this as a Stan issue (stan#2735), but it seems it should (for the moment) be math issue. Reposting the issue is. There is however valuable conversation in the linked issue.

Summary:

Provide a new function to solve systems of nonlinear algebraic equations using Newton's method. This complements our current solver, which uses Powell's dogleg method. The Newton solver has been shown to be much faster in several examples.

The name of the function is algebra_solver_newton(). Just like algebra_solver(), it returns the root of a function specified by the user. See the details for the current algebraic solver. It can have the same signature. The more delicate task is exploiting parallelism. Here we need to figure out which additional tuning parameters we will make accessible to the user.

Description:

Newton's solver is fast when applied to convex optimization problems (i.e. finding the root of a convex function's derivative). One key application is finding the mode of a Laplace approximation -- here we want to solve a convex optimization problem with thousands of parameters. With 100 parameters, I got a 10-fold speed up. Other examples have been reported. A discussion can be found on discourse: https://discourse.mc-stan.org/t/newton-solver-in-stan/6684.

There is already a Newton solver, used for ADVI, but it is not exposed to the Stan language. Furthermore, it is not designed to handle large systems. KINSOL, from the CVODES suite, is a promising candidate, which allows for parallelization. See https://computation.llnl.gov/projects/sundials/toms_sundials.pdf. The parallelization happens at a vector operation level. CVODES provides a parallel vector routine. Since we rely on the implicit function theorem for the Jacobian, we do not need to propagate sensitivities through the solver and can use CVODES parallel vector routines.

Beyond that, I'm not clear on how to incorporate this parallelization inside Stan. Can we do something akin to map() and MPI?

Other features of interest: the user can specify a step size, or get an adaptive step size. The latter doesn't always help, since it is costly to find the optimal step size. But, according to KINSOL's documentation, it makes a difference when we are "far" from the solution. This is likely to occur in high dimensions, where, I suspect, we'll be subject to a curse of dimensionality.

Note we would rename the current algebra_solver() algebra_solver_powell(). We need to run tests to better understand which solver works best when. Two important examples to consider: steady states in pharmacometrics, and constructing priors (something @betanalpha told me about).

Reproducible Steps:

See the issues posted on the forum.

Expected Output:

A faster algebraic solver applicable to large systems.

Additional Information:

NA

Current Version:

v2.18.1

syclik commented 5 years ago

A few comments:

  1. It's easy to link to Stan issues. To link to that particular issue, you could just say stan#2735 and GitHub will automatically link the issue for you. That original issue should stay, but as a description of what needs to go in the language.
  2. Overall, there's a lot of good information in the issue, but it's missing enough detail for someone to implement what you're talking about. More specifically:
  3. Summary: it would have been helpful to include the function name here. What are you proposing the new function be called? (I think the markdown was slightly messed up when I received this as email.)
  4. Description: it's missing the description of the function that needs to be implemented in the Math library. What is its function signature? What are the inputs? What are the outputs?

Mind editing the top-level post so it reflects what you're thinking about here?

On Sat, Feb 9, 2019 at 7:12 PM Charles Margossian notifications@github.com wrote:

Accidently created this as a Stan issue (stan-dev/stan#2735 https://github.com/stan-dev/stan/issues/2735), but it seems it should (for the moment) be math issue. Reposting the issue is. There is however valuable conversation in the linked issue. Summary:

Provide a new function to solve systems of nonlinear algebraic equations using Newton's method. This complements our current solver, which uses Powell's dogleg method. The Newton solver has been shown to be much faster in several examples. Description:

Newton's solver is fast when applied to convex optimization problems (i.e. finding the root of a convex function's derivative). One key application is finding the mode of a Laplace approximation -- here we want to solve a convex optimization problem with thousands of parameters. With 100 parameters, I got a 10-fold speed up. Other examples have been reported. A discussion can be found on discourse: https://discourse.mc-stan.org/t/newton-solver-in-stan/6684.

There is already a Newton solver, used for ADVI, but it is not exposed to the Stan language. Furthermore, it is not designed to handle large systems. KINSOL, from the CVODES suite, is a promising candidate, which allows for parallelization. See https://computation.llnl.gov/projects/sundials/toms_sundials.pdf. The parallelization happens at a vector operation level. CVODES provides a parallel vector routine. Since we rely on the implicit function theorem for the Jacobian, we do not need to propagate sensitivities through the solver and can use CVODES parallel vector routines.

Beyond that, I'm not clear on how to incorporate this parallelization inside Stan. Can we do something akin to map() and MPI?

Other features of interest: the user can specify a step size, or get an adaptive step size. The latter doesn't always help, since it is costly to find the optimal step size. But, according to KINSOL's documentation, it makes a difference when we are "far" from the solution. This is likely to occur in high dimensions, where, I suspect, we'll be subject to a curse of dimensionality.

Note we would rename the current algebra_solver() powell_solver(). We need to run tests to better understand which solver works best when. Two important examples to consider: steady states in pharmacometrics, and constructing priors (something @betanalpha https://github.com/betanalpha told me about). Reproducible Steps:

See the issues posted on the forum. Expected Output:

A faster algebraic solver applicable to large systems. Additional Information:

NA Current Version:

v2.18.1

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/math/issues/1115, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZ_F9FnCip7hTOZFzokYfjkbkrU428Sks5vL2PngaJpZM4ayz5m .

rok-cesnovar commented 5 years ago

This one was fixed on the Stan Math level.