Argonne-National-Laboratory / PIPS

Parallel solvers for optimization problems

Add scalable parallel least squares example #28

Closed goxberry closed 8 years ago

goxberry commented 8 years ago

Add a parallel least-squares example that takes a given vector, x_given, modeled as a discrete function on an nx-by-ny grid, and solves the problem

\min_{x_i,\; i = 0, \ldots, nx \cdot ny - 1} f(x)
s.t. f(x) = \frac{1}{2} \sum_{i} (x_{i} - x_{\mathrm{given},i})^2,
0 \le x_{i} \le 1, \quad i = 0, \ldots, nx \cdot ny - 1,
where 0 \le x_{\mathrm{given},i} \le 1, i = 0, \ldots, nx \cdot ny - 1, is given.
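For concreteness, here is a minimal sketch of the objective and gradient over the flattened grid (illustrative only; `objective` and `gradient` are hypothetical names, not the example's actual PIPS callback interface):

```cpp
#include <cstddef>
#include <vector>

// f(x) = 0.5 * sum_i (x_i - x_given_i)^2 over the flattened nx*ny grid.
double objective(const std::vector<double>& x,
                 const std::vector<double>& xGiven) {
  double f = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double r = x[i] - xGiven[i];
    f += 0.5 * r * r;
  }
  return f;
}

// grad f(x) = x - x_given; note the Hessian is the identity matrix.
std::vector<double> gradient(const std::vector<double>& x,
                             const std::vector<double>& xGiven) {
  std::vector<double> g(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) g[i] = x[i] - xGiven[i];
  return g;
}
```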

A cyclic decomposition is used to distribute the data over first-stage and second-stage scenario blocks, so this example can be used with any number of scenarios, as long as the number of scenarios is less than (nx * ny - 1) (because the first stage is another block). This data decomposition is possible because the Hessian of this problem is the identity matrix.
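A minimal sketch of the cyclic assignment follows (the index-to-block rule and the name `ownerBlock` are illustrative assumptions, not part of the PIPS API):

```cpp
#include <cstdio>

// Cyclic decomposition sketch: unknown i is assigned to block
// (i % nBlocks), where block 0 is the first stage and blocks
// 1..nBlocks-1 are second-stage scenario blocks. Because the Hessian
// is the identity, the blocks do not couple, so this assignment is
// valid as long as every block receives at least one unknown.
int ownerBlock(int i, int nBlocks) { return i % nBlocks; }

int main() {
  const int nx = 4, ny = 2;            // grid dimensions
  const int n = nx * ny;               // total unknowns
  const int nScenarios = 3;            // second-stage blocks
  const int nBlocks = nScenarios + 1;  // plus the first-stage block

  for (int i = 0; i < n; ++i)
    std::printf("x_%d -> block %d\n", i, ownerBlock(i, nBlocks));
  return 0;
}
```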

This combination of test problem and data decomposition can be used to assess the limiting case of maximum parallel decomposition (that is, one unknown per processor) with arbitrarily large problems, thus setting an upper bound on the parallel efficiency for both weak and strong scaling.
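For reference, the standard efficiency definitions being bounded here, where T_p is wall-clock time on p processes (total problem size fixed for strong scaling, per-process problem size fixed for weak scaling):

E_{\mathrm{strong}}(p) = \frac{T_1}{p \, T_p}, \qquad E_{\mathrm{weak}}(p) = \frac{T_1}{T_p}.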

Finally, this problem provides a favorable point of comparison to IPOPT: IPOPT relies on direct linear solvers, so on a single core it will eventually run out of memory. On a mid-2011 13-inch MacBook Air with a 1.7 GHz Intel Core i5 and 4 GB of 1333 MHz DDR3 memory running OS X 10.8.5, IPOPT noticeably slows on this problem between 250k and 360k unknowns. PIPS-NLP, by contrast, should be able to exploit parallelism heavily; the limiting case is one unknown per process. PIPS-NLP could also employ a matrix-free solver (e.g., CG), or even a dummy solver that simply returns the gradient-descent direction, since the Hessian is the identity.
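Because H = I, the Newton-step linear system is trivial, which is what makes such a dummy solver viable. A minimal sketch (`solveIdentityKKT` is a hypothetical name, not a PIPS interface):

```cpp
#include <cstddef>
#include <vector>

// With H = I, the (unconstrained) Newton system H * d = -g reduces to
// d = -g: the Newton direction coincides with the gradient-descent
// direction, and no factorization or matrix storage is needed.
std::vector<double> solveIdentityKKT(const std::vector<double>& grad) {
  std::vector<double> d(grad.size());
  for (std::size_t i = 0; i < grad.size(); ++i)
    d[i] = -grad[i];  // "solve" I * d = -g
  return d;
}
```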

That being said, this problem is artificial. Specialized methods for least-squares problems (e.g., Gauss-Newton, FISTA, and, for this problem, gradient descent) should scale better and perform better in practice. This example exists purely to illustrate the potential best-case scaling performance of PIPS-NLP, and its advantages relative to a state-of-the-art nonlinear programming solver package like IPOPT, when an NLP has structure compatible with the algorithms in PIPS-NLP.
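In fact, for this objective a single gradient-descent step with unit step length solves the problem exactly, since the bounds on x_given make the unconstrained minimizer feasible:

\nabla f(x) = x - x_{\mathrm{given}}, \qquad x^{+} = x - \nabla f(x) = x_{\mathrm{given}}, \qquad 0 \le x_{\mathrm{given},i} \le 1.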

nychiang commented 8 years ago

Thanks, Geoffrey. This sounds like a really good test problem.

Nai-Yuan
