support for optimization-based ROMs

Pressio / pressio

core C++ library

Other

45 stars 7 forks source link

support for optimization-based ROMs #68

Closed fnrizzi closed 6 months ago

fnrizzi commented 4 years ago

The main objectives are:

enable support in pressio for unconstrained and constrained optimizations algorithms
enable these opt algos to work for ROMs
when specifying constraints, we need to have something as general as possible

Breakdown of tasks:

[x] create package skeleton inside pressio it is called optimizers in line with the solvers package
[x] make list of potential tpls we can use
- [x] write a brief summary of pros/cons of them
- [x] brief summary of the main differences in how they support constraints
- [x] select one to pick as default and write somewhere why this chosen
- [x] verify tpl can be built/linked easily and it is not a headache
- [ ] add support in pressio for this tpl
- [ ] get a generic enough example to build and run: for ROL, we can use the simple-eq one that we were pointed to
[ ] Find best API design to support (must be minimal change wrt to current solver-based API):
- [ ] unconstrained
- [ ] which methods we want to support first?
- [ ] constrained
- [ ] which methods we want to support first?
- [ ] how to enable arbitrary constraints?
[ ] Draft implementation for
- [ ] unconstrained opt for some target algos
- [ ] constrained opt for some target algos
- [ ] create unit tests
- [ ] create regular tests
- [ ] move functionality from experimental into main namespace

Keep in mind that up to here, the optimizer package should know nothing about ROMs. Now we need to use this package for ROMs:

[ ] Add support for running unconstrained LSPG:
- [ ] what is needed?
- [ ] create unit tests
- [ ] create regular tests (recycle/adapt the ones already using Gauss-newton)
- [ ] move functionality from experimental into main namespace
[ ] Add support for running constrained LSPG:
- [ ] what is needed?
- [ ] create unit tests
- [ ] create regular tests
- [ ] move functionality from experimental into main namespace

Random things:

Please work in the optPack branch. (already exists), inside namespace experimental
always prepend your commits with issue number such as:
```
#68: bla bla
```
for unsteady LSPG, we currently use policy classes to compute residual and jacobian that Gauss-newton uses: can we reuse these policy classes to compute objective function and gradient and hessian if the problem is solved with an optimizer rather than a solver?
I think we should pick one or two algorithms to support, we don't have a huge amount of time, and we can add more later.

pjb236 commented 4 years ago

List of TPLS: -NLOPT -OPT++ -ROL

pjb236 commented 4 years ago

NLOpt

For more: https://nlopt.readthedocs.io/en/latest/NLopt_C-plus-plus_Reference/ https://nlopt.readthedocs.io/en/latest/NLopt_Tutorial/

Pros: -supports equality and inequality constraints -nice interface:

// Create Opt object:
nlopt::opt(nlopt::algorithm, unsigned n);

// algorithm= choice of optimization algorithm
// n= number of optimization parameter = ROM dimension

// Objective function:

// To set objective for optimizer:
void nlopt::opt::set_min_objective(nlopt::vfunc f, void* f_data);

// Objective must be of form:
double f(const std::vector`<double>` &x, std::vector`<double>` &grad, void* f_data);

// For LSPG, this returns the L2 norm of the residual

// grad= length n vector that will be set upon return to the gradient. 

// For LSPG: this returns phi^T J^T R, where phi is the basis, J is the Jacobian of R, and R is the residual vector. 

// f_data= pointer to data needed by optimizer, convert within objective, gradient functions by a typecast

// To add constraints:
void nlopt::opt::add_inequality_constraint(nlopt::vfunc fc, void *fc_data, double tol=0);
void nlopt::opt::add_equality_constraint(nlopt::vfunc h, void *h_data, double tol=0);

// There are a number of functions to set stopping criterion

// To perform optimization:

nlopt::result nlopt::opt::optimize(std::vector`<double>` &x, double &opt_f);

// On successful return, x will contain the optimal generalized coordinates for the ROM, and opt_f
// will be the corresponding objective function.

-good online documentation

Cons: -Seems to insist on using std::vector for optimization parameters in objective, gradient, and constraint methods. -uses function pointers for objective, gradients, constraints

Constraint interface: -user-defined functions passed to optimization object using an “add_equality_constraint” or “add_inequality_constraint” method.

void nlopt::opt::add_inequality_constraint(nlopt::vfunc fc, void *fc_data, double tol=0);
void nlopt::opt::add_equality_constraint(nlopt::vfunc h, void *h_data, double tol=0);

pjb236 commented 4 years ago

OPT++

Examples online: https://software.sandia.gov/opt++/

Pros: -supports equality and inequality constraints -Code is supported and maintained by Sandia.

Cons: -Interface is more complicated than NLopt or ROL. -Problem setup uses function pointers to pass objective, gradient and constraint functions to optimization routines. -Online example uses NEWMAT format; other data structures may be supported, but this isn’t stated explicitly. -Active support?

Constraint interface: -user-defined functions passed to constraint object, which in turn is passed to optimization object.

pjb236 commented 4 years ago

ROL

Useful online resources: https://trilinos.github.io/pdfs/ROL.pdf

Pros: -Abstract vector class ROL::vector, with ready-made wrappers for std::vector, Epetra, and Tpetra. -Clean and fairly straightforward problem setup and execution:

Set up linear algebra wrappers (if needed)
Objective function interface: templated class with “value” method that returns objective function value, “gradient” method that overwrites gradient vector, “hessVec” method that overwrites vector with hessian-input vector product. (Optional)
Create an optimization algorithm object, e.g. ROL::Algorithm algo("Line Search",parlist). Many options can be set via a number of methods available for the algorithm object.
Run the optimizer: algo.run(x, obj); where x is ROL::vector object and obj an instance of the objective function class. -Code is supported and maintained by Sandia.

Cons: -no inequality constraint interface: user recommended to use slack variables and bounds (on the slack variables) c(x) >/=0 —> c(x) -s = 0, s >/=0 -ROL vector class uses inheritance, not templates; need to overwrite all virtual functions, which can be tedious

Constraint interface: -Pass in a constraint object with a list of required methods, such as “value”, which returns the constraint violation.

This is the best candidate since it is a Sandia code, uses an abstract class for vectors, and uses classes for the objective and constraint functions.

pjb236 commented 4 years ago

Proposed "system" class - a system object defines the problem to optimize:

class System{
public:
  using scalar_type         = /* obvious */
  using state_type          = pressio::containers::Vector< ... >;
  using gradient_type       = /* same as state_type */
  // specify the type to represent the action of the hessian on a state type
  using apply_hessian_type  = /* same as state_type */

public:
  scalar_type operator()(const state_type & x) const
  {
    // compute and return objective function
    // param_t p allows one pass fixed parameters to each function in an arbitrary data structure
  }

  void gradient(const state_type & x, const gradient_type & g) const
  {
    // compute and return the gradient
  }

  template <typename operand_t>
  void applyHessian(const state_type & x, const operand_t & v, apply_hessian_type & Hv) const
  {
    // compute the product of the Hessian, H, and some (state_type) vector, v
  }
}

Proposed class for EQUALITY constraint:

class EqualityConstraint{
public:
  using scalar_type = 
  using state_type  =
public:
  template<typename param_t, typename constraint_t>
  void operator()(const state_type & x, const param_t p, constraint_t c) const
  {
    // compute and return equality constraint violation(s) c
  }

  template<typename param_t, typename constraint_t>
  void applyEqConstraintJacobian(const state_type & x, const vector_type v, 
                                  const param_t p, constraint_t Cv ) const
  {
    // compute and return equality constraint violation(s) c
  }
}

Proposed class for INEQUALITY constraints:

class InequalityConstraint{
public:
  using scalar_type = 
  using state_type  =
public:
  template<typename param_t, typename constraint_t>
  void operator()(const state_type & x, const param_t p, constraint_t c) const
  {
    // compute and return inequality constraint violation(s) c
  }

  template<typename param_t, typename constraint_t>
  void applyIneqConstraintJacobian(const state_type & x, const vector_type v,
                                       const param_t p, constraint_t Cv ) const
  {
    // compute and return inequality constraint violation(s) c
  }
}

The class for equality and inequality has the same form. Can we just call it Constraint? Or the inequality one need more functionalities in it?

fnrizzi commented 4 years ago

@pjb236

who provides the param_t p? are these originated from the optimizer or the user?
in the applyHessian:
- is v always a vector? so is it safe to assume it is of type state_type?
- or v can be something else?
what is constraint_t?
for equality and inequality, can we use the same class? Or we need to separate?

pjb236 commented 4 years ago

param_T p originates from the user and contains parameters and/or objects we need to evaluate objectives, gradients, etc. I believe we may have to use this argument to pass the app object.
I don't see what v would be other than a vector, so yes, it should be state_type. FYI I originally called it "vector_type" because I was thinking it would be a pressio vector container rather than the "state_type" associated with an app object.
"constraint_t" is the type for the output from the constraint functions. It allows for scalar outputs for a single constraint and vector outputs for multiple constraints. Note that "constraint_t" for the applyConstraintJacobian methods would have to be a vector for a single constraint and a multivector for multiple constraints.
NLopt, opt++, as well as the optimization libraries in matlab and python use different interface functions to set equality and inequality constraints, so I think we need to separate them if we want this to work with different optimization libraires.

fnrizzi commented 4 years ago

param_T p originates from the user and contains parameters and/or objects we need to evaluate objectives, gradients, etc. I believe we may have to use this argument to pass the app object.

We definitely need to separate parameters from the app object. As far as parameters, we need to figure out a way to accept and forward parameters to the backend such that it makes sense. For example, if we rely on ROL and the user specifies parameters using the Trilinos ParameterList, we should be able to forward that to the ROL optimization backend that we (will) have in pressio. This is because we need to keep in mind that the pressio optimizer package only serves as kind of a frontend communicating to several backends.

I don't see what v would be other than a vector, so yes, it should be state_type. FYI I originally called it "vector_type" because I was thinking it would be a pressio vector container rather than the "state_type" associated with an app object.

Ok. I agree. But: vector_type implies that this type is a vector, which is true but what if later on we want to optimize something where the state is a multivector? We want to keep it general. That is why we should name things accordingly: state_type is the type of the optimization state. We can even call it optimization_state to be more expressive. I agree that state_type has to be a pressio container.

"constraint_t" is the type for the output from the constraint functions. It allows for scalar outputs for a single constraint and vector outputs for multiple constraints. Note that "constraint_t" for the applyConstraintJacobian methods would have to be a vector for a single constraint and a multivector for multiple constraints.

Ok so we need to change the naming convention to be more informative. I also would need to see an example to see different types and things one can get. Let's talk about this. We should call it something like constraint_result_t or something else, maybe along these lines:

class Constraint{
public:
  using scalar_type = 
  using state_type  =
  using result_t = /* ... */
public:
  template<typename param_t, typename result_t>
  void operator()(const state_type & x, const param_t & p, result_t & c) const
  {
    // ...
  }

  template<typename param_t, typename apply_jacobian_result_t>
  void applyEqConstraintJacobian(const state_type & x, 
                                 const state_type & v, 
                                 const param_t & p, 
                                 result_t & Cv ) const
  {
    // ....
  }
}

NLopt, opt++, as well as the optimization libraries in matlab and python use different interface functions to set equality and inequality constraints, so I think we need to separate them if we want this to work with different optimization libraires.

ok, to keep in mind. But, again, how the user interacts with pressio is not the same as how pressio interacts with a target backend. Even if we keep a single class for specifying a constraint, we can then see if we can do magic behind the scenes to make it compatible to target

fnrizzi commented 4 years ago

The most basic one is:

class System{
public:
  using scalar_type         = /* obvious */
  using state_type          = pressio::containers::Vector< ... >;

public:
  scalar_type operator()(const state_type & x) const{
    // compute and return objective function
    // param_t p allows one pass fixed parameters to each function in an arbitrary data structure
  }
}

One step forward is to have:

class System{
public:
   // same as above 
  using gradient_type       = /* same as state_type */
public:
  scalar_type operator()(const state_type & x) const;
  void gradient(const state_type & x, const gradient_type & g) const
  {
    // compute and return the gradient
  }
}

One step more:

class System{
public:
   // all above
  // specify the type to represent the action of the hessian on a state type
  using apply_hessian_type  = /* same as state_type */
public:
  scalar_type operator()(const state_type & x) const
  void gradient(const state_type & x, const gradient_type & g) const

  template <typename operand_t>
  void applyHessian(const state_type & x, const operand_t & v, apply_hessian_type & Hv) const
  {
    // compute the product of the Hessian, H, and some (state_type) vector, v
  }
}

pjb236 commented 4 years ago

Here is a draft of a generic optimization parameter class for pressio

template <typename scalar_type>
class OptParameters{ 

  private:
  scalar_type gradientOptimalityTol_   = /* default value */;
  scalar_type stepOptimalityTol_       = /* default value */;
  scalar_type constraintOptimalityTol_ = /* default value */;
  int maxIters_                        = /* default value */;
  int metricsVerbosity_          = /* default value */;

  public:
   void setGradientOptimalityTolerance(scalar_type newTol){
     /* minimum objective gradient magnitude at which to terminate */
     gradientOptimalityTol_ = newTol;
   }

   void setObjFunctionChangeOptimalityTolerance(scalar_type newTol){
      /* minimum CHANGE in objective function at which to terminate*/
      stepOptimalityTol_ = newTol;
   }

   void setConstraintOptimalityTolerance(scalar_type newTol){
      //  minimum constraint violation magnitude at which to terminate 
      // Conditional: on the gradient magnitude also below its tolerance. 
      contraintOptimalityTol_ = newTol;
   }

   void setMaxIterations(int maxIter){
     // maximum number of optimization iterations.
     maxIters_ = maxIter;
   }

   // getters corresponding to above setters
   // -----------

  // Optimization sub problem parameters
  ...

  // Diagnostics

  void setInfoVerbosity(int value){
    /* various levels of verbosity 
        if value=0: print nothing 
        if value=1: print only obj function value 
        if value=2: print obj function and gradient value 
        etc. */
     metricsVerbosity_ = value;
  }
}

Additionally, we will have a function to convert our parameter class to something tpl-specific, for example:

void convertToRolParameterList( const OptParameters & params, ROL::ParameterList & rolParamObj){
    // ... 
   rolParamObj.sublist("Status Test").set("Gradient Tolerance",params.getGradientOptimalityTolerance()); 
   rolParamObj.sublist("Status Test").set("Step Tolerance",params.getStepOptimalityTolerance()); 
   rolParamObj.sublist("Status Test").set("Constraint Tolerance",params.getConstraintOptimalityTolerance()); 
   rolParamObj.sublist("Status Test").set("Iteration Limit",params.getOptimizationIterationLimit()); 
}

fnrizzi commented 4 years ago

@pjb236 what are "Optimization sub problem parameters" ?

pjb236 commented 4 years ago

This is what ROL refers to what I think are convergence tolerances for computing objectives, gradients, etc. at each optimization step. I'm currently trying to think of a better name for this.

fnrizzi commented 4 years ago

This is what ROL refers to what I think are convergence tolerances for computing objectives, gradients, etc. at each optimization step. I'm currently trying to think of a better name for this.

hi! I don't understand what "convergence tolerances for computing objectives, gradients, etc. at each optimization step" means... do you want to talk about this one of these days once I recover my voice? :)

fnrizzi commented 4 years ago

Just so that I don't delete it by mistake:


class Constraint{
public:
  using scalar_type = same as for system
  using state_type  = same as for system
  constexpr int dimensionality = 1;

  Constraint(appObj)

public:
  // detect this if dimensionality == 1, otherwise detect that c has state_type
  void operator()(const state_type & x, c_type & c) const
  {
    // this is void because we want to output single or multiple constraint violations
    // ...
  }

  // detect Cv to be scalaer if dimensionality == 1, otherwise detect that Cv has state_type
  void applyJacobian(const state_type & x, 
                     const state_type & v, 
                     cv_type & Cv ) const
  {
    // ....
  }
}

int main{
  // create system object (check)

  // create constraints 
  Constraint co1();
  Constraint co2();

  OptParameters optPar();

  // create pressio opt problem 
  pressio::opt::Problem<fgfgf> OptProblem(systemObj, co1, co2, ...);

  // solve pressio opt problem 
  OptProblem.solve(x);
}

{
  // define constraints
  co1, co2

  // define LSPG type
  using ode_tag  = pressio::ode::implicitmethods::Euler;
  using lspg_problem = pressio::rom::LSPGUnsteadyProblem<
    pressio::rom::DefaultLSPGUnsteady, ode_tag, fom_t, lspg_state_t, decoder_t>;
  using lspg_stepper_t = typename lspg_problem::lspg_stepper_t;
  lspg_problem lspgProblem(appobj, yRef, decoderObj, yROM, t0);

  OptParameters optPar();
  optimizer_t optimizer(lspgProblem.getSystemRef(), yROM, optPar, co1, co2);

  // integrate in time
  pressio::rom::solveSequence(lspgProblem.getSystemRef(), yROM, 0.0, dt, 10, optimizer);
}

fnrizzi commented 6 months ago

closing this since it is outdated will revise if needed