trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/
Other
1.2k stars 563 forks source link

ROL: adding an inequality constraint #9659

Closed Adrian-Diaz closed 2 years ago

Adrian-Diaz commented 3 years ago

Question

@trilinos/rol

Greetings,

I've been trying to add an inequality constraint to my optimization problem, c(x) < A, but it seems that the bounds I set are not being enforced by ROL since the values of c(x) far exceed the bounds I'm trying to set. The code I used to add the constrain to the problem is the following:

ROL::Ptr<std::vector > li_ptr = ROL::makePtr<std::vector>(1,0.0); ROL::Ptr<std::vector > ll_ptr = ROL::makePtr<std::vector>(1,0.0); ROL::Ptr<std::vector > lu_ptr = ROL::makePtr<std::vector>(1,simparam->maximum_strain_energy);

ROL::Ptr<ROL::Vector > constraint_mul = ROL::makePtr<ROL::StdVector>(li_ptr); ROL::Ptr<ROL::Vector > ll = ROL::makePtr<ROL::StdVector>(ll_ptr); ROL::Ptr<ROL::Vector > lu = ROL::makePtr<ROL::StdVector>(lu_ptr);

ROL::Ptr<ROL::Constraint> ineq_constraint = ROL::makePtr(this, nodal_density_flag, simparam->maximum_strain_energy); ROL::Ptr<ROL::BoundConstraint> constraint_bnd = ROL::makePtr<ROL::Bounds>(ll,lu); problem->addConstraint("Inequality Constraint",ineq_constraint,constraint_mul,constraint_bnd);

Where the constrain value function is as follows:

void value(ROL::Vector &c, const ROL::Vector &z, real_t &tol ) {

ROL::Ptr<std::vector<real_t>> cp = dynamic_cast<ROL::StdVector<real_t>&>(c).getVector();
//communicate ghosts and solve for nodal degrees of freedom as a function of the current design variables
if(last_comm_step!=current_step){
  FEM_->update_and_comm_variables();
  last_comm_step = current_step;
}

ROL_Force = ROL::makePtr<ROL_MV>(FEM_->Global_Nodal_Forces);
ROL_Displacements = ROL::makePtr<ROL_MV>(FEM_->node_displacements_distributed);

real_t current_strain_energy = ROL_Displacements->dot(*ROL_Force);
std::cout << "CURRENT STRAIN ENERGY " << current_strain_energy << std::endl;
(*cp)[0] = current_strain_energy;

}

Thanks for any help rendered.

dridzal commented 2 years ago

Is this a parallel run? If so, is there a serial version exhibiting the same behavior? I should note that ROL is MPI-agnostic, so a discrepancy between the results in the serial and parallel settings could provide a clue to the issue.

Adrian-Diaz commented 2 years ago

This case finished in serial with step tolerance met.

Adrian-Diaz commented 2 years ago

I ran another case in serial and this one got stuck with the following being the last print: image

Adrian-Diaz commented 2 years ago

Is there an example of changing the algorithm that the Solver object uses from augmented lagrangian to something that enforces the constraints more strictly? Preferably through the parameter list.

dridzal commented 2 years ago

@Adrian-Diaz we don't have another type of algorithm that deals with general inequality constraints. They will be penalized, and therefore there will always be some violation, even at the optimal solution. Is your constraint linear or nonlinear? If it is linear, it would be important to get the projection algorithm to work. The projection will enforce the linear constraint very accurately.

As an aside, I'm still concerned that your gradient checks are not passing. You should see a V shape in the error (going down first and then going up as the numerical finite difference error increases). Also, the perturbation vectors should be much larger in magnitude, and they should be random (without violating the physical bounds of your simulation, of course). If you choose an interior feasible point x for the finite difference check, you can use random vector entries with a uniform distribution from an interval around x, which ensures that your simulation will work.

For example, if your optimization variable is density that is typically in the [0, 1] range, you could first generate a random test vector x with entries uniformly distributed in the interval [0.4, 0.6]. Then you could use a random direction vector d with entries uniformly distributed in the interval [-0.1, 0.1]. If 1 is the maximum finite difference step size, this would ensure that the (total) vector sent to the simulator is still within [0, 1], i.e., feasible.

Getting the finite difference checks to pass is the prerequisite to using most ROL algorithms.

Adrian-Diaz commented 2 years ago

In general I'll have to use non-linear constraints such as center of mass calculation. As for the gradients, I'll need to see if the check is actually updating my deformation vector properly (I figured it was since the first step test has to call it twice). Not sure what else could be possible since otherwise the first entry of the check with d = -1e-6 should be the same as the second entry of the check with d=-1e5 judging from the check code.

dridzal commented 2 years ago

Why are you using a vector d with such small entries? The spirit of the check is that it reduces the size of the (total) vector added to x. The vector d should be random with entries that are comparable in size to the entries of x. This should be fairly easy to implement, even if you have physical constraints on x (see my previous comment). Please remind me, are you using ROL's update() methods in your function implementations? You don't have to use them, but if you are, there are some details when it comes to the new (ROL2) interfaces. @dpkouri can help with that.

Adrian-Diaz commented 2 years ago

I tried that because I suspected after several reviews that the check function was just bugged in its default behavior. I printed the design vectors now to make sure. As an example if I feed it d=-0.1 the first call to my value functions has a vector of x=0.9 as expected. Afterwards the vectors are 0.89, 0.889. and 0.8889 rather than 0.99, 0.999, 0.9999 as expected for an attempt to converge the finite difference. That is why only the first step print with a small value of d has a small error in the gradient checks as above.

dridzal commented 2 years ago

Derivative checks are used all the time. I haven't seen bugs yet. Consider this piece of checkGradient code:

  // Temporary vectors.
  Ptr<Vector<Real>> xnew = x.clone();
  for (int i=0; i<numSteps; i++) {
    Real eta = steps[i];
    xnew->set(x);
  ...
  }

The clone operation is supposed to allocate new memory for the xnew vector, based on the original x vector. In other words, this is not a shallow copy. Then, the values of xnew are reset to the values of the original vector x for every finite difference step. In other words, this is not a shallow copy either. It sounds like your implementation uses a shallow copy somewhere. If so, this would be a huge problem for all algorithms.

Adrian-Diaz commented 2 years ago

The value I printed was in the value function as the first statement after getting the vector. I fed the x vector of 1 to the check function and nothing in my code changes the design vector. Has it been thoroughly tested with the TpetraMultivector?

EDIT: Is ROL not deep copying the x-vector I fed it? That would lead to a self referential issue since I was updating that vector as the most current z from the value function at the higher level (simply to avoid defining two different variables). I'm gonna need some clarification on whether, in general, the z vector is using the same views in the initial design multivector I feed to the problem object or if its an entirely separate vector and the z0 vector has to be left alone in all cases.

Adrian-Diaz commented 2 years ago

Yea that was definitely the issue, separating it into a design and test multivector for the FEM code has the gradient error profile making sense now.

dridzal commented 2 years ago

Sounds good. By looking at the const-ness of the arguments in the ROL interfaces, you can easily tell which vectors are changed directly by ROL. This is true of most (likely all) Trilinos solver packages. We do not make a deep copy of the x vector. We operate directly on it. This is by design (and it is good design, for many reasons). Now that you've addressed this key piece, I encourage you to also test the constraints, and reevaluate algorithmic performance. I still think that getting the linearly constrained topology optimization example to run (with projections) would be valuable. It should converge in a handful of iterations, especially if you also have Hessians implemented, see:

http://www.optimization-online.org/DB_FILE/2021/01/8231.pdf

and look for Elastic Topology Optimization, algorithm TRSPG, Table. 3. If you don't have Hessians, LMTR with a zero Hessian should be an excellent option. For parameter list options, see rol/examples/PDE-OPT/published/TRSPG_Kouri2021 (I think I need to merge some recent changes to bring everything up to date).

Adrian-Diaz commented 2 years ago

Thanks for the references. Seems like the polyhedral projection with lin-more is doing something now, however when I print the mass constraint its settling around 40 % rather than the 20% I wanted. EDIT: Apparently I forgot to delete the constrain bound argument from the addLinearConstraint so it was doing something weird that was neither equality or inequality. Checking again. image

dridzal commented 2 years ago

If you formulate the linear constraint as an equality constraint, the polyhedral projection should be nearly exact, and it should be enforced at every iteration (there is no "settling").

Adrian-Diaz commented 2 years ago

No troubles on this front for a while. Thanks for the help.