RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.29k stars 1.26k forks source link

`SolutionResult` should also contain the optimal cost #3817

Closed hongkai-dai closed 7 years ago

hongkai-dai commented 8 years ago

Currently to get the optimal cost (objective) of an optimization problem, I will need to loop through all costs, and evaluate each costs at the optimal solution. On the other hand, in most solvers (actually all solvers I know), we can get the optimal solution directly from the solver. I suggest to modify the SolutionResult, which is the return argument of MathematicalProgram::Solve() function, such that this is no longer an enum in its current form

enum SolutionResult {
    kSolutionFound = 0,
    kInvalidInput = -1,
    kInfeasibleConstraints = -2,
    kUnknownError = -3,
  };

But instead, it looks like

struct SolutionResult {
  enum status {
    kSolutionFound = 0,
    kInvalidInput = -1,
    kInfeasibleConstraints = -2,
    kUnknownError = -3,
  }
  double objective_value;
}
RussTedrake commented 8 years ago

i agree we definitely need to get that output. the other option is to have a method which can get the most recent objective value (stored by the solver).

worth mentioning the signature from solve() in matlab was function [x,objval,exitflag,infeasible_constraint_name] = solve(obj,x0) https://github.com/RobotLocomotion/drake/blob/master/drake/matlab/solvers/NonlinearProgram.m#L908

in c++, we get x directly from the decisionvariableview. we're getting the exit flag from the current return value. so i think both objval and the infeasible constraint names are missing, and could be either added to the solution result or provided by getters.

hongkai-dai commented 8 years ago

Thank @RussTedrake for the comment. I think infeasible_constraint_name is definitely useful, but we probably can only obtain that information for generic nonlinear solvers. The reason is that for many convex solvers, when the problem is infeasible, it does not return the solution x with the least constraint violation. While in SNOPT, it always returns the solution, so we can evaluate the constraints at that solution, and then find out the infeasible constraints.

siyuanfeng-tri commented 8 years ago

Another related useful feature is to name individual rows of all constraints / costs e.g. A*x = b, and I want to be able to give A and b names. I'd also like to name A(i) and b(i) in addition. This will be handy for debugging large optimization problems. I suppose if we don't name A(i) explicitly, we could just set name to A_index by default.

RussTedrake commented 8 years ago

Agreed. Even if the solver doesn't always return the least infeasible. On Mon, Oct 17, 2016 at 3:46 AM siyuanfeng-tri notifications@github.com wrote:

Another related useful feature is to name individual rows of all constraints / costs e.g. A*x = b, and I want to be able to give A and b names. I'd also like to name A(i) and b(i) in addition. This will be handy for debugging large optimization problems. I suppose if we don't name A(i) explicitly, we could just set name to A_index by default.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/RobotLocomotion/drake/issues/3817#issuecomment-254135495, or mute the thread https://github.com/notifications/unsubscribe-auth/AGJNNEBIScfgdYdXnQPz12qIVc26yyKQks5q0yfCgaJpZM4KYBtX .

sammy-tri commented 8 years ago

i agree we definitely need to get that output. the other option is to have a method which can get themost recent objective value (stored by the solver).

Personally, I think I prefer the approach of adding an accessor/helper to the MathematicalProgram than turning the return type of Solve() into a struct. Since @hongkai-dai's original post showed only a single scalar in the return value, I'm guessing that would be the sum of the values of all costs added to the MathematicalProgram?

Another related useful feature is to name individual rows of all constraints / costs e.g. A*x = b, and I want to be able to give A and b names. I'd also like to name A(i) and b(i) in addition.

IIRC, Eigen includes a number of ways to store references to portions a matrix which reflect updates to the original matrix they reference, allowing programmatic access to data by name. Or is this request more about the ability to print the state of the problem for human inspection?

hongkai-dai commented 8 years ago

@sammy-tri could you explain why you prefer adding an accessor to the MathematicalProgram than changing the return type of Solve? Is it to minimize the change to the current API? If that is the concern, then I think I can handle this API change. And that objective_value can be obtained directly from the solver, within Solve function (many solvers has a subroutine to fetch the objective function directly, like Gurobi, Mosek, etc). Adding an accessor to the MathematicalProgram would mean to loop through the costs and evaluate each cost at the solution, which is not as efficient as fetching the objective value directly from the solver.

I would argue that the return argument of Solve should be the information we can get directly from the solver, such as the status and objective value. The infeasible constraints can be obtained by a separate method. May I suggest the following change to the API?

struct SolutionResult {
  enum status {
    kSolutionFound = 0,
    kInvalidInput = -1,
    kInfeasibleConstraints = -2,
    kUnknownError = -3,
  }
  int solver_code; // This is the return code directly from the solver. For example, for nonlinear solver SNOPT, 13 means "nonlinear constraints are infeasible, and the infeasibility is minimized". Please refer to the documentation of each solver for the meaning of this code.
  double objective_value; // The objective value of the optimization problem. When the optimization fails, this value is NaN.
}

The solver_code would be helpful for debugging.

const std::vector<string> & MathematicalProgram::InfeasibleConstraintNames(const Eigen::MatrixBase<Derived> &x)

that returns the infeasible constraints, for decision variable value x. So if the user wants to find out which constraint is causing the trouble, they can call this method. And x does not have to be the solution of the optimization variable, it can be any vector of the correct size.

sammy-tri commented 8 years ago

And that objective_value can be obtained directly from the solver, within Solve function (many solvers has a subroutine to fetch the objective function directly, like Gurobi, Mosek, etc). Adding an accessor to the MathematicalProgram would mean to loop through the costs and evaluate each cost at the solution, which is not as efficient as fetching the objective value directly from the solver.

I don't believe this is the case. The interface to the individual solvers is through MathematicalProgramSolverInterface::Solve(MathematicalProgram& prog) The solvers, depending on their level of capability, populate information about the solution by calling functions on the prog argument such as SetDecisionVariableValues(), SetSolverResult(), etc. Virtually nothing about what happens during the Solve() call is part of the return value of the inner Solve(), the solver can propagate data back to the MathematicalProgram, this would be one more piece. So a new accessor on MathematicalProgram would not need to loop, it would return the cached data (like functions such as GetSolutionVectorValues() and GetSolverResult() already do).

This is essentially what @RussTedrake proposed (I think) when he said "i agree we definitely need to get that output. the other option is to have a method which can get the most recent objective value (stored by the solver). "

It could be that this is a crappy design. :) It evolved rather than being thought out (at least to some extent). Perhaps we'd actually prefer MathematicalProgramSolverInterface::Solve(const MathematicalProgram& prog, struct SolutionResult* solution) which then populates that struct with the solution data. It lumps all of the information which the solver is potentially expected to provide into one place, which could potentially be cleaner than scattered SetFoo() calls on MathematicalProgram which the solver is expected to use to provide information about the most recently calculated solution.

@sammy-tri could you explain why you prefer adding an accessor to the MathematicalProgram than changing the return type of Solve? Is it to minimize the change to the current API? If that is the concern, then I think I can handle this API change.

My primary concern is to reduce churn on the external interface of MathematicalProgram. Using structs as return values creeps me out, but return value optimization has worked on any decent compiler for years and I probably need to get over it. Also some bad habits from my pre-C++ days.

I would argue that the return argument of Solve should be the information we can get directly from the solver, such as the status and objective value. The infeasible constraints can be obtained by a separate method.

Hmmm. It has the virtue of returning information already known which does not require additional iteration/calculation. It could replace some of the getters (GetSolverResult(), the proposed getter for objective value). It would not replace GetSolutionVectorValues(), the proposed method to get infeasible constraints... Actually, I'm not sure this is cleaner, as now you've got a struct with some of your information, getters for other things, and the actual values of the decision variables stored through the DecisionVariableView you were handed when you added them...

Add name to each constraint. I remember @siyuanfeng-tri already add std::string description for a Constraint object. I would like to add a std::vector names, which has the same length as the number_constraints.

This I don't think I agree with. The constraints aren't even all stored in the same list, they're separated by types. Are you going to std::map that back to a vector of strings? If for some reason the description already in the Constraint isn't adequate, I'd propose adding the name to MathematicalProgram::Binding, not a separate vector. I'm guessing the reason for this is to improve printed status/debug messages?

hongkai-dai commented 8 years ago

@sammy-tri if I understand correctly, you are suggesting that in MathematicalProgram, there should be a data field like objective_value_, which stores the objective of the previous call. And then we add a getter to return that stored objective value.

If that is the case, then I do not think it is the best solution. I think MathematicalProgram should only store information that is intrinsic to the optimization program (like constraints and costs), without depending on anything else. On the other hand, the stored objective value depends on the solver we use (SNOPT and NLopt can return different objective values), and also the initial guess of the solver, especially for a nonconvex nonlinear optimization. Without the context of solver and initial guess, the objective value is meaningless.

On the other hand, the objective value is meaningful when we call the Solve function, since we know the solver and the initial guess. So I still think that it is best to add the objective function in the Solve function. We can discuss whether we want to add it to the return struct, or change the API as

SolutionResult MathematicalProgramSolverInterface::Solve(MathematicalProgram & prog, double *objective_value)

For the constraint name problem, I think I did not make myself clear. I am going to add a data field in Constraint class

private:
  std::vector<string> constraint_names_;

Then after we call the solver, and the problem is infeasible, we can then call MathematicalProgram::InfeasibleConstraintNames, which will then loop through all the constraints, pick out the name of the constraint being violated, and push that name to std::vector<string> infeasible_constraint_names.

sammy-tri commented 8 years ago

I think we're converging on an agreement.

@sammy-tri if I understand correctly, you are suggesting that in MathematicalProgram, there should be a data field like objectivevalue, which stores the objective of the previous call. And then we add a getter to return that stored objective value.

This is what is done now with SetSolverResult() and GetSolverResult(), which store information about the previous Solve() call in solver_name_ and solver_result_.

If that is the case, then I do not think it is the best solution. I think MathematicalProgram should only store information that is intrinsic to the optimization program (like constraints and costs), without depending on anything else.

I think you've convinced me of this, so we should just move solver_name_ and solver_result_ into the same struct with the hypothetical objective_value_ from your (counter-) example above.

We can discuss whether we want to add it to the return struct

If we're going to return a struct (which seems like the direction we're going), let's not bother with the additional output arg, and just add in the objective value, solver name, and solver specific result along with the generic result (the existing enum). So SolutionResult MathematicalProgramSolverInterface::Solve(MathematicalProgram& prog) will still be the signature with SolutionResult a struct.

For the constraint name problem, I think I did not make myself clear. I am going to add a data field in Constraint class

So each instance of Constraint gains a vector of names in addition to it's description? I guess I'm not clear on why it's a vector or how it's different from description_ wyhich already exists in Constraint. Are the names shorter versions of the descriptions?

Then after we call the solver, and the problem is infeasible, we can then call MathematicalProgram::InfeasibleConstraintNames, which will then loop through all the constraints, pick out the name of the constraint being violated, and push that name to std::vector infeasible_constraint_names.

This all sounds right to me, I think... Loop over all the constraints (generic_constraints_, linear_constraints_, etc), evaluate with the appropriate variables, and if the constraint is being violated, query it for the name/description/whatever and append to the result?

siyuanfeng-tri commented 8 years ago

So each instance of Constraint gains a vector of names in addition to it's description? I guess I'm not clear on why it's a vector or how it's different from description_ wyhich already exists in Constraint. Are the names shorter versions of the descriptions?

Imagine this situation, we have a constraint of the form A*x = b, A is 5 by 3, and b is 5 by 1. The current description for is for A and b as a whole, e.g. "Siyuan's eq constraint". Hongkai and I want to name individual rows. We want to name A(1,:) and b(1) to "Siyuan's eq number 1". This is handy if we have a constraint term that have dozens of rows. You could argue that we should split the big constraint into smaller ones. But in practice, we found it often easier and probably more efficient to make the big matrices. The looping through infeasible constraints will also be more meaningful if we know which row of what constraint failed.

hongkai-dai commented 8 years ago

@sammy-tri , I think we converged to the agreement. To make sure we are on the same page, I summarize the proposal as follows:

struct SolutionResult {
enum status {
    kSolutionFound = 0,
    kInvalidInput = -1,
    kInfeasibleConstraints = -2,
    kUnknownError = -3,
  }
  int solver_status_code; // This is the returned code directly from the solver. For example, for nonlinear solver SNOPT, 13 means "nonlinear constraints are infeasible, and the infeasibility is minimized". Please refer to the documentation of each solver for the meaning of this code.
  double objective_value; // The objective value of the optimization problem. When the optimization fails, this value is NaN.
  std::string solver_name; // This should be `snopt`, `mosek`, etc.
}

And the signature is still

SolutionResult MathematicalProgramSolverInterface::Solve(MathematicalProgram& prog)

We might add more fields to SolutionResult, like computation_time.

Thanks @siyuanfeng-tri for the explanation. That is exactly what I mean. Another example is like in inverse kinematics, we can impose a position constraint with 3 rows, on the x, y, z coordinates separately. But when we check the infeasible constraint, it is better to know on which coordinate the constraint is violated.

sammy-tri commented 8 years ago

:facepalm: I forgot Constraint is multiple constraints (of size num_constraints). I think we're on the same page. My apologies for dragging it out so long.

ggould-tri commented 7 years ago

Note: Per the discussion on #6223, drake/systems/trajectory_optimization/direct_collocation.cc is waiting on this PR to be unit testable. Whoever fixes this PR should also add (or add an issue to add) such a unit test.

EricCousineau-TRI commented 7 years ago

I apologize for not updating this issue, but being able to return the optimal solution result (though not directly per the initial design) has been completed via #5685.

I have created issue #6427 to address @ggould-tri's comment.