RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.24k stars 1.25k forks source link

Evaluating constraints in `IpoptSolver_NLP` is always done in series, even when the constraints could be evaluated concurrently. #21724

Open samzapo opened 1 month ago

samzapo commented 1 month ago

Is your feature request related to a problem? Please describe.

The following generic constraint evaluation in the IpoptSolver_NLP class: https://github.com/RobotLocomotion/drake/blob/84f4fc58557043bb73bb3ae2d2748c62b054ec4f/solvers/ipopt_solver_internal.cc#L733-L736 Evaluates each constraint in series, each of these generic constraints is oftentimes expensive to compute. Moreover, in the case of multiple shooting algorithms where these are the Direct Collocation / Direct Transcription constraints, there are many of these non-coupled constraints and could be run concurrently, if implementation permits.

Describe the solution you'd like

I'd like to re-implement the quoted block of code to evaluate some constraints concurrently, if permitted by the constraint, e.g.: https://github.com/samzapo/drake/blob/45fe92d77e87525da86a1f51b319f1973b6612a2/solvers/ipopt_solver_internal.cc#L735-L795

hongkai-dai commented 1 month ago

I would propose not to change IpoptSolver, but instead provide a new constraints that wraps the constraints that can be evaluated in parallel.

class ParallelConstraint {
 public:
  ParallelConstraint(const std::vector<Binding<Constraint>>& constraints);
 private:
  template<typename T, typename S>
  void DoEvalGeneric(const Eigen::Ref<VectorX<T>>& x, VectorX<S>* y) const {
    // Evaluate the constraints in parallel.
  }
};

The advantage of doing it this way is that it gives more control to the user, to adjust which set of constraints should be evaluated in parallel together, and this same parallel evaluation can be used for all solvers (SNOPT/IPOPT/NLOPT).

cc @jwnimmer-tri in case you want to chime in.

jwnimmer-tri commented 1 month ago

I don't think we should make the user group constraints together. They don't actually know very well how to group things in an efficient and correct way. We should just do it for them.

I would say ideally, all choices can be selected independently:

(1) The user builds their MathematicalProgram, adding costs and constraints with no special knowledge of parallelism. For example, nothing in any of the GCSTrajOpt NLP program-building would need to change.

(2) When they call Solve(prog) (with "choose best") or nlopt_solver.Solve(prog) (for a specific solver), they can also pass parallelism=16 to select how many cores to use for solving. Or perhaps more likely, the parallelism= is a CommonSolverOption, instead of a kwarg.

(3) When the solver supports multicore, it automatically selects the NL constraints that support multicore, and evaluates them using a thread pool. (I assume the convex constraints are handled in implicit form by the solver. Anyway, the back-end could do whatever makes the most sense.)

Note that any multicore in solvers will require https://github.com/RobotLocomotion/drake/issues/10320 to be implemented as a prerequisite. We need affirmation that any given constraint is threadsafe, before we run it in multicore.


FYI #19119 is related. That is about using multicore for multiple mathematical programs, broadcasting programs across cores. This one is about using multicore for just one program, broadcasting the constraints (and costs) to multiple cores.

hongkai-dai commented 1 month ago

Thanks Jeremy!

My concern is that our constraint evaluation is not thread safe, they might share some information. For example, we can construct two kinematics constraints that share the same context

PositionConstraint constraint1(plant, frame_A, lower, upper, frame_B, pt, plant_context);
OrientationConstraint constraint2(plant, frame_A, frame_B, tol, plant_context);

if we want to impose both a position and an orientation constraint in an IK problem. Then I don't think we can evaluate these two constraints in parallel as they both would modify the same plant_context.

jwnimmer-tri commented 1 month ago

Right. That's why I said #10320 is a prereq. Any constraint that takes a context pointer without known if its shared, needs to conservatively report that it is not threadsafe.