SciML / ModelingToolkit.jl

An acausal modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system providing integrated symbolics for physics-informed machine learning and automated transformations of differential equations.
https://mtk.sciml.ai/dev/

Symbolic Representation of Optimal Control #1089


ChrisRackauckas commented 3 years ago

With ControlSystem we were attempting to build the stack towards doing nonlinear optimal control, but we found that having a separate ControlSystem type is the wrong way to go about it, because it would need to recreate everything in ODESystem. So the question instead is: how do we do optimal control using the ODESystem of https://github.com/SciML/ModelingToolkit.jl/pull/1059, which can carry potential control variables?

The idea is to have transformation functions. The first would be something like direct_optimal_control(loss, sys, u0, tspan, p, ctrl; fit_parameters = false), which would transform an ODESystem into an OptimizationProblem. It would do so using the direct method, a la https://diffeqflux.sciml.ai/dev/examples/optimal_control/: generate the ODEProblem with fittable parameters (https://github.com/SciML/ModelingToolkit.jl/issues/1088) and write a loss function on the solution. It would be nice for the loss to be written symbolically and the full loss function to be generated from it, which might need some improvements to OptimizationProblem. fit_parameters would then toggle whether p.p is fit simultaneously with the control values, and the representation of the control (neural network, piecewise constant, etc.) would simply come from how the user passes a parameterized control function.
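
As a concrete sketch of what such a transformation might lower to, written against existing SciML APIs (direct_optimal_control itself is hypothetical here, and a single tunable parameter u stands in for a parameterized controller):

using ModelingToolkit, OrdinaryDiffEq, Optimization, OptimizationOptimJL

@parameters t u
@variables x(t)
D = Differential(t)
@named sys = ODESystem([D(x) ~ -x + u], t)
prob = ODEProblem(structural_simplify(sys), [x => 1.0], (0.0, 1.0), [u => 0.0])

# Direct method: simulate, evaluate the loss on the solution, optimize u.
function loss(p, _)
    sol = solve(remake(prob, p = p), Tsit5())
    sum(abs2, sol[x])  # e.g. drive the state to zero
end

optf = OptimizationFunction(loss, Optimization.AutoForwardDiff())
res = solve(OptimizationProblem(optf, [0.0]), BFGS())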

Then we can have discretized_optimal_control, which would be like the current runge_kutta_discretize: it takes the problem and fully discretizes it into a huge constrained nonlinear optimization problem, where u and ctrl are then indexable from the solution because they are symbolic arrays.
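
For intuition, the kind of structure such a discretization would produce, sketched with explicit Euler over N steps (runge_kutta_discretize would substitute an RK tableau) and hypothetical scalar dynamics f:

using Symbolics

N = 10; h = 0.1
@variables x[1:N] u[1:N]   # states and controls as symbolic arrays, indexable later
f(x, u) = -x + u           # hypothetical dynamics
defects = [x[i+1] ~ x[i] + h*f(x[i], u[i]) for i in 1:N-1]  # equality constraints
cost = sum(x[i]^2 + u[i]^2 for i in 1:N)                    # discretized running cost
# defects and cost would then be assembled into one constrained OptimizationProblem.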

This then opens up indirect methods, which take an ODESystem and a symbolic loss and generate a BVP for Pontryagin's maximum principle, etc.
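
For reference, the BVP an indirect method would generate comes from the first-order conditions of Pontryagin's principle. With dynamics $\dot x = f(x, u, t)$, running cost $L(x, u, t)$, and terminal cost $\phi(x(T))$, define the Hamiltonian $H = L + \lambda^\top f$. The conditions

$$\dot x = \frac{\partial H}{\partial \lambda}, \qquad \dot\lambda = -\frac{\partial H}{\partial x}, \qquad u^*(t) = \arg\min_u H(x, \lambda, u, t)$$

with boundary conditions $x(0) = x_0$ and $\lambda(T) = \partial\phi/\partial x\,(x(T))$ form a two-point BVP in $(x, \lambda)$, which is what the generated BVProblem would encode.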

https://github.com/SciML/ModelingToolkit.jl/pull/1059 then makes the forward simulation and playing with the learned controllers trivial.

baggepinnen commented 2 years ago

I'm dropping some thoughts here. MPC is one of the more demanding applications of optimal control: open-loop optimal-control problems have to be solved repeatedly, and the optimization takes place "in the loop", i.e., the learned controller is continuously used in the simulation. One way I'm envisioning such a controller being used with MTK is something like the following:

struct MPCController <: DiscreteSystem # MPC is inherently discrete in nature
    oc_problem     # open-loop optimal-control problem, re-solved at every sample
    state_observer # estimates the state that the OC problem is solved from
    input          # measurement signal coming from the plant
    output         # control signal going to the plant
end

The closed loop is formed by connecting the MPC controller to the controlled system P, something like

  ┌───────┐     ┌───────┐
  │       │     │       │
┌─►  MPC  ├─────►   P   ├─┬───►
│ │       │     │       │ │
│ └───────┘     └───────┘ │
│                         │
└─────────────────────────┘
@named closed_loop = ODESystem([MPC.input ~ P.output, P.input ~ MPC.output], t, systems = [MPC, P])

MPC is a DiscreteSystem and at each discrete time point, it updates the state observer and solves the optimal-control problem.

This view admits flexibility in how the OC-problem is specified and solved. The dual to MPC, moving horizon estimation (MHE), should ideally fit equally well into the tooling that is developed for MPC.

On a lower level, both direct_optimal_control and discretized_optimal_control result in some function, either u(t) or u(x, t), where x is typically the state estimate produced by a state observer.

With the proper input/output semantics in place, the open-loop case would be easily simulated by connecting the optimized function u(t) to the input of P. Perhaps this is also how you specify the problem? For example

 ┌───────┐     ┌───────┐
 │       │     │       │
 │  fun  ├─────►   P   │
 │       │     │       │
 └───────┘     └───────┘
@named open_loop = ODESystem([parameterized_function.output ~ P.input], t, systems = [parameterized_function, P])
loss = loss_function(open_loop)          # sketch: build a loss from the open-loop system
optimize(loss, parameterized_function.p) # sketch: optimize the controller parameters

This would move the problem of specifying tunable control variables from P to parameterized_function, and would make this problem specification very similar to the closed-loop one. The high-level call to optimize here would perform the suggested transformation of the ODESystem into an OptimizationProblem. The important aspect is that P shouldn't care whether it's controlled in open or closed loop; hence the tunable parameters are moved out of P into an abstract "controller" that is connected to P. This controller can be either a tunable function or another AbstractSystem.

The case u(x, t) is more akin to the MPC case above, where u(x, t) is really an AbstractSystem in its own right (instead of a parameterized_function), discrete or continuous. Here too, the optimization problem is expressed not by specifying control parameters of P, but system parameters of the connected controller, i.e., by specifying which parameters of closed_loop are subject to optimization.

baggepinnen commented 2 years ago

Some relevant references


"Optimica—An Extension of Modelica Supporting Dynamic Optimization" https://lucris.lub.lu.se/ws/portalfiles/portal/6062979/8253707.pdf

In this paper, an extension of Modelica, entitled Optimica, is presented. Optimica extends Modelica with language constructs that enable formulation of dynamic optimization problems based on Modelica models.

The Modelica extension they introduce appears to rely on extending model classes in an optimization class; parameters of the base model class can then be declared free, i.e., marked as parameters to be optimized. Something similar is easily achieved by just providing a list of parameters to optimize, similar to how initial conditions are provided. The example they give is

optimization DIMinTime
    DoubleIntegrator di(u(free=true, initialGuess=0.0));
end DIMinTime;

where DoubleIntegrator is of type model and has an input Real u. A key aspect here is that everything related to the optimization is specified outside of the model specification.
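
For comparison, ModelingToolkit's symbolic metadata can already express the free declaration outside the dynamics; a small sketch using the tunable and bounds metadata together with tunable_parameters (all of which exist in current MTK):

using ModelingToolkit

@parameters t
@variables x(t)
@parameters u [tunable = true, bounds = (-1.0, 1.0)]  # analogous to u(free=true)
D = Differential(t)
@named di = ODESystem([D(x) ~ u], t)  # single-integrator stand-in for DoubleIntegrator
tunable_parameters(di)                # returns [u]: the parameters declared free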

Another interesting feature is the ability to access the value of a variable at a particular time instant. This is used to state objectives: optimal-control problems are specified in continuous (infinite-dimensional) time, e.g., as

der(cost) = 1             // cost = ∫ 1 dt
objective = cost(t_final) // objective is the cost at the final time (final time is itself an optimization variable)
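
The same trick maps directly onto MTK by augmenting the system with a cost state whose terminal value is the objective (a sketch; the minimum-time objective itself would live in the optimization layer discussed above):

using ModelingToolkit

@parameters t
@variables cost(t)
D = Differential(t)
aug_eq = D(cost) ~ 1  # cost(T) = ∫ 1 dt = T, so minimizing terminal cost is minimum time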

The discretization of the continuous problem is not part of the Optimica language extension; rather, it is treated as solver options or "extra information".


"Dynamic Optimization in JModelica.org" https://lup.lub.lu.se/search/publication/561dd097-b14b-403d-97dc-b28852dd545b

They describe the toolchain Modelica/Optimica -> CasADi -> IPOPT.

This paper also goes into a bit more detail on accessing simulation variables at discrete time points.