Tim-Salzmann / l4casadi

Use PyTorch Models with CasADi for data-driven optimization or learning-based optimal control. Supports Acados.

MPC purely on a NN model #37

Closed baryn closed 5 months ago

baryn commented 5 months ago

Hi Tim,

If you allow me, I would like to have your advice on the following idea: I have been trying to implement an MPC based not on a "hybrid" mathematical model, i.e., a residual NN model alongside an analytic one (as is done in the examples in this repo and in the recent literature), but solely on a pure NN model. This model has been trained to "replicate" the analytic model x_dot = f(x, u) that it replaces. I hope this is clear.
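To make the setup concrete, the training I mean is plain supervised regression on samples of the analytic model, roughly like this (a sketch; f_analytic, the domain bounds and the network itself are placeholders):

import numpy as np
import torch

def make_dataset(f_analytic, n_samples, x_lb, x_ub, u_lb, u_ub):
    # Sample (x, u) over the training domain and label each sample with
    # x_dot = f(x, u); the NN is then regressed on these pairs (MSE loss).
    x = np.random.uniform(x_lb, x_ub, size=(n_samples, len(x_lb)))
    u = np.random.uniform(u_lb, u_ub, size=(n_samples, len(u_lb)))
    x_dot = np.array([f_analytic(xi, ui) for xi, ui in zip(x, u)])
    inputs = np.hstack([u, x]).astype(np.float32)  # (u, x) ordering, as in the MPC code below
    return torch.from_numpy(inputs), torch.from_numpy(x_dot.astype(np.float32))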

In my simulation tests, I inject this well-trained NN (very small state deviation from the expected state over hundreds of iterations) into the MPC (which uses, let's say, a prediction horizon of length 30 only). However, the MPC appears unable to find proper solutions. The behavior is chaotic and the dynamics no longer seem correlated (the optimal inputs found by the MPC do not affect the state as they should over time). The model the NN is trained against is a typical bicycle model. I did get good results with a NN trained against a double integrator model, though.

Therefore, I wonder what the reasons could be that the MPC fails with a NN that aims to replicate a more complex and nonlinear model. Could it be due to inaccuracies in the PyTorch gradients? Sensitivity or numerical issues? Do you agree that if we could replace an analytical model by a "perfect" NN, the MPC would behave exactly the same (with the same tuning, etc.)?

Again, this is just to get your opinion on the approach and on whether l4casadi is suited for it. I agree with the general statement that people may prefer analytical models that are accurate and "fast" enough, and replace only hard-to-model effects (such as aerodynamic forces) with NNs.

Regards, Bryan

Tim-Salzmann commented 5 months ago

Hi Bryan,

Is this using Real-Time L4CasADi + Acados or L4CasADi without approximations with CasADi?

Could you maybe post a minimal non-working example?

Thanks Tim

baryn commented 5 months ago

So, I have tried both methods, without success, on my bicycle MPC; the double integrator test only works properly with the l4casadi mode without approximations. I'll provide an illustrative example after the weekend :)

Bryan

baryn commented 5 months ago

Hello Tim,

Here's my test code for the double integrator with a full NN model; it works well there, but I can't make the same setup work with a bicycle model. Moreover, my syntax below is probably incorrect for running the real-time mode: even with the double integrator, the MPC is completely lost in that mode.

import casadi as cs
import numpy as np
import torch
import os
import l4casadi as l4c
from acados_template import AcadosOcpSolver, AcadosOcp, AcadosModel
import time
import scipy.linalg
from FCNetwork import MyNetwork

os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

# this option is used to switch to the real time mode
USE_RT = False

COST = 'LINEAR_LS'  # alternative: 'NONLINEAR_LS' (that path is not fully implemented below)

class DoubleIntegratorWithLearnedDynamics:
    def __init__(self, learned_dyn):
        self.learned_dyn = learned_dyn

    def model(self):
        sx = cs.MX.sym('sx', 1)
        sx_dot = cs.MX.sym('sx_dot', 1)
        sx_dot_dot = cs.MX.sym('sx_dot_dot', 1)
        ax = cs.MX.sym('ax', 1)

        sy = cs.MX.sym('sy', 1)
        sy_dot = cs.MX.sym('sy_dot', 1)
        sy_dot_dot = cs.MX.sym('sy_dot_dot', 1)
        ay = cs.MX.sym('ay', 1)
        x = cs.vertcat(sx, sy, sx_dot, sy_dot)
        x_dot = cs.vertcat(sx_dot, sy_dot, sx_dot_dot, sy_dot_dot)

        u = cs.vertcat(ax,ay)

        if USE_RT:
            p = self.learned_dyn.get_sym_params()
            parameter_values = self.learned_dyn.get_params(np.array([0., 0., 0., 0., 0., 0.]))  # initial linearization point; dim 6 = network input (u, x)

        # the model is a regular single layer of 32 neurons that replicates the analytic model based on the current state and inputs 
        f_expl = self.learned_dyn(cs.vertcat(u,x))

        x_start = np.zeros((4, ))

        # store to struct
        model = cs.types.SimpleNamespace()
        model.x = x
        model.xdot = x_dot
        model.u = u
        model.z = cs.vertcat([])
        model.p = cs.vertcat([])
        if USE_RT:
            model.p = p
            model.parameter_values = parameter_values
        model.f_expl = f_expl
        model.x_start = x_start
        model.constraints = cs.vertcat([])
        model.name = "wr"

        return model

class MPC:
    def __init__(self, model, N, external_shared_lib_dir = "", external_shared_lib_name = ""):
        self.N = N
        self.model = model
        self.external_shared_lib_dir = external_shared_lib_dir
        self.external_shared_lib_name = external_shared_lib_name

    @property
    def solver(self):
        return AcadosOcpSolver(self.ocp())

    def ocp(self):
        model = self.model

        t_horizon = 4.
        N = self.N

        # Get model
        model_ac = self.acados_model(model=model)
        model_ac.p = model.p

        # Dimensions
        nx = model.x.size()[0]
        nu = model.u.size()[0]
        ny = nx + nu

        # Create OCP object to formulate the optimization
        ocp = AcadosOcp()
        ocp.model = model_ac
        ocp.dims.N = N
        ocp.dims.nx = nx
        ocp.dims.nu = nu
        ocp.dims.ny = ny
        ocp.solver_options.tf = t_horizon

        Q = np.diag([1e1, 1e1, 0, 0])
        R = np.diag([0.1, 0.1])
        Qe = Q
        if COST == 'LINEAR_LS':
            # Initialize cost function
            ocp.cost.cost_type = 'LINEAR_LS'
            ocp.cost.cost_type_e = 'LINEAR_LS'

            unscale = N / t_horizon
            ocp.cost.W = unscale * scipy.linalg.block_diag(Q, R)

            Vx = np.zeros((ny, nx))
            Vx[:nx, :nx] = np.eye(nx)
            ocp.cost.Vx = Vx
            Vu = np.zeros((ny, nu))
            # set identity on inputs indices
            Vu[-nu:, -nu:] = np.eye(nu)
            ocp.cost.Vu = Vu
            Vx_e = np.eye(nx)
            ocp.cost.Vx_e = Vx_e

            l4c_y_expr = None  # would hold the NONLINEAR_LS cost expression; not implemented in this example

        ocp.cost.W_e = Qe / unscale
        ocp.cost.yref_e = np.array([0.,0.,0.,0.])

        # Initial reference trajectory (will be overwritten)
        ocp.cost.yref = np.zeros(nx+nu)

        # Initial state (will be overwritten)
        ocp.constraints.x0 = model.x_start

        # Set constraints
        a_max = 1
        ocp.constraints.lbu = np.array([-a_max,-a_max])
        ocp.constraints.ubu = np.array([a_max,a_max])
        ocp.constraints.idxbu = np.array([0,1])

        # Solver options
        ocp.solver_options.qp_solver = 'FULL_CONDENSING_HPIPM'
        ocp.solver_options.hessian_approx = 'GAUSS_NEWTON'
        ocp.solver_options.integrator_type = 'ERK'
        ocp.solver_options.nlp_solver_type = 'SQP_RTI'
        ocp.solver_options.model_external_shared_lib_dir = self.external_shared_lib_dir
        if COST == 'LINEAR_LS':
            ocp.solver_options.model_external_shared_lib_name = self.external_shared_lib_name
        else:
            ocp.solver_options.model_external_shared_lib_name = self.external_shared_lib_name + ' -l' + l4c_y_expr.name

        if USE_RT:
            ocp.parameter_values = model.parameter_values
        return ocp

    def acados_model(self, model):
        model_ac = AcadosModel()
        model_ac.f_impl_expr = model.xdot - model.f_expl
        model_ac.f_expl_expr = model.f_expl
        model_ac.x = model.x
        model_ac.xdot = model.xdot
        model_ac.u = model.u
        model_ac.name = model.name
        return model_ac

def run():
    N = 15

    model = MyNetwork()
    model.to("cpu")
    model.load_state_dict(torch.load("double_integrator_NN.pth", map_location=torch.device("cpu")))
    model.eval()

    if USE_RT:
        learned_dyn_model = l4c.realtime.RealTimeL4CasADi(model, approximation_order=2)
    else:
        learned_dyn_model = l4c.L4CasADi(model, model_expects_batch_dim=True, name='learned_dyn')

    model = DoubleIntegratorWithLearnedDynamics(learned_dyn_model)
    if USE_RT:
        solver = MPC(model=model.model(), N=N).solver
    else:
        solver = MPC(model=model.model(), N=N,
           external_shared_lib_dir=learned_dyn_model.shared_lib_dir,
           external_shared_lib_name=learned_dyn_model.name).solver

    # initial state
    xt = np.array([0., 0., 0., 0.])
    opt_times = []

    # goal state
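    # yref layout: [sx, sy, sx_dot, sy_dot, ax, ay] -> reach (1, 1) with zero velocity and zero input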
    yref = np.array([1.,1.,0.,0.,0.,0.])
    for j in range(N):
        solver.set(j, "yref", yref)

    # closed loop simulation
    for _ in range(500):
        now = time.time()
        solver.set(0, "lbx", xt)
        solver.set(0, "ubx", xt)

        # NOTE: in RT mode the Taylor-approximation parameters are set once at
        # build time and never updated here; presumably they would need to be
        # re-set around the current trajectory at every iteration, which may be
        # why the RT variant is completely lost.
        solver.solve()
        opt_times.append(time.time() - now)

        # no separate plant simulator: the model's own one-step prediction is
        # taken as the next closed-loop state
        xt = solver.get(1, "x")

    print(f'Mean iteration time: {1000 * np.mean(opt_times):.1f}ms -- {1 / np.mean(opt_times):.0f}Hz')

if __name__ == '__main__':
    run()
Tim-Salzmann commented 5 months ago

Hi Bryan,

This is hard for me to follow as it is not a minimal example (e.g., I cannot run the code because I do not have the MyNetwork code).

For example, if your MyNetwork has ReLU activations or other (derivative) discontinuities, this will obviously make it super hard for the optimization to converge.

Please provide the minimal code of a non-working example (including all code and possibly trained checkpoints).

Thanks Tim

baryn commented 5 months ago

Tim,

Yes, of course, I didn't include all of it in order to focus on the syntax. I thought about the ReLU derivative discontinuity, which is why I used SiLU instead. Other discontinuities may be present elsewhere; that's what I suspect too. In any case, I'll try to investigate. You must be busy too, and I am already satisfied to have you share your feeling about the approach. Cheers, Bryan
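PS: since you mention it, the network is essentially a single hidden layer of 32 neurons with SiLU activations; a sketch of what it looks like (the actual FCNetwork code is not included here):

import torch.nn as nn

class MyNetwork(nn.Module):
    # Hypothetical stand-in for FCNetwork.MyNetwork: maps (u, x) in R^6
    # to x_dot in R^4 for the double integrator case.
    def __init__(self, n_in=6, n_hidden=32, n_out=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden),
            nn.SiLU(),  # smooth activation, chosen over ReLU to avoid the derivative discontinuity
            nn.Linear(n_hidden, n_out),
        )

    def forward(self, x):
        return self.net(x)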

Tim-Salzmann commented 5 months ago

Hi Bryan,

another thing to think about is that during the optimization the model can easily be queried in input regions you did not train it on. E.g., you trained your bicycle model for velocities of up to 10 m/s, but the optimization could (temporarily) query the model at velocities of 100 m/s (out of distribution), where the response and, more noticeably, the gradients will be ill-defined. Similarly, depending on the overfitting level of your model, it could be accurate for the velocities it was trained on, e.g., 2 m/s and 3 m/s, while the output and gradient could still be wrong for intermediate values, e.g., 2.5 m/s.
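If you want a quick way to spot this, one option (a sketch, not an L4CasADi feature; the training-domain bounds x_lb/x_ub/u_lb/u_ub are placeholders) is to check the open-loop plan the solver converged to after each solve:

import numpy as np

def check_in_distribution(solver, N, x_lb, x_ub, u_lb, u_ub):
    # Does the solver's open-loop plan stay inside the region the network was
    # trained on? This only sees the converged iterate, not every intermediate
    # SQP query, but out-of-range values here are already a red flag.
    x_traj = np.array([solver.get(i, "x") for i in range(N + 1)])
    u_traj = np.array([solver.get(i, "u") for i in range(N)])
    if np.any(x_traj < x_lb) or np.any(x_traj > x_ub) \
            or np.any(u_traj < u_lb) or np.any(u_traj > u_ub):
        print("Warning: predicted trajectory leaves the training domain")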

Best Tim

baryn commented 5 months ago

Hello Tim,

Yes indeed, that's a relevant point to keep in mind. I do not know if it is possible to check every input queried within the MPC iterations. I set input constraints of course, but that does not prevent the issue from appearing. I chose the state training domains to be larger than the operating domains, so the NN has to fit data belonging to a sub-domain. I am going back to the more conventional approach where only the model residuals are predicted by the NN and most of the model is described by analytic equations. It is a safer approach, especially regarding the gradient descent and the sensitivities of the MPC (sketch below).
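For illustration, the hybrid formulation I mean would look roughly like this (a sketch on the double integrator; the residual network is a placeholder that would be trained on f_true - f_analytic):

import casadi as cs
import torch
import l4casadi as l4c

# Hypothetical small residual network: maps (u, x) in R^6 to a correction on x_dot in R^4
residual_net = torch.nn.Sequential(
    torch.nn.Linear(6, 32), torch.nn.SiLU(), torch.nn.Linear(32, 4))
learned_residual = l4c.L4CasADi(residual_net, model_expects_batch_dim=True, name='residual_dyn')

x = cs.MX.sym('x', 4)  # [sx, sy, sx_dot, sy_dot]
u = cs.MX.sym('u', 2)  # [ax, ay]

# Known analytic part (here the double integrator); the NN only learns what is missing
f_analytic = cs.vertcat(x[2], x[3], u[0], u[1])
f_expl = f_analytic + learned_residual(cs.vertcat(u, x))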

Best, Bryan