StanfordASL / neural-network-lyapunov

Synthesizing neural-network Lyapunov functions (and controllers) as stability certificates.
MIT License

Add PoleReluSystem. #401

Closed hongkai-dai closed 2 years ago

hongkai-dai commented 2 years ago

The system approximates the dynamics of pole/end-effector system using ReLU networks



lujieyang commented 2 years ago

neural_network_lyapunov/examples/pole/pole_relu_system.py, line 17 at r1 (raw file):

    [ẋ_A[n+1] ẏ_A[n+1] ż_A[n+1] ẋ_AB[n+1] ẏ_AB[n+1]] =
    [ẋ_A[n] ẏ_A[n] ż_A[n] ẋ_AB[n] ẏ_AB[n]] +
    ϕ(x_AB[n], y_AB[n], ẋ_AB[n], ẏ_AB[n], u[n]) -

\phi absorbs dt, so we would have to retrain \phi for each different dt.
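
As a minimal sketch of what "absorbing dt" means in practice (the network sizes, the dimension of u, and the velocity indices below are illustrative assumptions, not this PR's code): the training target for \phi is the one-step change in the velocities sampled at a fixed dt, so dt is baked into the fitted weights.

```python
import torch

dt = 0.01  # sampling period baked into the training targets

# Illustrative ReLU network (all sizes are assumptions): it maps
# (x_AB, y_AB, xdot_AB, ydot_AB, u) to the 5 velocity increments.
u_dim = 3
phi = torch.nn.Sequential(
    torch.nn.Linear(4 + u_dim, 16), torch.nn.ReLU(), torch.nn.Linear(16, 5))


def one_step_velocity_target(simulate, x, u, vel=slice(5, 10)):
    """Training target for phi: the change in the 5 velocities over one step.

    The target already contains the factor dt (and any higher-order
    integration terms), so a phi fitted to these targets is only valid for
    this dt; changing dt means regenerating the targets and retraining.
    """
    x_next = simulate(x, u, dt)  # e.g. an RK4 rollout of the true dynamics
    return x_next[vel] - x[vel]
```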

lujieyang commented 2 years ago

neural_network_lyapunov/examples/pole/pole_relu_system.py, line 133 at r1 (raw file):

                               self.u_equilibrium))),
            sense=gurobipy.GRB.EQUAL,
            name="quadrotor_dynamics_output")

This should be named `pole_dynamics_output` (the current name looks left over from the quadrotor example).

lujieyang commented 2 years ago

neural_network_lyapunov/examples/pole/pole_relu_system.py, line 17 at r1 (raw file):

Previously, hongkai-dai (Hongkai Dai) wrote…
Right, and that is intentional. I think you are suggesting representing the continuous dynamics xdot = f(x, u) with a network phi. But what we want is the discrete dynamics x[n+1]. If we used first-order explicit Euler integration, x[n+1] = x[n] + xdot[n] * dt, then it is true we would only need to learn xdot = f(x, u) and then multiply by dt. But first-order explicit Euler integration has poor accuracy, so it is better to use a higher-order integration scheme. Then it is not sufficient to just learn xdot = f(x, u); instead we learn the discrete dynamics x[n+1] = F(x[n], u[n]) directly, which handles the time integration inside the dynamics update F(x[n], u[n]).

Is there an advantage to learning the residual dynamics rather than the entire forward dynamics?
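
For reference, a minimal sketch of the options being discussed (the function names, shapes, and the equilibrium-offset term are illustrative assumptions, not this PR's API). One commonly cited advantage of the residual form is that the network only has to represent a per-step increment, and subtracting its value at the equilibrium makes the equilibrium an exact fixed point by construction.

```python
import torch


def euler_step(phi_cont, x, u, dt):
    # Option A: learn the continuous dynamics xdot = phi_cont(x, u) and
    # integrate with first-order explicit Euler. One network works for any
    # dt, but the per-step integration error is O(dt^2).
    return x + phi_cont(torch.cat((x, u))) * dt


def residual_step(phi, x, u, x_eq, u_eq):
    # Option B: learn the discrete-time residual, so the update is
    # x[n+1] = x[n] + phi(x[n], u[n]) - phi(x_eq, u_eq). The time
    # integration (of whatever order generated the data) is absorbed into
    # phi, and subtracting the equilibrium term keeps (x_eq, u_eq) an exact
    # fixed point of the learned dynamics.
    return x + phi(torch.cat((x, u))) - phi(torch.cat((x_eq, u_eq)))


def full_forward_step(F, x, u):
    # Option C: learn the full forward map x[n+1] = F(x[n], u[n]). There is
    # no structural guarantee that the equilibrium stays a fixed point
    # unless that is enforced during training.
    return F(torch.cat((x, u)))
```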

lujieyang commented 2 years ago

neural_network_lyapunov/examples/pole/test/test_pole_relu_system.py, line 54 at r1 (raw file):

            x_next_i = dut.step_forward(x_samples[i], u_samples[i])
            np.testing.assert_allclose(x_next[i].detach().numpy(),
                                       x_next_i.detach().numpy())

I'm not quite seeing the purpose of checking that these are the same. They seem to go through the same function, one with a batch and the other with a single data point.
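
For context, a minimal sketch of what the quoted check amounts to (the helper name here is hypothetical; `dut`, `x_samples`, and `u_samples` follow the quoted test). It only verifies that the batched and per-sample code paths of `step_forward` agree:

```python
import numpy as np


def check_batch_matches_single(dut, x_samples, u_samples):
    # Batched call: one forward pass over all samples at once.
    x_next_batch = dut.step_forward(x_samples, u_samples)
    for i in range(x_samples.shape[0]):
        # Per-sample call: the same method, one (x, u) pair at a time.
        x_next_i = dut.step_forward(x_samples[i], u_samples[i])
        # This certifies that the batched and per-sample code paths agree
        # (e.g. broadcasting/reshaping); it does not validate the dynamics
        # against an independent reference.
        np.testing.assert_allclose(x_next_batch[i].detach().numpy(),
                                   x_next_i.detach().numpy())
```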