boisgera / control-engineering-with-python

Control Engineering with Python
Creative Commons Attribution 4.0 International
65 stars 33 forks source link

Optimal Control, Inverted Pendulum #15

Open boisgera opened 3 years ago

boisgera commented 3 years ago

Consider the linearized dynamics of the inverted pendulum, described by the matrices A and B:

g=9.81
l=1
m=1
b=0
J = m * l * l

A = np.array([[0.0, 1.0], [g/l, -b/J]])
B = np.array([[0.0],[1/J]])

Select reference parameters Q and R, then solve the Riccati equation to get the optimal gain:

Q = array([[1.0, 0], [0, 1.0]])
R = array([[1.0]])
Pi = solve_continuous_are(A, B, Q, R)
K = inv(R) @ B.T @ Pi

We get K == [[19.67083668 6.35150953]]. At first sight, these coefficients appear to be almost insensitive with respect to Q and R, which is puzzling to say the least ; for example, with

Q = array([[1e-15, 0], [0, 1e-15]])
R = array([[1.0]])

we get K == [[19.62 6.26418391]]. However, this is true only for decreasing Qs (and/or increasing Rs) ; if instead we increase Q to get a faster convergence of the closed-loop system, say:

Q = array([[1e15, 0], [0, 1e15]])
R = array([[1.0]])

we get K: [[31622786.41168933 31622777.60168409]]. Note however that its probably unwise to have both Q[0,0] and Q[1,1] pushed to infinity at the same rate : since a large Q[1, 1] prevents the angle velocity to have large values, such parameters won't allow a really fast convergence of the solution towards the equilibrium. And indeed, for the choice above, the eigenvalues of A - B @ K are [-1.00000000e+00 -3.16227766e+07], so the time constant is 1, despite the large coefficients. Instead, we can pick a large Q[0,0], but not a large Q[1, 1] :

Q = array([[1e15, 0], [0, 1.0]])
R = array([[1.0]])

We get K: [[3.16227864e+07 7.95270858e+03]] and the closed-loop eigenvalues [-3976.35429204+3976.35299563j -3976.35429204-3976.35299563j] (time constant: ~ 0.00025 seconds !!! Obviously far too fast, but you get the idea).

Complements about the corner case

Q ==[[0, 0], [0, 0]] does not satisfy the assumptions that we have made in the lectures: Q is not positive definite ; but you can check that there is an optimal solution to the problem anyway : it is u =0 ; it corresponds to Pi = 0 and thus K = 0. But the numerical approach does not give this answer. Instead we have K: [[19.62 6.26418391]]. I suspect that the reason is the following : here since Q is not definite positive the optimal solution does not asymptotically stabilizes the system ; but the numerical package always performs its search among the solutions that stabilizes the system. That would be consistent with the information that scipy.linalg.solve_continuous_are is solved by hamiltonian methods (see for example Ricatti Equations and Their Solution, theorem 14.5 : under mild assumptions -- that hold in our context -- there is a unique solution to the Ricatti equation that results in an asymptotically stable dynamics ; and here the scipy solution generates the eigenvalues [-3.13209195+5.3530231e-08j -3.13209195-5.3530231e-08j] so we do have this property).

boisgera commented 3 years ago

Specify in the slides that under the assumptions made, there is a single solution Pi to the Riccati equation that asymptotically stabilizes the closed-loop and that this one solves the optimal control problem. And that the scipy function does indeed compute this one.