Performance downgrade in solving optimal control problems(OCP)

johnysyumich commented 9 months ago

Hi developers,

We are currently "new" users to JuMP but we do have experience in using JuMP since we were users of NLOptControl: https://github.com/JuliaMPC/NLOptControl.jl. This package relies very heavily on JuMP and stopped upgrading about three years ago. We are thinking of upgrading it to let it compile with the current version of JuMP. The last version of NLOptControl uses JuMP v0.18.6 which is the version right before the change from MPB to MOI. However, we met some performance issues when we were upgrading it, and here is one of the dummy examples we created to benchmark with the old package.

Here is the code we run: example.zip

Basically, it is controlling a modeled vehicle to change to another lane.

On my computer (I7-12700K), the algorithm can solve the problem with around 130ms. However, with the old implementation (NLOptControl & JuMP v0.18.6) we can achieve around 45ms per optimization on the same computer. With our current modifications, we cannot achieve the same performance as we could before. It would be very helpful if you would provide any insights about the difference between performance and how we can modify the current dummy code to work as fast as the previous version.

Thanks for your help in advance!

odow commented 9 months ago

I can't run your code because I don't know what PCY1 is.

As a general comment, JuMP v0.18.6 was released over four years ago. In the last few years we have made big improvements to our nonlinear code, so that the expression and eval stuff is not needed. See https://jump.dev/JuMP.jl/stable/manual/nonlinear/ and tutorials like https://jump.dev/JuMP.jl/stable/tutorials/nonlinear/rocket_control/. I should write an explicit MPC tutorial (x-ref https://github.com/jump-dev/JuMP.jl/issues/2348#issuecomment-1839210372).

You can ask for support and suggested code improvements on http://jump.dev/forum. We try to keep the GitHub issues for bug reports.

Here's how I'd write your code. I can't test it, but it should point you in the right direction:

using JuMP, Ipopt
# TODO start
PCY1 = 1.0
PDY1 = 1.0
PDY2 = 1.0
PEY1 = 1.0
PEY2 = 1.0
PEY3 = 1.0
PKY1 = 1.0
PKY2 = 1.0
PHY1 = 1.0
PHY2 = 1.0
PVY1 = 1.0
PVY2 = 1.0
FZ0 = 1.0
EP_SMALL = 1e-3
# TODO end
KZX     = 289.5
KZYR    = 427.0
KZYF    = 277.2
FzF0    = 2670.0
FzR0    = 2670.0
PC1     = PCY1
PD1     = PDY1 - PDY2
PD2     = PDY2/FZ0
PE1     = PEY1 - PEY2
PE2     = PEY2/FZ0
PE3     = PEY3
PK1     = PKY1*FZ0
PK2     = 1/(PKY2*FZ0)
PH1     = PHY1 - PHY2
PH2     = PHY2/FZ0
PV1     = PVY1 - PVY2
PV2     = PVY2/FZ0
g       = 9.81
T       = 0.002
la      = 1.5521
lb      = 2.715 - la  #1.1629
h_cg    = 0.596
m       = 1.269e+03
Izz     = 1.620e+03
FzF0    = 2670.0
FzR0    = 2670.0
L = 3.6
TF = 4
num_states = 8
num_controls = 2
state_name = [:xpos, :ypos, :v, :r, :psi, :sa, :ux, :ax]
control_name = [:sr, :jx]
XL = [-10, -3, -3, -pi, -pi, -pi/4, 0.01, -3]
XU = [200, 4, 3, pi, pi, pi/4, 10, 3]
X0 = [0, 0, 0, 0, 0.001, 0, 2, 0]
CL = [-pi/4 * 5, -2.6]
CU = [pi/4 * 5, 2.6]
n = 25
Δt = TF / n

function ThreeDOFBicycle_expr(x::Vector, u::Vector)
    xpos, ypos, v, r, psi, sa, ux, ax = x
    sr, jx = u
    FYF = (
        PD2 * (FzF0 - (ax - v*r)*KZX)^2 +
        PD1 * (FzF0 - (ax - v*r)*KZX)
    ) * sin(
        PC1 * atan(
            (((PK1*sin(2*atan(PK2*(FzF0 - (ax - v*r)*KZX))))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)) + ((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX))^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*(FzF0 - (ax - v*r)*KZX) + PH1)) - ((PE2*(FzF0 - (ax - v*r)*KZX) + PE1)*(1 - PE3)*(((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*(FzF0 - (ax - v*r)*KZX) + PH1))/((((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*(FzF0 - (ax - v*r)*KZX) + PH1)^2 + EP_SMALL^2)^(0.5)))*((((PK1*sin(2*atan(PK2*(FzF0 - (ax - v*r)*KZX))))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)) + ((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX))^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*(FzF0 - (ax - v*r)*KZX) + PH1)) - atan((((PK1*sin(2*atan(PK2*(FzF0 - (ax - v*r)*KZX))))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)) + ((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX)))/(((PD2*PC1*(FzF0 - (ax - v*r)*KZX)^2 + PD1*PC1*(FzF0 - (ax - v*r)*KZX))^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*(FzF0 - (ax - v*r)*KZX) + PH1)))))) + (PV2*(FzF0 - (ax - v*r)*KZX)^2 + PV1*(FzF0 - (ax - v*r)*KZX))
    FYR = (
        PD2 * (FzR0 + (ax - v*r)*KZX)^2 +
        PD1 * (FzR0 + (ax - v*r)*KZX)
    ) * sin(PC1*atan((((PK1*sin(2*atan(PK2*(FzR0 + (ax - v*r)*KZX))))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)) + ((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX))^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*(FzR0 + (ax - v*r)*KZX) + PH1)) - ((PE2*(FzR0 + (ax - v*r)*KZX) + PE1)*(1 - PE3*(((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*(FzR0 + (ax - v*r)*KZX) + PH1))/((((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*(FzR0 + (ax - v*r)*KZX) + PH1)^2 + EP_SMALL^2)^(0.5))))*((((PK1*sin(2*atan(PK2*(FzR0 + (ax - v*r)*KZX))))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)) + ((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX))^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*(FzR0 + (ax - v*r)*KZX) + PH1)) - atan((((PK1*sin(2*atan(PK2*(FzR0 + (ax - v*r)*KZX))))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)) + ((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX)))/(((PD2*PC1*(FzR0 + (ax - v*r)*KZX)^2 + PD1*PC1*(FzR0 + (ax - v*r)*KZX))^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*(FzR0 + (ax - v*r)*KZX) + PH1)))))) + (PV2*(FzR0 + (ax - v*r)*KZX)^2 + PV1*(FzR0 + (ax - v*r)*KZX))
    return [
        ux * cos(psi) - (v + la * r) * sin(psi),
        ux * sin(psi) + (v + la * r) * cos(psi),
        (FYF + FYR) / m - r * ux,
        (la * FYF - lb * FYR) / Izz,
        r,
        sr,
        ax,
        jx,
    ]
end

simp_veh = Model(Ipopt.Optimizer)
set_silent(simp_veh)
@variables(simp_veh, begin
    XL[i] <= x[j in 1:n, i in 1:num_states] <= XU[i]
    CL[i] <= u[j in 1:n, i in 1:num_controls] <= CU[i]
end)
fix.(x[1, :], X0; force = true)
dx = [ThreeDOFBicycle_expr(x[j, :], u[j, :]) for j in 1:n]
@constraint(
    simp_veh,
    [st in 1:num_states, j in 1:n-1],
    x[j+1,st] == x[j,st] + Δt * dx[j+1][st],
)
@objective(
    simp_veh,
    Min,
    5*(x[n,1] - 100)^2 + sum(Δt * (x[i,2] - 3.6)^2 + 10 * Δt * u[i,1]^2 + Δt * x[i,3]^2 for i in 1:n)^2,
)
println("Solving...")
optimize!(simp_veh)
println("Min Obj: ", solve_time(simp_veh))
for i in 1:1:10
    optimize!(simp_veh)
    println("Min Obj: ", solve_time(simp_veh))
end

Note that the final for-loop isn't going to be interesting as a comparison to v0.18.6, because Ipopt will now detect that the problem structure hasn't changed and so it won't re-build the problem.

If you can provide a reproducible example, there is still plenty of room for improvement, because you use a lot of non-constant global variables.

johnysyumich commented 9 months ago

Thanks a lot for your prompt help! So sorry about the code not running. Apparently, I used the wrong parameter file for that. Here is the new code: example.zip

I added a new file named: SuggestedChange.jl in the zip file, that corresponds to the file you wrote. Actually, this is similar to where we started, we did follow the optimal control code used in the space shuttle case in the documentation to create the first working example of this code. We noticed no big computational speed difference between this version and the version we created before. The code you wrote also performs similarly to our versions of code as well.

I deleted the for loop at the end. Thanks for the reminder.

Do you think I should close this issue and submitted into the forum instead?

Thanks!

odow commented 9 months ago

So almost the entirety of the time is spent inside Ipopt.

julia> include("/Users/oscar/Downloads/example-3/SuggestedChange.jl");

julia> @time optimize!(simp_veh)
This is Ipopt version 3.14.13, running with linear solver MUMPS 5.6.1.

Number of nonzeros in equality constraint Jacobian...:      856
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:     3806

Total number of variables............................:      242
                     variables with only lower bounds:        0
                variables with lower and upper bounds:      242
                     variables with only upper bounds:        0
Total number of equality constraints.................:      192
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  5.2687386e+04 1.98e+00 1.27e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  5.2142552e+04 1.52e+00 5.78e+01  -1.0 1.11e+01    -  6.17e-03 2.32e-01f  1
   2  4.4809330e+04 7.84e-01 1.28e+02  -1.0 1.32e+01    -  2.74e-02 4.85e-01f  1
   3  4.4825324e+04 7.63e-01 2.33e+02  -1.0 1.16e+01   2.0 1.96e-01 2.62e-02h  1
   4  4.4887259e+04 7.51e-01 5.22e+02  -1.0 4.76e+01   1.5 1.68e-02 1.56e-02h  1
   5  4.5037374e+04 7.32e-01 5.00e+02  -1.0 1.08e+01   1.9 3.07e-02 2.59e-02h  1
   6  4.5145788e+04 7.21e-01 4.97e+02  -1.0 5.50e+00   2.4 5.54e-02 1.55e-02h  1
   7  4.5322464e+04 7.03e-01 5.31e+02  -1.0 4.62e+00   2.8 8.29e-02 2.50e-02h  1
   8  4.5902522e+04 6.50e-01 3.77e+03  -1.0 3.93e+00   3.2 2.90e-02 7.47e-02h  1
   9  4.6165142e+04 6.13e-01 4.63e+03  -1.0 2.58e+00   3.7 7.68e-04 5.73e-02h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  5.1382069e+04 6.02e-01 6.20e+05  -1.0 1.67e+00   4.1 4.59e-03 6.10e-01h  1
  11  5.1381111e+04 6.01e-01 6.19e+05  -1.0 1.08e+00   4.5 3.46e-02 1.98e-03h  1
  12  5.1354339e+04 5.93e-01 6.11e+05  -1.0 1.24e+00   4.9 3.12e-01 1.25e-02f  1
  13  5.1335704e+04 5.86e-01 6.03e+05  -1.0 1.47e+00   4.5 3.65e-02 1.30e-02f  1
  14  5.1333051e+04 5.85e-01 9.33e+05  -1.0 9.73e-01   4.9 8.17e-01 9.61e-04h  1
  15  5.1148846e+04 4.90e-01 8.25e+05  -1.0 2.25e+00   4.4 5.03e-02 1.52e-01f  1
  16  5.1142248e+04 4.87e-01 7.54e+05  -1.0 6.88e-01   4.8 4.76e-01 4.98e-03f  1
  17  5.1126006e+04 4.77e-01 5.61e+05  -1.0 2.02e+00   4.4 5.11e-01 2.04e-02f  1
  18  4.9890876e+04 5.86e-01 2.32e+05  -1.0 7.44e-01   4.8 7.25e-01 8.07e-01f  1
  19  4.9925591e+04 5.68e-01 2.70e+05  -1.0 4.83e-01   4.3 7.83e-01 3.13e-02h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  20  5.0053364e+04 3.78e-01 1.69e+05  -1.0 3.21e-01   4.7 2.97e-02 2.66e-01h  1
  21  5.0506381e+04 6.61e-02 5.52e+04  -1.0 2.32e-01   4.2 7.39e-02 1.00e+00h  1
  22  5.0494651e+04 5.49e-02 4.58e+04  -1.0 4.03e-02   3.8 9.93e-01 1.69e-01f  1
  23  5.0576451e+04 2.95e-03 4.92e+03  -1.0 1.85e-01   3.3 7.28e-01 1.00e+00h  1
  24  5.0226007e+04 7.38e-04 1.48e+03  -1.0 1.12e-01   2.8 3.38e-01 1.00e+00f  1
  25  4.9331803e+04 2.97e-03 1.22e+03  -1.0 1.74e-01   2.3 7.17e-02 1.00e+00f  1
  26  4.7732720e+04 1.53e-02 8.67e+02  -1.0 3.61e-01   1.9 2.56e-01 1.00e+00f  1
  27  4.5584556e+04 4.35e-02 6.04e+02  -1.0 5.91e-01   1.4 3.33e-01 1.00e+00f  1
  28  4.2473591e+04 2.53e-01 1.21e+02  -1.0 1.67e+00   0.9 1.82e-02 1.00e+00f  1
  29  3.7025996e+04 4.41e-01 1.15e+02  -1.0 1.26e+02    -  2.13e-02 4.94e-02f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  30  3.6125121e+04 4.33e-01 1.13e+02  -1.0 5.77e+01    -  1.65e-01 1.80e-02f  1
  31  3.4914787e+04 4.21e-01 1.09e+02  -1.0 5.05e+01    -  3.86e-02 2.79e-02f  1
  32  3.3348088e+04 4.01e-01 1.04e+02  -1.0 4.66e+01    -  1.55e-02 4.00e-02f  1
  33  3.3032901e+04 3.98e-01 1.03e+02  -1.0 5.47e+01    -  3.64e-02 9.42e-03f  1
  34  3.2761250e+04 3.93e-01 1.02e+02  -1.0 8.28e+01    -  6.56e-03 1.09e-02f  1
  35  3.2339735e+04 3.84e-01 9.93e+01  -1.0 6.66e+01    -  4.60e-02 2.30e-02f  1
  36  3.2136360e+04 3.75e-01 9.68e+01  -1.0 4.09e+01    -  9.75e-03 2.47e-02f  1
  37  3.2032096e+04 3.68e-01 9.50e+01  -1.0 3.83e+01    -  5.04e-02 1.90e-02f  1
  38  3.1731457e+04 3.63e-01 9.36e+01  -1.0 4.46e+01    -  4.95e-03 1.36e-02f  1
  39  3.0659493e+04 3.51e-01 9.03e+01  -1.0 7.82e+01    -  3.41e-02 3.10e-02f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  40  3.0570456e+04 3.50e-01 8.98e+01  -1.0 4.21e+01    -  1.29e-02 5.00e-03f  1
  41  2.9909864e+04 3.42e-01 8.77e+01  -1.0 7.90e+01    -  1.83e-03 2.21e-02f  1
  42  2.9901397e+04 3.41e-01 8.75e+01  -1.0 1.10e+01    -  1.52e-01 1.87e-03f  1
  43  2.9679391e+04 2.58e-01 6.74e+01  -1.0 1.78e+00    -  1.00e+00 2.43e-01f  1
  44  2.9245325e+04 2.30e-01 2.38e+01  -1.0 1.71e+00    -  6.60e-01 1.00e+00f  1
  45  2.9246629e+04 3.94e-01 1.05e+02  -1.0 1.43e+00    -  1.63e-01 1.00e+00F  1
  46  2.9497012e+04 1.66e-01 3.37e+01  -1.0 5.15e-01   1.3 1.00e+00 1.00e+00h  1
  47  2.9368329e+04 2.97e-02 7.24e+00  -1.0 9.64e-01   0.9 5.32e-01 1.00e+00f  1
  48  2.9234741e+04 4.59e-01 3.50e+01  -1.0 1.21e+00    -  1.00e+00 1.00e+00f  1
  49  2.9263907e+04 3.37e-02 1.06e+01  -1.0 3.18e-01    -  1.00e+00 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  50  2.9269190e+04 1.62e-03 1.89e-01  -1.0 1.44e-01    -  1.00e+00 1.00e+00h  1
  51  2.9255471e+04 6.31e-04 4.27e+00  -2.5 9.90e-02    -  9.70e-01 6.11e-01f  1
  52  2.9249081e+04 2.33e-06 1.38e+01  -2.5 7.75e-02    -  5.77e-01 1.00e+00f  1
  53  2.9248260e+04 3.75e-08 1.36e-07  -2.5 7.78e-02    -  1.00e+00 1.00e+00f  1
  54  2.9247633e+04 8.90e-09 3.28e-03  -3.8 1.07e-02    -  9.96e-01 1.00e+00f  1
  55  2.9247632e+04 1.51e-12 2.56e-12  -3.8 3.47e-04    -  1.00e+00 1.00e+00h  1
  56  2.9247596e+04 2.92e-11 2.62e-09  -5.7 7.92e-04    -  1.00e+00 1.00e+00f  1
  57  2.9247596e+04 4.64e-15 4.07e-13  -8.6 1.08e-05    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 57

                                   (scaled)                 (unscaled)
Objective...............:   2.9247595956736363e+03    2.9247595956736361e+04
Dual infeasibility......:   4.0704140571740116e-13    4.0704140571740116e-12
Constraint violation....:   2.2204460492503131e-15    4.6351811278100286e-15
Variable bound violation:   9.9948630705171126e-08    9.9948630705171126e-08
Complementarity.........:   2.5389307701701216e-09    2.5389307701701216e-08
Overall NLP error.......:   2.5389307701701216e-09    2.5389307701701216e-08

Number of objective function evaluations             = 59
Number of objective gradient evaluations             = 58
Number of equality constraint evaluations            = 59
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 58
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 57
Total seconds in IPOPT                               = 0.243

EXIT: Optimal Solution Found.
  0.273037 seconds (405.06 k allocations: 26.364 MiB)

You can improve things by using a starting solution for x:

    XL[i] <= x[j in 1:n, i in 1:num_states] <= XU[i], (start = X0[I])

julia> include("/Users/oscar/Downloads/example-3/SuggestedChange.jl");

julia> @time optimize!(simp_veh)
This is Ipopt version 3.14.13, running with linear solver MUMPS 5.6.1.

Number of nonzeros in equality constraint Jacobian...:      856
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:     3806

Total number of variables............................:      242
                     variables with only lower bounds:        0
                variables with lower and upper bounds:      242
                     variables with only upper bounds:        0
Total number of equality constraints.................:      192
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  5.2687386e+04 3.20e-01 1.28e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  4.2241665e+04 2.23e-01 9.42e+00  -1.0 3.39e+01    -  1.36e-01 3.03e-01f  1
   2  3.7877395e+04 1.97e-01 8.26e+00  -1.0 4.07e+01    -  2.54e-01 1.16e-01f  1
   3  3.3910319e+04 1.62e-01 6.84e+00  -1.0 2.46e+01    -  3.64e-01 1.81e-01f  1
   4  3.2958787e+04 1.37e-01 9.30e+00  -1.0 6.15e+00    -  4.83e-01 1.55e-01f  1
   5  3.1095547e+04 7.57e-02 7.64e+00  -1.0 4.20e+00    -  6.41e-01 4.45e-01f  1
   6  2.9852528e+04 2.69e-02 2.91e+00  -1.0 1.93e+00    -  6.76e-01 6.45e-01f  1
   7  2.9264516e+04 8.57e-04 8.50e-01  -1.0 6.98e-01    -  8.31e-01 1.00e+00f  1
   8  2.9254891e+04 3.21e-04 4.54e+00  -1.7 1.49e-01    -  1.00e+00 6.43e-01f  1
   9  2.9252767e+04 6.24e-06 4.00e+00  -1.7 6.49e-02    -  8.26e-01 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  2.9252279e+04 3.25e-08 4.98e-07  -1.7 4.30e-02    -  1.00e+00 1.00e+00h  1
  11  2.9248376e+04 3.21e-07 3.98e-01  -3.8 4.97e-02    -  8.00e-01 8.52e-01f  1
  12  2.9247631e+04 1.90e-08 1.14e-06  -3.8 2.69e-02    -  1.00e+00 1.00e+00f  1
  13  2.9247596e+04 2.88e-11 2.90e-09  -5.7 6.91e-04    -  1.00e+00 1.00e+00f  1
  14  2.9247596e+04 4.59e-15 4.11e-13  -8.6 1.08e-05    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 14

                                   (scaled)                 (unscaled)
Objective...............:   2.9247595956736368e+03    2.9247595956736368e+04
Dual infeasibility......:   4.1060281556583376e-13    4.1060281556583376e-12
Constraint violation....:   4.5935477643865852e-15    4.5935477643865852e-15
Variable bound violation:   9.9948630705171126e-08    9.9948630705171126e-08
Complementarity.........:   2.5390603669519614e-09    2.5390603669519612e-08
Overall NLP error.......:   2.5390603669519614e-09    2.5390603669519612e-08

Number of objective function evaluations             = 15
Number of objective gradient evaluations             = 15
Number of equality constraint evaluations            = 15
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 15
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 14
Total seconds in IPOPT                               = 0.055

EXIT: Optimal Solution Found.
  0.087244 seconds (390.34 k allocations: 25.993 MiB, 3.46% compilation time)

Do you have the log of the old solves that were taking 45ms? This suggests that you were probably doing something else, or that you were using a different version of Ipopt.

Here are the changes I made:

using JuMP, Ipopt
include("parameter.jl")

function ThreeDOFBicycle_expr(x::Vector, u::Vector)
    xpos, ypos, v, r, psi, sa, ux, ax = x
    sr, jx = u
    tmp_1 = (FzF0 - (ax - v*r)*KZX)
    tmp_2 = (FzR0 + (ax - v*r)*KZX)
    FYF = (PD2 * tmp_1^2 + PD1 * tmp_1) * sin(PC1*atan((((PK1*sin(2*atan(PK2*tmp_1)))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1) + ((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1)^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*tmp_1 + PH1)) - ((PE2*tmp_1 + PE1)*(1 - PE3)*(((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*tmp_1 + PH1))/((((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*tmp_1 + PH1)^2 + EP_SMALL^2)^(0.5)))*((((PK1*sin(2*atan(PK2*tmp_1)))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1) + ((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1)^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*tmp_1 + PH1)) - atan((((PK1*sin(2*atan(PK2*tmp_1)))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1) + ((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1))/(((PD2*PC1*tmp_1^2 + PD1*PC1*tmp_1)^2 + EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v + la*r)/(ux+EP_SMALL)) - sa) + PH2*tmp_1 + PH1)))))) + (PV2*tmp_1^2 + PV1*tmp_1);
    FYR = (PD2 * tmp_2^2 + PD1 * tmp_2) * sin(PC1*atan((((PK1*sin(2*atan(PK2*tmp_2)))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2) + ((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2)^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*tmp_2 + PH1)) - ((PE2*tmp_2 + PE1)*(1 - PE3*(((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*tmp_2 + PH1))/((((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*tmp_2 + PH1)^2 + EP_SMALL^2)^(0.5))))*((((PK1*sin(2*atan(PK2*tmp_2)))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2) + ((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2)^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*tmp_2 + PH1)) - atan((((PK1*sin(2*atan(PK2*tmp_2)))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2) + ((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2))/(((PD2*PC1*tmp_2^2 + PD1*PC1*tmp_2)^2+EP_SMALL^2)^(0.5))*0.001)+EP_SMALL))*((atan((v - lb*r)/(ux+EP_SMALL))) + PH2*tmp_2 + PH1)))))) + (PV2*tmp_2^2 + PV1*tmp_2);
    return (
        ux * cos(psi) - (v + la * r) * sin(psi),
        ux * sin(psi) + (v + la * r) * cos(psi),
        (FYF + FYR) / m - r * ux,
        (la * FYF - lb * FYR) / Izz,
        r,
        sr,
        ax,
        jx,
    )
end

L = 3.6
TF = 4
num_states = 8
num_controls = 2
XL = [-10, -3, -3, -pi, -pi, -pi/4, 0.01, -3]
XU = [200, 4, 3, pi, pi, pi/4, 10, 3]
X0 = [0, 0, 0, 0, 0.001, 0, 2, 0]
CL = [-pi/4 * 5, -2.6]
CU = [pi/4 * 5, 2.6]
n = 25
Δt = TF / n

simp_veh = Model(Ipopt.Optimizer)
set_silent(simp_veh)
@variables(simp_veh, begin
    XL[i] <= x[j in 1:n, i in 1:num_states] <= XU[i], (start = X0[i])
    CL[i] <= u[j in 1:n, i in 1:num_controls] <= CU[i]
end)
fix.(x[1, :], X0; force = true)
dx = [ThreeDOFBicycle_expr(x[j, :], u[j, :]) for j in 1:n]
@constraint(
    simp_veh,
    [st in 1:num_states, j in 1:n-1],
    x[j+1,st] == x[j,st] + Δt * dx[j+1][st],
)
@objective(
    simp_veh,
    Min,
    5*(x[n,1] - 100)^2 + sum(Δt * (x[i,2] - 3.6)^2 + 10 * Δt * u[i,1]^2 + Δt * x[i,3]^2 for i in 1:n)^2,
)
@time optimize!(simp_veh)

Do you think I should close this issue and submitted into the forum instead?

Yes, please. It'll probably just be me replying, but there are a bunch of other folks interested in MPC that might have suggestions.

johnysyumich commented 9 months ago

Thanks again for your answer. Yes, we do have the log for our old version in this case. There is a slight difference between that version and the log you shared. We have 260 variables since we took the first states as variables and added constraints on them instead of making them fixed. So in total, we have 26 * (8+2) variables in the model. The Ipopt we are using is 3.13.4.

This is Ipopt version 3.13.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).

Number of nonzeros in equality constraint Jacobian...:      908
Number of nonzeros in inequality constraint Jacobian.:        0
Number of nonzeros in Lagrangian Hessian.............:     4001

Total number of variables............................:      260
                     variables with only lower bounds:        0
                variables with lower and upper bounds:      260
                     variables with only upper bounds:        0
Total number of equality constraints.................:      208
Total number of inequality constraints...............:        0
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        0

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  5.2687386e+04 1.98e+00 1.27e+01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  5.2137835e+04 1.52e+00 5.69e+01  -1.0 1.11e+01    -  5.24e-03 2.31e-01f  1
   2  4.5712660e+04 8.33e-01 1.29e+02  -1.0 1.24e+01    -  3.13e-02 4.53e-01f  1
   3  4.5520958e+04 8.29e-01 1.28e+02  -1.0 3.75e+01   0.0 2.78e-02 5.80e-03f  1
   4  4.5446856e+04 8.27e-01 1.28e+02  -1.0 4.53e+01    -  4.24e-02 1.59e-03f  1
   5  4.3823735e+04 7.99e-01 1.25e+02  -1.0 4.55e+01    -  2.44e-02 3.47e-02f  1
   6  3.7465291e+04 6.99e-01 1.11e+02  -1.0 5.44e+01    -  1.37e-02 1.25e-01f  1
   7  3.6301895e+04 6.80e-01 1.10e+02  -1.0 4.87e+01    -  1.69e-01 2.68e-02f  1
   8  3.2011416e+04 5.90e-01 3.44e+02  -1.0 3.95e+01    -  1.49e-01 1.33e-01f  1
   9  3.2053681e+04 5.86e-01 3.41e+02  -1.0 8.45e+02  -0.5 7.15e-02 7.43e-03H  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  3.5838617e+04 5.77e-01 8.05e+02  -1.0 1.36e+03  -0.1 2.85e-03 1.48e-02H  1
  11  3.3538606e+04 4.64e-01 6.49e+02  -1.0 1.28e+01    -  1.99e-01 1.96e-01f  1
  12  3.3186419e+04 3.96e-01 5.58e+02  -1.0 3.92e+00    -  1.19e-01 1.45e-01f  1
  13  3.3022841e+04 2.79e-01 3.65e+02  -1.0 3.09e+00    -  1.06e-01 2.96e-01f  1
  14  3.3983834e+04 4.58e-01 5.74e+02  -1.0 4.77e+00    -  5.77e-02 4.23e-01h  1
  15  3.2364639e+04 4.56e-01 6.53e+02  -1.0 3.06e+00    -  7.94e-02 6.64e-01f  1
  16  3.0758036e+04 2.27e-01 3.16e+02  -1.0 1.16e+00    -  2.12e-01 5.08e-01f  1
  17  2.9683837e+04 6.06e-01 5.08e+02  -1.0 3.94e+00    -  8.19e-02 7.44e-01f  1
  18  2.9652921e+04 6.04e-01 4.93e+02  -1.0 3.93e+00    -  8.29e-02 3.16e-02F  1
  19  2.9392876e+04 3.04e-01 3.80e+02  -1.0 2.55e+00    -  4.78e-03 7.73e-01f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  20  2.8470941e+04 2.06e-01 1.47e+02  -1.0 9.71e-01    -  8.11e-01 1.00e+00f  1
  21  2.8229700e+04 9.87e-01 9.41e+01  -1.0 3.05e+00    -  1.30e-01 1.00e+00f  1
  22  2.9911219e+04 2.14e-01 1.43e+02  -1.0 9.49e+00   0.4 5.77e-02 6.49e-01H  1
  23  2.8544933e+04 4.58e-01 2.21e+02  -1.0 1.50e+00  -0.1 1.56e-01 1.00e+00f  1
  24  2.8256544e+04 3.68e-01 2.40e+02  -1.0 8.94e-01   1.2 1.00e+00 1.00e+00f  1
  25  2.8322444e+04 3.18e-01 2.00e+02  -1.0 1.42e+00   0.7 1.00e+00 1.00e+00H  1
  26  2.8316481e+04 1.88e-01 2.15e+02  -1.0 1.16e+01   0.3 1.84e-01 1.76e-01f  2
  27  2.8076464e+04 5.94e-01 8.27e+01  -1.0 7.20e+00    -  6.56e-01 4.24e-01f  1
  28  2.8071359e+04 7.39e-02 3.84e+01  -1.0 1.23e+00    -  7.91e-01 1.00e+00f  1
  29  2.8071889e+04 6.70e-03 3.97e+00  -1.0 1.45e-01  -0.2 1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  30  2.8072063e+04 5.82e-01 1.22e+00  -1.0 2.46e+00    -  5.36e-01 1.00e+00f  1
  31  2.8068266e+04 2.19e-02 1.07e+00  -1.0 3.15e-01    -  1.00e+00 1.00e+00f  1
  32  2.8070299e+04 3.63e-03 1.48e-01  -1.0 4.13e-01    -  1.00e+00 1.00e+00h  1
  33  2.8054546e+04 1.38e-03 2.53e+00  -2.5 4.79e-01    -  8.44e-01 6.17e-01f  1
  34  2.8047056e+04 1.77e-06 1.45e+01  -2.5 1.44e-01    -  5.82e-01 1.00e+00f  1
  35  2.8046240e+04 1.46e-07 2.00e-07  -2.5 1.11e-01    -  1.00e+00 1.00e+00f  1
  36  2.8045583e+04 9.75e-09 8.43e-03  -3.8 1.24e-02    -  9.89e-01 1.00e+00f  1
  37  2.8045581e+04 2.87e-12 8.51e-12  -3.8 7.34e-04    -  1.00e+00 1.00e+00h  1
  38  2.8045544e+04 3.23e-11 3.66e-09  -5.7 8.09e-04    -  1.00e+00 1.00e+00f  1
  39  2.8045543e+04 5.10e-15 5.68e-13  -8.6 1.11e-05    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 39

                                   (scaled)                 (unscaled)
Objective...............:   2.8045543443519291e+03    2.8045543443519291e+04
Dual infeasibility......:   5.6793693474002096e-13    5.6793693474002096e-12
Constraint violation....:   2.9004576518332215e-15    5.1000870193718129e-15
Complementarity.........:   2.5392761841331811e-09    2.5392761841331811e-08
Overall NLP error.......:   2.5392761841331811e-09    2.5392761841331811e-08

Number of objective function evaluations             = 49
Number of objective gradient evaluations             = 40
Number of equality constraint evaluations            = 49
Number of inequality constraint evaluations          = 0
Number of equality constraint Jacobian evaluations   = 40
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations             = 39
Total CPU secs in IPOPT (w/o function evaluations)   =      0.161
Total CPU secs in NLP function evaluations           =      0.117

EXIT: Optimal Solution Found.
  0.313398 seconds (217.62 k allocations: 28.802 MiB)

The printing in our version seems to take time though, with print_level = 0, it does finish in around 50 ms.

Thanks for the advice about the start value, yes, it makes a huge difference in this example. However, when we insert this part to our new version of NLOptControl used for another application, it still does not make things faster. Do you have access to JuMP 0.18.6? We can write a dummy example for our real applications in both versions and see where the difference comes from.

We will close the issue now and move to the forum.

Thanks again.

odow commented 9 months ago

Do you have access to JuMP 0.18.6? We can write a dummy example for our real applications in both versions and see where the difference comes from.

Just a note: JuMP has been almost entirely (>95%) rewritten between v0.18.6 and the current v.1.17.0. Performance should be similar, but we do not expect or benchmark that the latest release of JuMP is faster than v0.18.6.

johnysyumich commented 9 months ago

Similar performance in optimization would suffice in our application. However, the getvalue function in old JuMP is taking a long time and is currently becoming our bottleneck. We saw that the new function (value) is doing a good job so we decided to upgrade everything.

jump-dev / JuMP.jl

Performance downgrade in solving optimal control problems(OCP) #3622