Higher order scheme - Githubissues

jbcaillau commented 3 months ago

@PierreMartinon @gergaud hi, we may have a use case that requires higher order implicit schemes (Gauss...) It is connected to this discussion.

@rand-asswad @DJEMAwalid do not hesitate to add further comments on the application, e.g. a plot of the periodic control.

PierreMartinon commented 3 months ago

We did some work on this with Joseph (see the old issue https://github.com/control-toolbox/CTDirect.jl/issues/11 and the branch https://github.com/control-toolbox/CTDirect.jl/tree/old-runge-kutta).

We felt it may be better to move to a more abstract interface with the discretization first, since it does involve quite a bit of modifications.

You are certainly welcome to try on your side, dont hesitate if you have any questions :-)

PierreMartinon commented 2 months ago

Ok, we'll try this again with @joseph-gergaud. Hopefully the code is better structured now and should make it easier.

rand-asswad commented 2 months ago

Dear @PierreMartinon, @jbcaillau, and @joseph-gergaud,

Thank you for considering this issue and directing me to your previous work on RK schemes. I have tinkered around with the old-runge-kutta branch but did not succeed in running it due to divergence with dependencies over the past year. I also could not allocate sufficient time to resolve this issue myself, given that I had to move on to other research topics in order to respect my planned PhD timeline, as you can imagine.

I ended up implementing the same scheme (implicit trapezoidal rule, aka Crank-Nicholson) following your algorithmic guide using JuMP with Ipopt backend. I further implemented a general RK scheme in a similar manner, so I provide here my results.

1. Implicit trapezoidal rule (2 stages, order 2)

The control trajectory for the $d(t)$ variable obtained with CTDirect exhibited a chattering effect that I could not resolve by changing the number of time steps (up to 10000, which took a lot of time). While the implementation with JuMP and Ipopt did not exhibit the same phenomenon and resolved the problem for a similar number of steps (I use 5000).

Standalone scripts and plots for both implementations:

I bring this issue to your attention because the difference is significant, as you can see in the attached plots.

2. Higher order scheme: Gauss-Legendre method (2 stages, order 4)

I implemented a general RK scheme following your algorithmic guide and the previous RK branch. As expected, there is an improvement to the previous implementation, which becomes more apparent in the plots of the singular arcs (expressions derived from differentiating the switch functions, the expressions depend on the state and costate trajectories as well as the other control).

Here's my standalone script for the JuMP implementation of RK scheme and the control plot with the Gauss-Legendre method of order 4.

Takeaway

I could not identify a problem in my implementation based on CTDirect that explains the chattering effect. I believe this would be interesting to you.
A higher order scheme is needed for my application (system of algal-bacterial consortium in a bioreactor), especially for plotting the singular arcs. As you know, the magnitude of the local error for an order 4 method is 1e-12 for 5000 steps and $[t_0,t_f]=[0,20]$, in order to obtain the same magnitude with the trapezoidal rule one needs over 100,000 steps, which takes a lot of time and memory since there would be significantly more variables and constraints for the NL solver.

Finally, I would be happy to help with a general RK scheme implementation when I have more time at hand. For the time being, I will stop at my current code based on your previous work. Thank you again for all your help!

jbcaillau commented 2 months ago

hey @rand-asswad sounds super nice! 👍🏽currently at the JuliaCon, but will have a look at it asap. cheers from eindhoven to @PierreMartinon

rand-asswad commented 2 months ago

@jbcaillau enjoy JuliaCon and Eindhoven!

jbcaillau commented 2 months ago

@jbcaillau enjoy JuliaCon and Eindhoven!

https://control-toolbox.org/docs/optimalcontrol/stable/juliacon2024.html

rand-asswad commented 2 months ago

I updated the code gists and ran a fresh round of testing with the @timev macro. Now this is far from being a benchmark, but here's a brief time and memory comparison.

	CTDirect	JuMP (trapezoidal)	JuMP (gauss 2-stages)
Total Time	1375 seconds	131 seconds	77 seconds
Total Memory	259.8 GiB	2.952 GiB	3.314 GiB
IPOPT time	299 seconds	130 seconds	73 seconds
IPOPT iterations	711	514	126

Full logs:

CTDirect log

``` Method = (:direct, :adnlp, :ipopt) Total number of variables............................: 40008 variables with only lower bounds: 25005 variables with lower and upper bounds: 10002 variables with only upper bounds: 0 Total number of equality constraints.................: 30006 Total number of inequality constraints...............: 0 inequality constraints with only lower bounds: 0 inequality constraints with lower and upper bounds: 0 inequality constraints with only upper bounds: 0 Number of Iterations....: 711 (scaled) (unscaled) Objective...............: -5.4521746997497136e+00 -5.4521746997497136e+00 Dual infeasibility......: 1.2669287810692106e-07 1.2669287810692106e-07 Constraint violation....: 1.2167857832423579e-09 1.2167857832423579e-09 Variable bound violation: 1.4432578243628313e-08 1.4432578243628313e-08 Complementarity.........: 5.6888535490015873e-13 5.6888535490015873e-13 Overall NLP error.......: 1.2669287810692106e-07 1.2669287810692106e-07 Number of objective function evaluations = 1019 Number of objective gradient evaluations = 712 Number of equality constraint evaluations = 1019 Number of inequality constraint evaluations = 0 Number of equality constraint Jacobian evaluations = 712 Number of inequality constraint Jacobian evaluations = 0 Number of Lagrangian Hessian evaluations = 711 Total seconds in IPOPT = 298.661 EXIT: Solved To Acceptable Level. 1374.678779 seconds (4.28 G allocations: 259.828 GiB, 1.53% gc time, 1.57% compilation time: <1% of which was recompilation) elapsed time (ns): 1374678778648 gc time (ns): 21083500281 bytes allocated: 278988714100 pool allocs: 4282533416 non-pool GC allocs: 87363 malloc() calls: 139546 realloc() calls: 28 free() calls: 125536 minor collections: 519 full collections: 6 ```

JuMP trapezoidal scheme log

``` Solving... Total number of variables............................: 40008 variables with only lower bounds: 30006 variables with lower and upper bounds: 10002 variables with only upper bounds: 0 Total number of equality constraints.................: 30006 Total number of inequality constraints...............: 0 inequality constraints with only lower bounds: 0 inequality constraints with lower and upper bounds: 0 inequality constraints with only upper bounds: 0 Number of Iterations....: 514 (scaled) (unscaled) Objective...............: -5.4521787774223300e+00 5.4521787774223300e+00 Dual infeasibility......: 2.7070093382139750e-07 2.7070093382139750e-07 Constraint violation....: 7.9440598543811802e-10 7.9440598543811802e-10 Variable bound violation: 1.0419357376889593e-08 1.0419357376889593e-08 Complementarity.........: 6.0331033410813804e-12 6.0331033410813804e-12 Overall NLP error.......: 2.7070093382139750e-07 2.7070093382139750e-07 Number of objective function evaluations = 604 Number of objective gradient evaluations = 515 Number of equality constraint evaluations = 604 Number of inequality constraint evaluations = 0 Number of equality constraint Jacobian evaluations = 515 Number of inequality constraint Jacobian evaluations = 0 Number of Lagrangian Hessian evaluations = 514 Total seconds in IPOPT = 129.685 EXIT: Solved To Acceptable Level. The model was not solved correctly. Objective value = 5.45217877742233 131.290054 seconds (47.88 M allocations: 2.952 GiB, 0.70% gc time, 0.48% compilation time) elapsed time (ns): 131290054181 gc time (ns): 914295499 bytes allocated: 3169247016 pool allocs: 47845225 non-pool GC allocs: 805 malloc() calls: 30129 realloc() calls: 5 free() calls: 30090 minor collections: 4 full collections: 1 ```

JuMP Gauss 2-stages scheme log

``` ****************************************************************************** This program contains Ipopt, a library for large-scale nonlinear optimization. Ipopt is released as open source code under the Eclipse Public License (EPL). For more information visit https://github.com/coin-or/Ipopt ****************************************************************************** Total number of variables............................: 100000 variables with only lower bounds: 30000 variables with lower and upper bounds: 10000 variables with only upper bounds: 0 Total number of equality constraints.................: 90000 Total number of inequality constraints...............: 0 inequality constraints with only lower bounds: 0 inequality constraints with lower and upper bounds: 0 inequality constraints with only upper bounds: 0 Number of Iterations....: 126 (scaled) (unscaled) Objective...............: -5.4508521518739395e+00 5.4508521518739395e+00 Dual infeasibility......: 3.3405884072253212e-14 3.3405884072253212e-14 Constraint violation....: 8.5174922670461228e-16 8.5174922670461228e-16 Variable bound violation: 1.4433177764061611e-08 1.4433177764061611e-08 Complementarity.........: 5.0000009659606610e-13 5.0000009659606610e-13 Overall NLP error.......: 5.0000009659606610e-13 5.0000009659606610e-13 Number of objective function evaluations = 158 Number of objective gradient evaluations = 127 Number of equality constraint evaluations = 158 Number of inequality constraint evaluations = 0 Number of equality constraint Jacobian evaluations = 127 Number of inequality constraint Jacobian evaluations = 0 Number of Lagrangian Hessian evaluations = 126 Total seconds in IPOPT = 72.508 EXIT: Optimal Solution Found. 77.051224 seconds (53.68 M allocations: 3.314 GiB, 2.67% gc time, 3.71% compilation time) elapsed time (ns): 77051224113 gc time (ns): 2056842591 bytes allocated: 3557946568 pool allocs: 53631677 non-pool GC allocs: 43354 malloc() calls: 135 realloc() calls: 14 free() calls: 76 minor collections: 6 full collections: 5 ```

jbcaillau commented 2 months ago

ping @PierreMartinon

PierreMartinon commented 2 months ago

Hi, a quick question: did you run the optimization twice to remove the compilation time ? (you can set for instance 2 iterations max).

One suprise is the 100x larger memory usage. Can you limit the iterations to 50 for instance and see if the memory difference is the same ? Maybe we are doing some unnecessary allocations in CTDirect. I should redo a pass on that, there is a nice option when launching julia, that will output .mem files with memory allocations corresponding to each line of code.

Regarding the control oscillations, they can indeed happen especially on singular arcs, and increasing the number of time steps typically does not help. Based on my experience, changing the discretization formula (when possible...) may help, as reducing the number of time steps or setting a smaller tolerance for the NLP solver. In your case it would be interesting to find why Jump seems unaffected (unless it's a random thing).

jbcaillau commented 2 months ago

hi @PierreMartinon @rand-asswad thanks for the feedback. yes, mem allocation difference sounds huge and needs to be understood.

for benchmarking, the standard is to use @btime from BenchmarkTools.jl
make sure that you use OptimalControl.jl v0.9 (rewritten build of the constraints)
since we use sparsity analysis thanks to our colleagues of ADNLModels.jl to build Jacobians and Hessians, there should not be such a difference in mem
yes, it is important to see how these mem alloc scale when changing the grid_size

All in all, thanks @rand-asswad for the input on this!

rand-asswad commented 2 months ago

First of all, I'm having trouble with v0.9 even with the basic example from the tutorial.

using OptimalControl

@def ocp begin
    t ∈ [ 0, 1 ], time
    x ∈ R², state
    u ∈ R, control
    x(0) == [ -1, 0 ]
    x(1) == [ 0, 0 ]
    ẋ(t) == [ x₂(t), u(t) ]
    ∫( 0.5u(t)^2 ) → min
end

sol = solve(ocp)

I get the following error:

ERROR: LoadError: MethodError: no method matching init(::CTBase.OptimalControlModel{Autonomous, Fixed}, ::Symbol, ::Symbol; display::Bool, init::Tuple{})

I tried it multiple times on a fresh Julia install and the problem persisted with v0.9. I therefore used v0.7.8, and the tests on CTDirect I have provided are in the following environment:

(@v1.10) pkg> st
Status `~/.local/share/julia/environments/v1.10/Project.toml`
  [6e4b80f9] BenchmarkTools v1.5.0
  [b6b21f68] Ipopt v1.6.3
  [4076af6c] JuMP v1.22.2
  [b964fa9f] LaTeXStrings v1.3.1
⌃ [5f98b655] OptimalControl v0.7.8
  [91a5bcdd] Plots v1.40.5
Info Packages marked with ⌃ have new versions available and may be upgradable.

Since the memory issue is addressed in v0.9, the issue I brought might be obsolete. However, I couldn't figure out the issue in the new version (granted, I haven't checked the source code), I thought I'd wait out for the newer release.

This problem aside, a quick round of testing following your recommendations gave similar results (this time I only compared for the same discretization scheme). I ran the code entirely once to compile, relaxed the tol attribute for IPOPT, then:

Single run for 500 time steps
Benchmark for 50 time steps

	CTDirect	raw JuMP
Time (500 steps)	18.37 seconds	4.04 seconds
Memory (500 steps)	7.194 GiB	116.5 MiB
Benchmark Samples (50 steps)	8 samples	28 samples
Benchmark Time (50 steps)	719 ms	183 ms
Benchmark Memory (50 steps)	304.46 MiB	8.58 MiB

I'll rerun these tests when I solve the issue with v0.9. I wish I could be more of help, thank you both!

@PierreMartinon regarding the oscillations in the singular arcs, I tried reducing the number of time-steps as well as the tolerance. Unfortunately, I couldn't succeed in removing them with CTDirect.jl. I was also surprised to see that the result of the JuMP code didn't have these oscillations with the same numerical scheme. Nevertheless, we are going with the Gauss-Legendre scheme of order 4 in our article because it yields considerably better results as you said.

jbcaillau commented 1 month ago

@rand-asswad sorry for the late reply (been busy at JuliaCon 🥲). We have split things using extensions, you know need to write the following:

using OptimalControl
using NLPModelsIpopt # so as to use Ipopt
using Plots # so as to plot

@def ocp begin
    t ∈ [ 0, 1 ], time
    x ∈ R², state
    u ∈ R, control
    x(0) == [ -1, 0 ]
    x(1) == [ 0, 0 ]
    ẋ(t) == [ x₂(t), u(t) ]
    ∫( 0.5u(t)^2 ) → min
end

sol = solve(ocp)

plot(sol)

In between, there has been an update of ADNLPModels.jl (and sparsity detection in there): can you please re-run your previous tests with the latest version of OptimalControl.jl, now available from the general registry?

jbcaillau commented 1 month ago

hi @rand-asswad, did you give it a try?

@rand-asswad sorry for the late reply (been busy at JuliaCon 🥲). We have split things using extensions, you know need to write the following [...]

control-toolbox / CTDirect.jl

Higher order scheme #101

1. Implicit trapezoidal rule (2 stages, order 2)

2. Higher order scheme: Gauss-Legendre method (2 stages, order 4)

Takeaway