SciML / ModelingToolkit.jl

An acausal modeling framework for automatically parallelized scientific machine learning (SciML) in Julia. A computer algebra system for integrated symbolics for physics-informed machine learning and automated transformations of differential equations
https://mtk.sciml.ai/dev/

generate_control_function very slow for bigger systems (1000 - 2000 equations) #3077

Closed by 1-Bart-1 1 month ago

1-Bart-1 commented 1 month ago

The 🐞

When trying to run generate_control_function with a system that has 1757 equations, it is several orders of magnitude slower than just running structural_simplify with the same arguments on the same system.

Expected behavior

structural_simplify and generate_control_function should have roughly the same runtime.

Profiling the call with ProfileView shows that generate_control_function spends about 60% of its runtime on the following line in inputoutput.jl:

sys, _ = io_preprocessing(sys, inputs, []; simplify, kwargs...)

and about 40% of its runtime on this line in the same function:

eqs = [eq for eq in full_equations(sys)]

The total runtime is more than 8000 seconds (over 2 hours):

@time (_, f_ip), dvs, psym, io_sys = ModelingToolkit.generate_control_function(model, inputs, split=false; outputs=outputs)
8306.809924 seconds (5.55 G allocations: 260.320 GiB, 80.75% gc time)

While the runtime of structural_simplify on the same model is around 5-10 seconds:

@time structural_simplify(kite_model, (inputs, []))
  5.424258 seconds (12.56 M allocations: 717.108 MiB, 2.76% gc time)

How to reproduce

git clone https://github.com/Albatross-Kite-Transport/KitePredictiveControl.jl.git
cd KitePredictiveControl.jl
git checkout issue_3077
julia --project=.
julia> ] instantiate
julia> ] add https://github.com/ufechner7/KiteModels.jl#feat/xfoil-polars
julia> using KitePredictiveControl
julia> run_controller()
running structural_simplify
  8.340810 seconds (11.99 M allocations: 710.245 MiB, 2.52% gc time, 81.88% compilation time: 33% of which was recompilation)
running generate_control_function
  8306.809924 seconds (5.55 G allocations: 260.320 GiB, 80.75% gc time)

My Manifest.toml is in the repository as Manifest.toml.bug. The ProfileView profile is saved in data/kitemodels.jlprof.

baggepinnen commented 1 month ago

Is structural_simplify(kite_model, (inputs, []), split=false) also fast, or is the slowness due to the split=false?

1-Bart-1 commented 1 month ago

I cloned ModelingToolkit.jl and added some @time calls. They show that the problem is not structural_simplify, which runs in less than 2 seconds, but full_equations. So this problem has nothing to do with split.

baggepinnen commented 1 month ago

I'm not sure there's a good way to improve upon full_equations. The symbolic approach used in MTK is known to be rather slow :/

ChrisRackauckas commented 1 month ago

It's a consequence of substitution. This is an expression growth problem, so the solution requires an IR instead of substituting equations into each other, which would probably mean a JuliaSimCompiler version. Maybe the DAG improvements in Symbolics could make a dent.
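
To make the expression-growth point concrete, here is a minimal sketch that uses plain Julia Expr trees rather than Symbolics or ModelingToolkit. The helpers count_nodes, substitute, chain, and inline_all are hypothetical names invented for this illustration, and the toy recurrence x_k = x_{k-1} * x_{k-1} + x_{k-1} is an assumption, not the structure of the kite model. It compares the size of a list of small assignments (roughly what an IR keeps) with the size of the single expression obtained by substituting every intermediate variable away.

```julia
# Count the nodes in an expression tree; symbols and literals count as one node.
count_nodes(ex) = 1
count_nodes(ex::Expr) = 1 + sum(count_nodes, ex.args; init = 0)

# Replace every occurrence of the symbol `name` in `ex` by `value`.
substitute(ex, name, value) = ex === name ? value : ex
substitute(ex::Expr, name, value) =
    Expr(ex.head, (substitute(a, name, value) for a in ex.args)...)

# Hypothetical toy system: x_k = x_{k-1} * x_{k-1} + x_{k-1} for k = 1..n.
# Each right-hand side mentions the previous variable three times.
function chain(n)
    map(1:n) do k
        prev = Symbol(:x, k - 1)
        Symbol(:x, k) => :($prev * $prev + $prev)
    end
end

# Eliminate all intermediate variables so the last equation depends only on x0.
function inline_all(assignments)
    expr = last(assignments).second
    for (name, rhs) in reverse(assignments[1:end-1])
        expr = substitute(expr, name, rhs)
    end
    return expr
end

n = 10
assignments = chain(n)

# IR-like form: n small assignments, size grows linearly in n.
ir_size = sum(count_nodes(rhs) for (_, rhs) in assignments)

# Fully substituted form: one expression, size roughly triples per step.
inlined_size = count_nodes(inline_all(assignments))

println("IR form:       ", ir_size, " nodes")       # 70
println("fully inlined: ", inlined_size, " nodes")  # 177145
```

With only ten chained equations the inlined tree is already thousands of times larger than the assignment list. This is only an illustration of why substitution scales badly; it makes no claim about how full_equations or JuliaSimCompiler are implemented.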

1-Bart-1 commented 1 month ago

Should I close this issue for now, as there is not much that can be done with this function specifically?

ChrisRackauckas commented 1 month ago

That's the recommendation: let a JuliaSimCompiler version of the function handle it, since I think any purely symbolic approach to constructing a large function like this is pretty doomed. It's the same issue as symbolic differentiation vs. autodiff: it's structural.

baggepinnen commented 1 month ago

The JuliaSimCompiler.jl version of this function was merged yesterday; we're just waiting for a new release.

ChrisRackauckas commented 1 month ago

We should probably add it to the docs here

baggepinnen commented 1 month ago

A new release of JuliaSimCompiler has been made. If you try your model with JuliaSimCompiler's version of generate_control_function, please report back here whether there was any noticeable difference 😊

1-Bart-1 commented 1 month ago

@time f_ip, dvs, psym, io_sys = ModelingToolkit.generate_control_function(IRSystem(model), inputs)
 10.705200 seconds (10.94 M allocations: 710.725 MiB, 2.29% gc time, 94.28% compilation time: 27% of which was recompilation)

That's more than 700 times faster! Thanks for the help 🥇