Open finmod opened 2 years ago
I will check it out later on, but I think the line
for (i, xi) in enumerate(eachcol(X))
DX[:, i] = keen_simplified(xi, [], ref_sol.t[i])
end
spits out the error, because keen_simplified is an ODESystem and not a function.
So you should just use the reference solution (ref_sol) or the interpolated states to create DX.
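A minimal sketch of the pattern suggested above, in plain Julia: evaluate a callable right-hand side column by column instead of calling the system object. The function `rhs` and the harmonic-oscillator data are stand-ins chosen for illustration, not part of the model in this thread; with a SciML problem the callable would presumably be the problem's `f` field, as in the `mmprob.f(mmprob.u0, mmprob.p, 0.0)` call in the original post.

```julia
# Stand-in right-hand side with the (u, p, t) signature: a harmonic oscillator.
# This is an illustrative assumption, not the Keen model discussed in the thread.
rhs(u, p, t) = [-u[2], u[1]]

ts = collect(0.0:0.25:10.0)
X = vcat(cos.(ts)', sin.(ts)')   # exact solution, laid out as states × samples

DX = similar(X)
for (i, xi) in enumerate(eachcol(X))
    # Call the function, not the symbolic system object
    DX[:, i] = rhs(xi, nothing, ts[i])
end
```

The loop has the same shape as the one in the report; only the callee changes from the symbolic system to a function of `(u, p, t)`.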
There are quite a few problems here.
First, keen_simplified = structural_simplify(keen) fails. I will raise this in MTK and bypass it by using keen instead.
Second, the missing function is worked around by using InterpolationMethod():
X = ref_sol[:,:]
t = ref_sol.t
ddprob = ContinuousDataDrivenProblem(X, t, InterpolationMethod())
Continuous DataDrivenProblem{Float64}
Last, the model discovery can start as an implicit system, but I am lacking sufficient docs on setting up the discovery process.
From the model described above, the basis should include a polynomial (degree 2 or 3) in @variables t ω(t) λ(t) d₁(t) y₁(t), the variables in X, and an implicit basis formed with D(ω), D(λ), D(d₁) and D(ω)/ω, D(λ)/λ, D(d₁)/d₁. How do I do this?
Next, ImplicitOptimizer(STLSQ()) is a two-step solution where the explicit problem is solved first with STLSQ. Will the basis defined above work?
Sorry I have a lot of deadlines this week plus some additional project work on top.
Just a quick question w.r.t. the implicit identification: is the process description in the docs ( nonlinear / implicit : cartpole ) not sufficient? If not, what exactly is your problem with setting up the basis? I am asking to get an understanding what I can improve upon :).
Second: you need to define the partial derivatives as additional signals and compute them before constructing the data-driven problem. A simple way to include them is to vcat those derived quantities (either use an interpolation for u(x) and take its derivative, or use a DiffEqOperator), or to add them as inputs of the basis. Then tag them via the implicit kwarg in the function argument.
Third, the ordering of the data can be done in one of two ways: either include just a single trajectory of a discretized state of the PDE and assume the overall structure, or vcat all of the signals into a single vector (u[1], u[2], etc., where 1, 2 are the discretized states). Be aware that this might mess up the conditioning of the problem (I experienced this while working on an MWE to automate this).
Note that a fully automatic way of doing things is not possible right now, because even though we can write down partial derivatives, they are just symbols in terms of data and are not processed further.
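As a rough illustration of "define the derivatives as additional signals and vcat them", the sketch below estimates derivatives along the sample axis with central finite differences and stacks them under the original signals. The helper `finite_difference` is hypothetical and written only for this example; it stands in for the interpolation or DiffEqOperator route mentioned above and is not what DataDrivenDiffEq does internally.

```julia
# Hypothetical helper: finite differences along the columns (sample axis).
# Central differences in the interior, one-sided at the two boundaries.
function finite_difference(X::AbstractMatrix, t::AbstractVector)
    DX = similar(X, Float64)
    n = length(t)
    for i in 1:n
        lo = max(i - 1, 1)
        hi = min(i + 1, n)
        DX[:, i] = (X[:, hi] .- X[:, lo]) ./ (t[hi] - t[lo])
    end
    return DX
end

t = collect(0.0:0.01:1.0)
X = vcat(sin.(t)', cos.(t)')   # 2 signals × 101 samples (synthetic data)
DX = finite_difference(X, t)   # derivative estimates, same layout as X
Z = vcat(X, DX)                # 4 signals × 101 samples: originals plus derivatives
```

The stacked matrix `Z` is what would then be handed to the problem constructor, with the derivative rows treated as extra signals. The same stacking applies to spatial derivatives of a discretized PDE state, under the assumption that each row is one signal.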
"Just a quick question w.r.t. the implicit identification: is the process description in the docs ( nonlinear / implicit : cartpole ) not sufficient? If not, what exactly is your problem with setting up the basis? I am asking to get an understanding what I can improve upon :)."
The weakness in the docs is about forming and using X and DX throughout DataDrivenDiffEq. There should be a full page on X and DX from different angles: synthetic data source, external data source along the line of the Hudson Bay example of the UDE repository. The diagrams from https://royalsocietypublishing.org/doi/10.1098/rspa.2020.0279 may be a better explanation. This rejoins https://github.com/SciML/DataDrivenDiffEq.jl/issues/317 .
The signature of ContinuousDataDrivenProblem could be improved to give name and dimensions of problem. In short, a DAE example in the UDE paper would solve all of these issues.
Concerning the definition of basis, the model above contains polynomial of low order of the states with some of the states of negative order (-1 for 1/x or -2 for 1/x^2). My question is: how do you handle this type of basis? as ratio of polynomials, as differential of polynomials or as polynomials of changed variables.
In short, replacing the cartpole tutorial by the model above or the AkzoNobel benchmark https://benchmarks.sciml.ai/html/DAE/ChemicalAkzoNobel.html suggested to me by @ChrisRackauckas would be a better tutorial of the MTK , DAEProblem and ContinuousDataDrivenProblem sequence.
Thanks for your reply and the detailed explanation!
The weakness in the docs is about forming and using X and DX throughout DataDrivenDiffEq. There should be a full page on X and DX from different angles: synthetic data source, external data source along the line of the Hudson Bay example of the UDE repository. The diagrams from https://royalsocietypublishing.org/doi/10.1098/rspa.2020.0279 may be a better explanation. This rejoins #317 .
I think I'll implement the UDE examples in the docs as well, to make this clearer and provide a better workflow. Right now, the docs in general are a clear weakness of the package, since I am in too deep to provide an unbiased view and have a lot of day-to-day work, which is more on the dev side than on the exposed usage and API. But adding the Lotka-Volterra examples from the UDE paper is an easy goal.
The signature of ContinuousDataDrivenProblem could be improved to give name and dimensions of problem.
There is definitely no harm in naming a problem, but it's not clear to me what the benefit is in terms of usability of the package. I (again, limited perspective) assume that a user would either use a single problem per dataset in the recovery of the unknown equations or track them individually (if multiple problems are created).
In short, a DAE example in the UDE paper would solve all of these issues.
Along the lines of visibility it would have helped. At the time the UDE paper took shape, the implicit identification did not work reliably or was pretty new. I agree that more examples - at least in the docs - might be useful, but I think right now the UDE paper is dormant.
Concerning the definition of basis, the model above contains polynomial of low order of the states with some of the states of negative order (-1 for 1/x or -2 for 1/x^2). My question is: how do you handle this type of basis? as ratio of polynomials, as differential of polynomials or as polynomials of changed variables.
Within the scope of DataDrivenDiffEq, everything is handled as it would be inside ModelingToolkit, SymbolicUtils or Symbolics, with the exception of detecting linearly dependent terms inside the basis (on demand). I think it's just used as is, so no further transforms are made and the resulting code will be as (in)efficient as the formulation of the basis.
A simple example:
# Let X, DX, t be defined by some unknown process
prob = ContinuousDataDrivenProblem(X, t, DX = DX)
@variables x[1:size(X, 1)] t
basis = Basis([x; x.^2; inv.(x); inv.(x.^2)], x, iv = t)
would define a basis in terms of X. If I now want to include a transformed variable, I can either do this as a new state Z = [x[1], x[2], x[1]^(-1)]
# Append 1/x[1] as an extra row (states are rows, samples are columns)
Z = vcat(X, inv.(X[1:1, :]))
prob = ContinuousDataDrivenProblem(Z, t, DX = DX)
@variables z[1:size(Z, 1)] t
basis = Basis([z; z.^2], z, iv = t)
or as an external signal via the inputs u
u(x, p, t) = inv(x[1])
prob = ContinuousDataDrivenProblem(X, t, DX = DX, U = u)
@variables x[1:size(X, 1)] t u
basis = Basis([x; x.^2; u; u^2], x, controls = [u], iv = t)
I might be missing the point ( except of course for numerical efficiency of the examples ), but basically it is a WYSIWYG approach.
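The "transformed variable as an extra row" idea can be sanity-checked without any package. The sketch below is an assumption-laden toy: the data are synthetic, the dynamics dx/dt = 2x - 3/x are invented for the example, and the fit is an ordinary least-squares solve rather than STLSQ or any other DataDrivenDiffEq optimizer.

```julia
using LinearAlgebra

# Synthetic samples of a scalar state and its derivative, dx/dt = 2x - 3/x
x = collect(range(0.5, 2.0; length = 50))
dx = 2 .* x .- 3 ./ x

# Candidate library, one row per term: x, x^2 and the negative-order term 1/x
Θ = vcat(x', (x .^ 2)', (1 ./ x)')   # 3 terms × 50 samples
DXm = reshape(dx, 1, :)              # 1 state × 50 samples

# Least-squares coefficients Ξ such that Ξ * Θ ≈ DXm
Ξ = DXm / Θ
```

Here `Ξ` comes out close to `[2 0 -3]`, i.e. the 1/x term is picked up like any other library column once it is present as data. This matches the WYSIWYG remark above: a negative-order term only needs to appear in the basis or data, with no special treatment.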
In short, replacing the cartpole tutorial by the model above or the AkzoNobel benchmark https://benchmarks.sciml.ai/html/DAE/ChemicalAkzoNobel.html suggested to me by @ChrisRackauckas would be a better tutorial of the MTK , DAEProblem and ContinuousDataDrivenProblem sequence.
I think it is a valid addition to the example section, but it might be even better to stick to the publication examples from the original papers (either SINDy-PI or Implicit SINDy), where good trajectories are available. Also, this example seems quite intense for the recovery and might not be easy to understand for all possible users (I am a mechanical engineer, so the cart pole is by far the most accessible example for me in terms of DAE). I think adding a third and fourth example is definitely worth it. I would, however, start with the regulatory network.
I've added some more examples with #335 ( both for implicit systems and one prototypical PDE discovery for the heat equation ).
Additionally, I've found an error in the preprocessing of the implicit optimiser, which should also be fixed.
Naming of the problem is now possible ( but without further effect at the moment).
@AlCap23 Nice improvements in the WIP. Let me restate my MWE for a DAE system discovered from data. ContinuousDataDrivenProblem(odae_sol) is now working. The next issue is the syntax for defining the basis: is it taken directly from odae_sol, or do I have to define the basis ex novo as a polynomial in u?
using ModelingToolkit
using LinearAlgebra
using DataDrivenDiffEq
using OrdinaryDiffEq, Sundials
using Plots; gr()
ModelingToolkit.@parameters begin
ϕ₀ =0.0401
ϕ₁ =6.41e-05
η =0
k₀ =-0.0065
k₁ =-5.0
k₂ =20
α =0.025
δ =0.05
β =0.01
r =0.03
cor=3.0
fp =0.333
b =0.135
end
@variables begin
t
ω(t) = 0.75
λ(t) = 0.90
d₁(t) = 0.5
π(t)
Y(t) = 100.0
𝛎(t)
𝛟(t)
κ(t)
g(t)
end
Dₜ = Differential(t)
# Variable functions
𝛎 = (t, ω) -> (1/fp)*((1 - ω)/b)^η
𝛟 = (t, λ) -> -ϕ₀ + ϕ₁/((1 - λ)^2)
κ = (t, π) -> k₀ + exp(k₁ + k₂*π)
g = (t, ω, π) -> κ(t, π)/𝛎(t, ω) - δ
sfcsys = [
Dₜ(ω) ~ ω*(𝛟(t, λ) - α)
Dₜ(λ) ~ λ*(g(t, ω, π) - α - β)
Dₜ(d₁) ~ d₁*(r - κ(t, π)/𝛎(t, ω)) + ω - 1 + κ(t, π)
Dₜ(Y) ~ Y*g(t, ω, π)
0. ~ π - 1 + ω + r*d₁
]
ModelingToolkit.@named primalKeen = ModelingToolkit.ODESystem(sfcsys)
tspan = (0.0, 80.0)
rdKeen = structural_simplify(primalKeen)
full_equations(rdKeen)
dt=0.25
odaeprob = ODAEProblem(rdKeen,[],tspan)
odae_sol = solve(odaeprob,Tsit5(),abstol=1/10^14,reltol=1/10^14, saveat=dt, maxiters=1e7); # 11 ms
prob = ContinuousDataDrivenProblem(odae_sol)
@parameters t
# @variables ω(t) λ(t) d₁(t) π(t) Y(t)
@variables u[1:4](t)
Ψ = Basis([u; u[1]^2], [u; u[2]^2], [u; u[3]^2], u, iv = t)
#Ψ = Basis(polynomial_basis(u, 2), iv = t) #[u; u[1]^2], u, independent_variable = t)
res = solve(prob, Ψ, DMDPINV(), digits = 1)
system = result(res)
#md println(res) # hide
#md println(system) # hide
#md println(parameters(res)) # hide
# The underlying dynamics have been recovered correctly by the algorithm!
gives the error:
MethodError: no method matching Basis(::Vector{Num}, ::Vector{Num}, ::Vector{Num}, ::Symbolics.Arr{Num, 1}; iv=t)
Closest candidates are:
Basis(::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) at C:\Users\Denis.julia\packages\DataDrivenDiffEq\hVSnE\src\basis\type.jl:50 got unsupported keyword argument "iv"
Basis(::Vector{Equation}, ::Vector, ::Vector, ::Vector, ::Vector, ::Num, ::Function, ::Symbol, ::Vector{Basis}) at C:\Users\Denis.julia\packages\DataDrivenDiffEq\hVSnE\src\basis\type.jl:50 got unsupported keyword argument "iv"
Basis(::AbstractVector, ::AbstractVector; parameters, iv, controls, observed, name, simplify, linear_independent, eval_expression, kwargs...) at C:\Users\Denis.julia\packages\DataDrivenDiffEq\hVSnE\src\basis\type.jl:72
...
Stacktrace:
[1] top-level scope @ In[3]:4
[2] eval @ .\boot.jl:373 [inlined]
[3] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String) @ Base .\loading.jl:1196
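The error message suggests why the call fails: the only keyword-accepting candidate listed takes two positional vectors (terms and states), while the call passes three term vectors plus the states. The plain-Julia sketch below reproduces that dispatch behaviour with a hypothetical stand-in `toy_basis`; it is not the package's `Basis`, and symbols are used in place of symbolic variables.

```julia
# Hypothetical stand-in with the same positional shape as the third candidate
# in the error message: (terms, states; iv). Not the real DataDrivenDiffEq.Basis.
toy_basis(terms::AbstractVector, states::AbstractVector; iv = nothing) =
    (terms = terms, states = states, iv = iv)

u = [:u1, :u2, :u3, :u4]

# Three separate term vectors plus the states: no method matches
threw = try
    toy_basis([u; :u1_sq], [u; :u2_sq], [u; :u3_sq], u; iv = :t)
    false
catch err
    err isa MethodError
end

# Concatenating every candidate term into ONE vector matches the two-argument method
terms = [u; :u1_sq; :u2_sq; :u3_sq]
b = toy_basis(terms, u; iv = :t)
```

By analogy, the call in the MWE would presumably become `Basis([u; u[1]^2; u[2]^2; u[3]^2], u, iv = t)`. That is an untested guess based only on the signature shown in the error message.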
To run ContinuousDataDrivenProblem on the model below, which I want to run as an implicit system discovery, I get: MethodError: objects of type ODESystem are not callable in forming the DX matrix. This happens with both ODEProblem and DAEProblem.
using DataDrivenDiffEq
using LinearAlgebra
using ModelingToolkit
using DifferentialEquations: solve
using NonlinearSolve
using OrdinaryDiffEq, Sundials
using Plots; gr()
using Plots.PlotMeasures
using Random
using Symbolics: scalarize
using Latexify
import ModelingToolkit: Interval, infimum, supremum
ModelingToolkit.@parameters begin
end
@variables begin
t
ω(t) = 0.75
λ(t) = 0.90
d₁(t) = 0.5
ι(t)
y₁(t) = 0.735
Y(t) = 100.0
Φ(t)
ξ(t)
g(t)
end
D = Differential(t)
tmax = 100.0
domains = [t ∈ Interval(0.0, tmax), ω ∈ Interval(0.0, 1.0), λ ∈ Interval(0.0, 1.0), d₁ ∈ Interval(0.0, 1.8), y₁ ∈ Interval(0.0, 1.0), Y ∈ Interval(0.0,10.0^5)]
Φ = -Φ₁ + Φ₂/((1. - λ)^2)
ι = η₁(mω − 1.)
ξ = ξ₁/((1. + ξ₂exp(ξ₃y₁))^(1/η₂))
g = (1. - ξ)C((1 - ω)/b)^(1/η) - δ - ω/((1 - ω)(1 + η))(Φ - α - (1 - γ)*ι)
sfc = [
D(ω) ~ ω((η/(1 + η))(Φ - α - (1 - γ)ι))
D(λ) ~ λ(((1. - ξ)C((1 - ω)/b)^(1/η) - (δ + β + α) -(1/((1 + η)(1 - ω)))(Φ - α - (1 - γ)ι)))
D(d₁) ~ d₁(r - g - ι) + ξ - ω - re₁
D(Y) ~ Yg
keen_simplified = structural_simplify(keen)
tspan = (0.0, tmax)
dt=0.25
mmprob = ODEProblem(keen_simplified, [], tspan)
sol = solve(mmprob, Rodas4(),abstol=1/10^14,reltol=1/10^14,saveat=dt);
du = mmprob.f(mmprob.u0,mmprob.p,0.0)
du0 = D.(states(keen_simplified)) .=> du
daeprob = DAEProblem(keen_simplified,du0,[],tspan)
ref_sol = solve(daeprob,IDA(),abstol=1/10^14,reltol=1/10^14,saveat=dt);
X = ref_sol[:,:]
DX = similar(X)
for (i, xi) in enumerate(eachcol(X))
    DX[:, i] = keen_simplified(xi, [], ref_sol.t[i])
end
ts = ref_sol.t
ddprob = ContinuousDataDrivenProblem(ref_sol)
ddprob = ContinuousDataDrivenProblem( X , ts, DX = DX[1:3, :] )
gives the error:
MethodError: objects of type ODESystem are not callable
Stacktrace:
[1] top-level scope @ .\In[10]:4
[2] eval @ .\boot.jl:373 [inlined]
[3] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String) @ Base .\loading.jl:1196