ChrisRackauckas / universal_differential_equations

Repository for the Universal Differential Equations for Scientific Machine Learning paper, describing a computational basis for high performance SciML
https://arxiv.org/abs/2001.04385
MIT License

UDE giving Linear Approximation / NeuralODEMM problems #43

Open · ghost opened 1 year ago

ghost commented 1 year ago

I also had a similar issue with the first part of the SEIR example a few weeks back, though the second part seemed to work.

I copied the example here: https://github.com/ChrisRackauckas/universal_differential_equations/blob/master/LotkaVolterra/scenario_1.jl

And put in my own function. Everything else is the same, except that I didn't add noise.

rbf(x) = exp.(-(x .^ 2))  # radial basis activation, as defined in scenario_1.jl

U = Lux.Chain(
    Lux.Dense(5, 3, rbf), Lux.Dense(3, 3, rbf), Lux.Dense(3, 3, rbf), Lux.Dense(3, 5)
)
# Get the initial parameters and state variables of the model
p, st = Lux.setup(rng, U)

function ude_dynamics!(du, u, p, t, p_true)
    û = U(u, p, st)[1]  # network prediction (p holds the network weights)
    p1, p2, p3, p4, p5, p6, p7, p8, p9 = p_true  # known parameters, passed in via the closure below
    a, b, c, d, e = u
    da = û[1]  # the da equation is the one replaced by the network
    db = c - p8 - (p3 + p4*p9 + (-p5*e) / a)*p9*a
    dc = p6*((p3 + p4*p9 + (-p5*e) / a)*p9*a + d + p8) - c
    dd = p7*(p2*a + p1*e) - d
    de = (p3 + p4*p9 + (-p5*e) / a)*p9*a + d + p8 - c - e
    du[1] = da; du[2] = db; du[3] = dc; du[4] = dd; du[5] = de
end

# Initial system; the equation for da that the network should recover:
# da = (p3 + p4*p9 + (-p5*e) / a)*p9*a + p8 - c
tspan = (0.0, 90.0)
u0 = Float32[86.5, 21.62, 21.62, 86.5, 86.5]
p_ = Float32[0.6, 0.4, 0.635, 5, 0.01, 0.2, 0.3, 20, 0.025]
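For completeness, the problem is wrapped exactly as in scenario_1.jl (sketch; X is the training data array and t its time points, as in that script):

# Closure with the known parameters, as in scenario_1.jl
nn_dynamics!(du, u, p, t) = ude_dynamics!(du, u, p, t, p_)
# Plain ODEProblem started from the first data column
prob_nn = ODEProblem(nn_dynamics!, X[:, 1], tspan, p)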

However, it's giving me a linear approximation for da, even though BFGS finishes well before the maximum number of iterations.

[Image: plot of the trained UDE fit, with the recovered da essentially linear]

I have tried multiple solvers with the above:

- Vern7 and KenCarp4 work, but give a linear approximation for da.
- Tsit5 aborts as unstable when BFGS runs (Tsit5 solved the original ODE fine).
- I tried a few others, Rosenbrock23 and Rodas4, but these don't run at all.

I tried running it with 36 neurons in the chain; it finished ADAM and then hung all night on BFGS.
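(The training loop itself is the two-stage one from scenario_1.jl, roughly as below; a sketch from memory, with loss being the usual prediction loss against X:)

# Two-stage optimisation as in scenario_1.jl: ADAM first, then BFGS to polish
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, ComponentVector{Float64}(p))
res1 = Optimization.solve(optprob, ADAM(0.1), maxiters = 5000)
optprob2 = Optimization.OptimizationProblem(optf, res1.u)
res2 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm = 0.01), maxiters = 1000)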

The original system was a DAE, which I ran through ModelingToolkit's structural_simplify(dae_index_lowering(...)) and then took the full equations, which eliminated the algebraic constraints. I then tried using a mass matrix with no constraints and NeuralODEMM, in case the constraints were the issue. The example I was working from for this is here: Readme · DiffEqFlux.jl (juliahub.com). However, it's so different from the Lotka-Volterra model that I had difficulty translating between the two.
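(For reference, the index-lowering step was along these lines; a sketch, where sys stands for the ModelingToolkit ODESystem of the original DAE:)

# Hypothetical names: sys is the ODESystem built for the original DAE
simplified = structural_simplify(dae_index_lowering(sys))
eqs = full_equations(simplified)  # expanded equations, algebraic constraints eliminated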

# For the original system: 5×5 identity mass matrix, i.e. no algebraic rows
M = [1.0 0 0 0 0
     0 1.0 0 0 0
     0 0 1.0 0 0
     0 0 0 1.0 0
     0 0 0 0 1.0]
#######
# Closure with the known parameters
nn_dynamics!(du, u, p, t) = ude_dynamics!(du, u, p, t, p_)
# Define the problem
prob_nn = NeuralODEMM(nn_dynamics!, X[:, 1], tspan, M, p)

## Function to train the network
# Define a predictor
function predict(θ, X = X[:,1], T = t)
    Array(solve(prob_nn, Rodas5(autodiff=false), u0 = X, p=θ,
                saveat = T,
                abstol=1e-8, reltol=1e-8,
                sensealg = ForwardDiffSensitivity()
                ))
end

I got a really long error; the first part was:

ERROR: MethodError: no method matching init(::NeuralODEMM{typeof(nn_dynamics!), Vector{Float32}, Vector{Bool}, Optimisers.Restructure{typeof(nn_dynamics!), Tuple{}}, Tuple{Float64, Float64}, Matrix{Float64}, Tuple{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), NTuple{4, NamedTuple{(:weight, :bias), Tuple{Matrix{Float32}, Matrix{Float32}}}}}}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::Rodas5{0, false, Nothing, typeof(OrdinaryDiffEq.DEFAULT_PRECS), Val{:forward}, true, nothing}; u0=Float32[86.5, 21.62, 21.62, 86.5, 86.5], p=(layer_1 = (weight = Float32[-0.54 ...
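If I'm reading that error right, solve is being handed the NeuralODEMM layer itself rather than an ODEProblem, so there is no matching init method. Since M is just the identity here, I suspect a plain mass-matrix ODEProblem would sidestep NeuralODEMM entirely; a sketch (untested) of what I mean:

# Hypothetical alternative: standard mass-matrix ODE via ODEFunction
f_mm = ODEFunction(nn_dynamics!, mass_matrix = M)
prob_mm = ODEProblem(f_mm, X[:, 1], tspan, p)

function predict(θ, X0 = X[:, 1], T = t)
    Array(solve(prob_mm, Rodas5(autodiff = false); u0 = X0, p = θ,
                saveat = T, abstol = 1e-8, reltol = 1e-8,
                sensealg = ForwardDiffSensitivity()))
end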

I then tried to go back to the original DAE and make it work with the NeuralODEMM example here: Readme · DiffEqFlux.jl (juliahub.com). The example works fine, but when I type mine in, it tells me "a" or "p1" etc. are not recognised. I put it in u[1], p[1] format, and it tells me u is not recognised. I tried all kinds of brackets in case the problem was that I have more than one constraint, and I tried deleting two of the constraints. I'm not sure what is different from the example.

function f(du, u, p, t)
    # note: the state variable f below shadows the function name inside the body
    a, b, c, d, e, f, g, h = u
    p1, p2, p3, p4, p5, p6, p7, p8, p9 = p
    du[1] = (p8*f) + p9 - c
    du[2] = c + d - (p8*f) - g
    du[3] = p6*(g + (p8*f)) - c
    du[4] = (p1*e + p2*a)*p7 - d
    du[5] = g + p8*f - c - e
    du[6] = a*(p3 + p4*p8 - p5*(e/a)) - f  # algebraic constraint
    du[7] = d + p9 - g                     # algebraic constraint
    du[8] = a - f - h                      # algebraic constraint
end
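If I follow the readme example correctly, rows 6 to 8 are the algebraic equations here, so the mass matrix would presumably be (sketch):

# Hypothetical 8×8 mass matrix: zero rows for the three algebraic equations
M = [1.0 0 0 0 0 0 0 0
     0 1.0 0 0 0 0 0 0
     0 0 1.0 0 0 0 0 0
     0 0 0 1.0 0 0 0 0
     0 0 0 0 1.0 0 0 0
     0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0]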

#...
ndae = NeuralODEMM(dudt2, (u,p,t) -> [a*p3 + p4*p8 - p5*(e/a)) - f],[d + p9 - g],[a - f - h], tspan, M, Rodas5(autodiff=false),saveat=0.1)
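For what it's worth, I think the scoping problem is that a, e, f, etc. only exist inside f, not inside the anonymous constraints function, and that all the residuals have to go into a single returned vector. A sketch (hypothetical, untested) of what I believe the readme's pattern would look like here; dudt2 would then return only the five differential right-hand sides, since NeuralODEMM appends the constraint residuals itself:

# All three algebraic residuals in one vector, indexed through u and p
constraints = (u, p, t) -> [u[1]*(p[3] + p[4]*p[8] - p[5]*(u[5]/u[1])) - u[6],
                            u[4] + p[9] - u[7],
                            u[1] - u[6] - u[8]]
ndae = NeuralODEMM(dudt2, constraints, tspan, M, Rodas5(autodiff = false), saveat = 0.1)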