00krishna opened this issue 2 years ago
I was continuing to work on this problem. I tried to step through the error and thought this might be related to the AD backend. But it seems like even with ForwardDiff, the error still occurs.
result_ode_uode = DiffEqFlux.sciml_train(loss_uode,
                                         pinit,
                                         ADAM(0.1),
                                         adtype = GalacticOptim.AutoZygote(),
                                         cb = callback_display,
                                         maxiters = 100)
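For reference, the ForwardDiff attempt would be the same call with only the adtype swapped, something like this (my reconstruction, not the exact snippet; AutoForwardDiff() is the assumed replacement):

# Reconstructed sketch of the ForwardDiff attempt (adtype is the assumption)
result_ode_uode = DiffEqFlux.sciml_train(loss_uode,
                                         pinit,
                                         ADAM(0.1),
                                         adtype = GalacticOptim.AutoForwardDiff(),
                                         cb = callback_display,
                                         maxiters = 100)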
Okay, this problem seems to also happen in the demo hudson_bay.jl code after the first BFGS optimization call using the shooting_loss.
Here is the code that I ran, and then the error message. It is identical to the error message I received when running my code in the OP.
## Environment and packages
cd(@__DIR__)
using Pkg; Pkg.activate("."); Pkg.instantiate()
using DifferentialEquations
using ModelingToolkit
using DataDrivenDiffEq
using LinearAlgebra, Optim
using DiffEqFlux
using Flux
using Plots
gr()
using JLD2, FileIO
using Statistics
using DelimitedFiles
# Set a random seed for reproducible behaviour
using Random
Random.seed!(5443)
#### NOTE
# Since the recent release of DataDrivenDiffEq v0.6.0, where a complete overhaul of the optimizers took
# place, SR3 had been used. Right now, STLSQ performs better, so the code has been changed to use it.
# Additionally, the behaviour of the optimization has changed slightly. This has been adjusted
# for by decreasing the tolerance of the gradient.
svname = "HudsonBay"
## Data Preprocessing
# The data has been taken from https://jmahaffy.sdsu.edu/courses/f00/math122/labs/labj/q3v1.htm
# Originally published in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders
hudson_bay_data = readdlm("hudson_bay_data.dat", '\t', Float32, '\n')
# Measurements of prey and predator
Xₙ = Matrix(transpose(hudson_bay_data[:, 2:3]))
t = hudson_bay_data[:, 1] .- hudson_bay_data[1, 1]
# Normalize the data; since the data domain is strictly positive
# we just need to divide by the maximum
xscale = maximum(Xₙ, dims = 2)
Xₙ .= 1f0 ./ xscale .* Xₙ
# Time from 0 -> n
tspan = (t[1], t[end])
# Plot the data
scatter(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")
plot!(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")
## Direct Identification via SINDy + Collocation
# Create the problem using a Gaussian kernel for collocation
full_problem = ContinuousDataDrivenProblem(Xₙ, t, DataDrivenDiffEq.GaussianKernel())
# Look at the collocation
plot(full_problem.t, full_problem.X')
plot(full_problem.t, full_problem.DX')
# Create a Basis
@variables u[1:2]
# Generate the basis functions: multivariate polynomials up to degree 5
# and sine
b = [polynomial_basis(u, 5); sin.(u)]
basis = Basis(b, u)
# Create the thresholds which should be used in the search process
λ = Float32.(exp10.(-7:0.1:5))
# Create an optimizer for the SINDy problem
opt = STLSQ(λ)
# Best result so far
full_res = solve(full_problem, basis, opt, maxiter = 10000, progress = true, denoise = true, normalize = true)
println(full_res)
println(result(full_res))
## Define the network
# Gaussian RBF as activation
rbf(x) = exp.(-(x.^2))
# Define the network 2->5->5->5->2
U = FastChain(
    FastDense(2, 5, rbf), FastDense(5, 5, rbf), FastDense(5, 5, tanh), FastDense(5, 2)
)
# Get the initial parameters; the first two are the linear birth / decay rates of prey and predator
p = [rand(Float32,2); initial_params(U)]
# Define the hybrid model
function ude_dynamics!(du, u, p, t)
    û = U(u, p[3:end]) # Network prediction
    # We assume a linear birth rate for the prey
    du[1] = p[1]*u[1] + û[1]
    # We assume a linear decay rate for the predator
    du[2] = -p[2]*u[2] + û[2]
end
# Define the problem
prob_nn = ODEProblem(ude_dynamics!, Xₙ[:, 1], tspan, p)
## Function to train the network
# Define a predictor
function predict(θ, X = Xₙ[:, 1], T = t)
    Array(solve(prob_nn, Vern7(), u0 = X, p = θ,
                tspan = (T[1], T[end]), saveat = T,
                abstol = 1e-6, reltol = 1e-6,
                sensealg = ForwardDiffSensitivity()))
end
# Define parameters for Multiple Shooting
group_size = 5
continuity_term = 200.0f0
function loss(data, pred)
    return sum(abs2, data - pred)
end
function shooting_loss(p)
    return multiple_shoot(p, Xₙ, t, prob_nn, loss, Vern7(),
                          group_size; continuity_term)
end
function loss(θ)
    X̂ = predict(θ)
    sum(abs2, Xₙ - X̂) / size(Xₙ, 2) + convert(eltype(θ), 1e-3)*sum(abs2, θ[3:end]) ./ length(θ[3:end])
end
# Container to track the losses
losses = Float32[]
# Callback to show the loss during training
callback(θ, args...) = begin
    l = loss(θ) # Equivalent L2 loss
    push!(losses, l)
    if length(losses) % 5 == 0
        println("Current loss after $(length(losses)) iterations: $(losses[end])")
    end
    false
end
## Training -> First shooting / batching to get a rough estimate
# First train with ADAM for better convergence -> move the parameters into a
# favourable starting position for BFGS
res1 = DiffEqFlux.sciml_train(shooting_loss, p, ADAM(0.1f0), cb=callback, maxiters = 100)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Train with BFGS to achieve partial fit of the data
res2 = DiffEqFlux.sciml_train(shooting_loss, res1.minimizer, BFGS(initial_stepnorm=0.01f0), cb=callback, maxiters = 500) # <---- THIS IS WHERE THE ERROR IS OCCURRING
Here is the error message.
ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type Float32
Closest candidates are:
convert(::Type{T}, ::Static.StaticFloat64) where T<:AbstractFloat at ~/.julia/packages/Static/R0QTo/src/float.jl:22
convert(::Type{T}, ::LLVM.GenericValue, ::LLVM.LLVMType) where T<:AbstractFloat at ~/.julia/packages/LLVM/tVv0H/src/execution.jl:39
convert(::Type{T}, ::LLVM.ConstantFP) where T<:AbstractFloat at ~/.julia/packages/LLVM/tVv0H/src/core/value/constant.jl:103
...
Stacktrace:
[1] fill!(dest::Vector{Float32}, x::Nothing)
@ Base ./array.jl:351
[2] copyto!
@ ./broadcast.jl:921 [inlined]
[3] materialize!
@ ./broadcast.jl:871 [inlined]
[4] materialize!(dest::Vector{Float32}, bc::Base.Broadcast.Broad)
@ Base.Broadcast ./broadcast.jl:868
[5] (::GalacticOptim.var"#268#278"{GalacticOptim.var"#267#277"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#84#89"{typeof(shooting_loss)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}})(::Vector{Float32}, ::Vector{Float32})
@ GalacticOptim ~/.julia/packages/GalacticOptim/fow0r/src/function/zygote.jl:8
[6] (::GalacticOptim.var"#144#152"{…})(G::Vector{Float32}, θ::Vector{Float32})
@ GalacticOptim ~/.julia/packages/GalacticOptim/fow0r/src/solve/optim.jl:93
[7] value_gradient!!(obj::TwiceDifferentiable{, x::Vector{Float32})
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/interface.jl:82
[8] value_gradient!(obj::TwiceDifferentiable{, x::Vector{Float32})
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/interface.jl:69
[9] value_gradient!(obj::Optim.ManifoldObject, x::Vector{Float32})
@ Optim ~/.julia/packages/Optim/wFOeG/src/Manifolds.jl:50
[10] (::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{TwiceDifferentiable{Float32, Vector{Float32}, Matrix{Float32}, Vector{Float32}}}, Vector{Float32}, Vector{Float32}, Vector{Float32}})(α::Float32)
@ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/LineSearches.jl:84
[11] (::LineSearches.HagerZhang{Float64, Base.RefValue{Bool}})(ϕ::Function, ϕdϕ::LineSearches.var"#ϕd, c::Float32, phi_0::Float32, dphi_0::Float32)
@ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:139
[12] HagerZhang
@ ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:101 [inlined]
[13] perform_linesearch!(state::Optim.BFGSState{Vect, method::BFGS{LineSearches.In, d::Optim.ManifoldObject)
@ Optim ~/.julia/packages/Optim/wFOeG/src/utilities/perform_linesearch.jl:59
[14] update_state!(d::TwiceDifferentiable{, state::Optim.BFGSState{Vect, method::BFGS{LineSearches.In)
@ Optim ~/.julia/packages/Optim/wFOeG/src/multivariate/solvers/first_order/bfgs.jl:139
[15] optimize(d::TwiceDifferentiable{, initial_x::Vector{Float32}, method::BFGS{LineSearches.In, options::Optim.Options{Float6, state::Optim.BFGSState{Vect)
@ Optim ~/.julia/packages/Optim/wFOeG/src/multivariate/optimize/optimize.jl:54
[16] optimize(d::TwiceDifferentiable{, initial_x::Vector{Float32}, method::BFGS{LineSearches.In, options::Optim.Options{Float6)
@ Optim ~/.julia/packages/Optim/wFOeG/src/multivariate/optimize/optimize.jl:36
[17] ___solve(prob::OptimizationProblem{, opt::BFGS{LineSearches.In, data::Base.Iterators.Cycle; cb::Function, maxiters::Int64, maxtime::Nothing, abstol::Nothing, reltol::Nothing, progress::Bool, kwargs::Base.Pairs{Symbol, U)
@ GalacticOptim ~/.julia/packages/GalacticOptim/fow0r/src/solve/optim.jl:129
[18] #__solve#141
@ ~/.julia/packages/GalacticOptim/fow0r/src/solve/optim.jl:49 [inlined]
[19] #solve#482
@ ~/.julia/packages/SciMLBase/nbKmA/src/solve.jl:3 [inlined]
[20] sciml_train(::typeof(shooting_loss, ::Vector{Float32}, ::BFGS{LineSearches.In, ::Nothing; lower_bounds::Nothing, upper_bounds::Nothing, maxiters::Int64, kwargs::Base.Pairs{Symbol, t)
@ DiffEqFlux ~/.julia/packages/DiffEqFlux/gH716/src/train.jl:91
[21] top-level scope
@ ~/Dropbox/sandbox/julia/universal_differential_equations/LotkaVolterra/hudson_bay.jl:146
@AlCap23 hopefully you can see this reference. Here is the issue that I mentioned to Chris. For some reason the hudson_bay.jl optimization is failing on the first BFGS run. The error message I received above was about "Cannot `convert` an object of type Nothing to an object of type Float32", but I think the real issue is something else. Interestingly, I am wondering if it has to do with the ForwardDiffSensitivity() for the sensealg, since both my own code and the hudson_bay code use that. We can chat about it if you like.
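One quick way to test that hypothesis (a sketch of my own, not code from the repo) would be to drop the explicit sensealg from the predictor and let the default sensitivity algorithm be chosen:

# Hypothetical diagnostic: same as `predict` above, but without forcing
# ForwardDiffSensitivity(), so the default sensitivity choice is used.
function predict_default_sensealg(θ, X = Xₙ[:, 1], T = t)
    Array(solve(prob_nn, Vern7(), u0 = X, p = θ,
                tspan = (T[1], T[end]), saveat = T,
                abstol = 1e-6, reltol = 1e-6))
end

If training still fails with this predictor, the sensealg choice can be ruled out.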
The problem seems to be with the choice of AD backend. I had to manually set the AD backend to AutoForwardDiff and then it worked. When I used the default AutoZygote or even AutoTracker, the code failed with the error message above.
Here is the updated code.
using GalacticOptim
adtype = GalacticOptim.AutoForwardDiff()
res1 = DiffEqFlux.sciml_train(shooting_loss, p, ADAM(0.1f0), adtype, cb=callback, maxiters = 100)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Train with BFGS to achieve partial fit of the data
res2 = DiffEqFlux.sciml_train(shooting_loss, res1.minimizer, BFGS(initial_stepnorm=0.005f0), adtype, cb=callback, maxiters = 500)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Full L2-Loss for full prediction
res3 = DiffEqFlux.sciml_train(loss, res2.minimizer, BFGS(initial_stepnorm=0.01f0), cb=callback, maxiters = 10000)
println("Final training loss after $(length(losses)) iterations: $(losses[end])")
This is a change from the original code. I am waiting to hear back from Chris and Julius to see whether they want me to just adjust the UODE repo code, or whether they want to dive in deeper to figure out the cause of the problem with Zygote and Tracker, ...
What's the reason for mixing Float64 and Float32? What happens if you make it all the same type?
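For example, a uniformly-Float32 predictor would look something like this (a sketch; the 1f-6 tolerances are illustrative stand-ins for the 1e-6 values above):

# Sketch: all-Float32 version of `predict`, so the tolerances match
# eltype(p) == Float32 (the 1f-6 values are illustrative).
function predict_f32(θ, X = Xₙ[:, 1], T = t)
    Array(solve(prob_nn, Vern7(), u0 = X, p = θ,
                tspan = (T[1], T[end]), saveat = T,
                abstol = 1f-6, reltol = 1f-6,
                sensealg = ForwardDiffSensitivity()))
end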
@ChrisRackauckas, @AlCap23 and I have been looking at this. So far the error seems to be coming from GalacticOptim, potentially from these functions:
Julius thinks that one of the dispatches is not working as expected.
I can try and see what happens if we explicitly set everything to Float32 or Float64. We are continuing to trace the problem, but I will update the issue as we figure it out.
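One probe we can run (my own sketch, using only the functions defined above) is to call Zygote directly on the shooting loss and check whether the gradient comes back as nothing, which would explain the convert-Nothing-to-Float32 failure inside GalacticOptim's in-place gradient copy:

using Zygote
# If Zygote returns `nothing` here, GalacticOptim's copyto! of the gradient
# into a Vector{Float32} would throw exactly the MethodError above.
# `first` covers the case where the loss returns (loss, predictions).
g, = Zygote.gradient(θ -> first(shooting_loss(θ)), p)
g === nothing && @warn "Zygote returned `nothing` for the shooting-loss gradient"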
Okay, I'm travelling and a bit behind so I'm going to archive this assuming @AlCap23 has it under control, but if that's not the case just ping me.
What's the current status here?
We have not found the root of the problem yet. We were thinking it has something to do with the order in which the AD rules are applied, or perhaps with the type that gets identified for dispatch. I was wondering if @avik-pal might help us figure this out, as he seems to have encountered a similar error last week. So I figured I would ping him.
Hello. I was training a simple UODE and am encountering this error when running sciml_train(). The error seems to be somewhere in the interface between DiffEqFlux and GalacticOptim. I have the specific error message, and some code to replicate the problem. Here is the error message text:
Here is the code to replicate the problem. I hardcoded some data so that anyone can precisely replicate the issue. The code itself is based on some of the examples in the Universal ODE GitHub repo.