ChrisRackauckas / universal_differential_equations

Repository for the Universal Differential Equations for Scientific Machine Learning paper, describing a computational basis for high performance SciML
https://arxiv.org/abs/2001.04385
MIT License

Instability detected in Fisher-KPP example #35

Closed · mattlie82 closed this issue 3 years ago

mattlie82 commented 3 years ago

Hi, when running the Fisher-KPP example (Fisher-KPP-CNN.jl), everything works fine until the network is trained with BFGS, at which point the run aborts due to a detected instability.

I guess it has something to do with ReverseDiffVJP?
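
By ReverseDiffVJP I mean the vector-Jacobian-product backend used by the adjoint, which enters through the sensealg keyword of the solve call inside the loss. Roughly like the toy sketch below, though the actual call in Fisher-KPP-CNN.jl builds the right-hand side from the CNN and looks different.

```julia
# Toy sketch only: shows where a ReverseDiffVJP choice would sit, not the
# exact line from the example script.
using OrdinaryDiffEq, DiffEqSensitivity

f(u, p, t) = p .* u                      # stand-in RHS; the script uses a CNN
prob = ODEProblem(f, [1.0], (0.0, 1.0), [-0.5])

sol = solve(prob, Tsit5(),
            sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)))
```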

┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/integrator_interface.jl:351
ERROR: LoadError: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 11 and 2")
Stacktrace:
  [1] _bcs1
    @ ./broadcast.jl:501 [inlined]
  [2] _bcs (repeats 2 times)
    @ ./broadcast.jl:495 [inlined]
  [3] broadcast_shape
    @ ./broadcast.jl:489 [inlined]
  [4] combine_axes
    @ ./broadcast.jl:484 [inlined]
  [5] instantiate
    @ ./broadcast.jl:266 [inlined]
  [6] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(-), Tuple{Matrix{Float64}, Matrix{Float64}}})
    @ Base.Broadcast ./broadcast.jl:883
  [7] loss_rd(p::Vector{Float64})
    @ Main ~/Documents/Workshops/FUHRI2021/P4-Machine-Learning-And-PDEs/universal_differential_equations/FisherKPP/Fisher-KPP-CNN.jl:142
  [8] (::DiffEqFlux.var"#69#70"{typeof(loss_rd)})(x::Vector{Float64}, p::SciMLBase.NullParameters)
    @ DiffEqFlux ~/.julia/packages/DiffEqFlux/alPQ3/src/train.jl:3
  [9] (::OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing})(::Vector{Float64}, ::Vararg{Any, N} where N) (repeats 2 times)
    @ SciMLBase ~/.julia/packages/SciMLBase/9EjAY/src/problems/basic_problems.jl:107
WARNING: both Iterators and Flux export "flatten"; uses of it in module GalacticOptim must be qualified
WARNING: both Flux and Tracker export "gradient"; uses of it in module GalacticOptim must be qualified
 [10]
(::GalacticOptim.var"#17#25"{OptimizationProblem{false, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, Vector{Float64}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:cb, :maxiters), Tuple{var"#7#10", Int64}}}}, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}})(θ::Vector{Float64})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/JnLwV/src/solve.jl:165
 [11] (::GalacticOptim.var"#18#26"{GalacticOptim.var"#17#25"{OptimizationProblem{false, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, Vector{Float64}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:cb, :maxiters), Tuple{var"#7#10", Int64}}}}, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}}, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}})(G::Vector{Float64}, θ::Vector{Float64})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/JnLwV/src/solve.jl:173
 [12] value_gradient!!(obj::TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/geyh3/src/interface.jl:82
 [13] value_gradient!(obj::TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}, x::Vector{Float64})
    @ NLSolversBase ~/.julia/packages/NLSolversBase/geyh3/src/interface.jl:69
 [14] value_gradient!(obj::Optim.ManifoldObjective{TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, x::Vector{Float64})
    @ Optim ~/.julia/packages/Optim/uwNqi/src/Manifolds.jl:50
 [15] (::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}})(α::Float64)
    @ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/LineSearches.jl:84
 [16] secant2!(ϕdϕ::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}}, alphas::Vector{Float64}, values::Vector{Float64}, slopes::Vector{Float64}, ia::Int64, ib::Int64, phi_lim::Float64, delta::Float64, sigma::Float64, display::Int64)
    @ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:368
 [17] (::LineSearches.HagerZhang{Float64, Base.RefValue{Bool}})(ϕ::Function, ϕdϕ::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}}, c::Float64, phi_0::Float64, dphi_0::Float64)
    @ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:269
 [18] HagerZhang
    @ ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:101 [inlined]
 [19] perform_linesearch!(state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, d::Optim.ManifoldObjective{TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}})
    @ Optim ~/.julia/packages/Optim/uwNqi/src/utilities/perform_linesearch.jl:59
 [20] update_state!(d::TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat})
    @ Optim ~/.julia/packages/Optim/uwNqi/src/multivariate/solvers/first_order/bfgs.jl:129
 [21] optimize(d::TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}, initial_x::Vector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, GalacticOptim.var"#_cb#24"{var"#7#10", BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}}}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}})
    @ Optim ~/.julia/packages/Optim/uwNqi/src/multivariate/optimize/optimize.jl:57
 [22] optimize(d::TwiceDifferentiable{Float64, Vector{Float64}, Matrix{Float64}, Vector{Float64}}, initial_x::Vector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, GalacticOptim.var"#_cb#24"{var"#7#10", BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}}})
    @ Optim ~/.julia/packages/Optim/uwNqi/src/multivariate/optimize/optimize.jl:35
 [23] __solve(prob::OptimizationProblem{false, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#146#156"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#149#159"{GalacticOptim.var"#145#155"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#69#70"{typeof(loss_rd)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#154#164", Nothing, Nothing, Nothing}, Vector{Float64}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:cb, :maxiters), Tuple{var"#7#10", Int64}}}}, opt::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/JnLwV/src/solve.jl:182
 [24] #solve#468
    @ ~/.julia/packages/SciMLBase/9EjAY/src/solve.jl:3 [inlined]
 [25] sciml_train(::typeof(loss_rd), ::Vector{Float64}, ::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, ::GalacticOptim.AutoZygote; lower_bounds::Nothing, upper_bounds::Nothing, kwargs::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:cb, :maxiters), Tuple{var"#7#10", Int64}}})
    @ DiffEqFlux ~/.julia/packages/DiffEqFlux/alPQ3/src/train.jl:6
 [26] top-level scope
ChrisRackauckas commented 3 years ago

Is it BFGS(initial_stepnorm=0.01)? It seems like the starting linesearch may have failed.
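
i.e., roughly this change in the training call, a sketch assuming loss_rd, p, and the callback cb are as defined in Fisher-KPP-CNN.jl (the maxiters value here is just illustrative):

```julia
using DiffEqFlux, Optim

# Only the BFGS constructor changes: giving it a small initial step norm keeps
# the first Hager-Zhang line search from taking a destabilizing step.
res = DiffEqFlux.sciml_train(loss_rd, p, BFGS(initial_stepnorm = 0.01),
                             cb = cb, maxiters = 100)
```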

mattlie82 commented 3 years ago

Yes, you are right. The initial step norm was not set. Setting it solves the issue; the example now runs.

By the way, what is the reason to use concrete_solve instead of solve?

ChrisRackauckas commented 3 years ago

It's just an old version.
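
Roughly, the old concrete_solve(prob, alg, u0, p; kwargs...) form passes the same information that newer code gives directly to solve as keyword arguments. A self-contained toy sketch (the real prob, u0, and p come from the discretized Fisher-KPP setup):

```julia
using OrdinaryDiffEq

# Toy problem just to make the snippet runnable.
f(u, p, t) = p .* u
u0 = [1.0]
p  = [-0.5]
prob = ODEProblem(f, u0, (0.0, 1.0), p)

# Older API, as used in the example script:
# sol = concrete_solve(prob, Tsit5(), u0, p, saveat = 0.1)

# Current API: same data, passed through solve keywords.
sol = solve(prob, Tsit5(), u0 = u0, p = p, saveat = 0.1)
```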

mattlie82 commented 3 years ago

I'll submit a fix.