ChrisRackauckas / universal_differential_equations

Repository for the Universal Differential Equations for Scientific Machine Learning paper, describing a computational basis for high performance SciML
https://arxiv.org/abs/2001.04385
MIT License

Hudson Bay #45

Open ghost opened 1 year ago

ghost commented 1 year ago

I couldn't find an example of multiple shooting in SciMLSensitivity, sorry - it's coming up with the error "UndefVarError: multiple_shoot not defined" when I try to run ADAM.
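For reference, multiple_shoot is exported from DiffEqFlux rather than SciMLSensitivity at the time of writing, so the UndefVarError presumably just means the import is missing. A minimal sketch of the fix:

using DiffEqFlux: multiple_shoot   # multiple_shoot lives in DiffEqFlux, not SciMLSensitivity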

I also tried FENEP, but it seems like there's some issue with Tracked Arrays, here: https://github.com/SciML/SciMLSensitivity.jl/issues/609 And also with Flux, because the original doesn't work with it. I get the error: UndefVarError: TrackedArray not defined. So I'll leave this one.
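For what it's worth, TrackedArray is defined in Tracker.jl, and recent Flux versions use Zygote and no longer re-export Tracker's types, which would explain that UndefVarError. A sketch of a workaround, assuming Tracker.jl has been added to the environment:

using Tracker: TrackedArray   # Tracker.jl provides TrackedArray; Flux no longer re-exports it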

using OrdinaryDiffEq
using ModelingToolkit
using DataDrivenDiffEq
using LinearAlgebra 
using SciMLSensitivity
using Random
using Optimization, OptimizationOptimisers, OptimizationOptimJL #OptimizationOptimisers for ADAM and OptimizationOptimJL for BFGS
using Lux
using Statistics
using JLD2, FileIO
using DelimitedFiles
using Plots
gr()

Random.seed!(5443)

#### NOTE
# Since the recent release of DataDrivenDiffEq v0.6.0, where a complete overhaul of the
# optimizers took place, SR3 had been used. STLSQ now performs better, so the optimizer
# has been changed. Additionally, the behaviour of the optimization has changed slightly;
# this has been adjusted for by decreasing the gradient tolerance.

svname = "HudsonBay"
## Data Preprocessing
# The data has been taken from https://jmahaffy.sdsu.edu/courses/f00/math122/labs/labj/q3v1.htm
# Originally published in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders
hudson_bay_data = readdlm("hudson_bay_data.dat", '\t', Float32, '\n')
# Measurements of prey and predator
Xₙ = Matrix(transpose(hudson_bay_data[:, 2:3]))
t = hudson_bay_data[:, 1] .- hudson_bay_data[1, 1]
# Normalize the data; since the data domain is strictly positive
# we just need to divide by the maximum
xscale = maximum(Xₙ, dims =2)
Xₙ .= 1f0 ./ xscale .* Xₙ
# Time from 0 -> n
tspan = (t[1], t[end])

# Plot the data
scatter(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")
plot!(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")

## Direct Identification via SINDy + Collocation

# Create the problem using a gaussian kernel for collocation
full_problem = ContinuousDataDrivenProblem(Xₙ, t, DataDrivenDiffEq.GaussianKernel())
# Look at the collocation
plot(full_problem.t, full_problem.X')
plot(full_problem.t, full_problem.DX')

# Create a Basis
@variables u[1:2]

# Generate the basis functions, multivariate polynomials up to deg 5
# and sine
b = [polynomial_basis(u, 5); sin.(u)]
basis = Basis(b, u)

# Create the thresholds which should be used in the search process
λ = Float32.(exp10.(-7:0.1:5))
# Create an optimizer for the SINDy problem
opt = STLSQ(λ)

# Best result so far
full_res = solve(full_problem, basis, opt, maxiter = 10000, progress = true, denoise = true, normalize = true)

println(full_res)
println(result(full_res))

## Define the network
# Gaussian RBF as activation
rbf(x) = exp.(-(x.^2))

# Define the network 2->5->5->5->2
U = Lux.Chain(
    Lux.Dense(2,5,rbf), Lux.Dense(5,5, rbf), Lux.Dense(5,5, tanh), Lux.Dense(5,2)
)

# Get the initial parameters, first two is linear birth / decay of prey and predator
rng = Random.default_rng()
p, st = Lux.setup(rng, U)

# Define the hybrid model
function ude_dynamics!(du,u, p, t)
    û = U(u, p[3:end]) # Network prediction
    # We assume a linear birth rate for the prey
    du[1] = p[1]*u[1] + û[1]
    # We assume a linear decay rate for the predator
    du[2] = -p[2]*u[2] + û[2]
end

# Define the problem
prob_nn = ODEProblem(ude_dynamics!,Xₙ[:, 1], tspan, p)

## Function to train the network
# Define a predictor
function predict(θ, X = Xₙ[:,1], T = t)
    Array(solve(prob_nn, Vern7(), u0 = X, p=θ,
                tspan = (T[1], T[end]), saveat = T,
                abstol=1e-6, reltol=1e-6,
                sensealg = ForwardDiffSensitivity()
                ))
end

# Define parameters for Multiple Shooting
group_size = 5
continuity_term = 200.0f0

function loss(data, pred)
    return sum(abs2, data - pred)
end

function shooting_loss(p)
    return multiple_shoot(p, Xₙ, t, prob_nn, loss, Vern7(),
                          group_size; continuity_term)
end

function loss(θ)
    X̂ = predict(θ)
    sum(abs2, Xₙ - X̂) / size(Xₙ, 2) + convert(eltype(θ), 1e-3)*sum(abs2, θ[3:end]) ./ length(θ[3:end])
end

# Container to track the losses
losses = Float32[]

# Callback to show the loss during training
callback(θ,args...) = begin
    l = loss(θ) # Equivalent L2 loss
    push!(losses, l)
    if length(losses)%5==0
        println("Current loss after $(length(losses)) iterations: $(losses[end])")
    end
    false
end

## Training -> First shooting / batching to get a rough estimate

# First train with ADAM for better convergence -> move the parameters into a
# favourable starting position for BFGS
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x,p)->shooting_loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, Lux.ComponentArray(p))
res1 = Optimization.solve(optprob, ADAM(0.1f0), callback=callback, maxiters = 100)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Train with BFGS to achieve partial fit of the data
optprob2 = Optimization.OptimizationProblem(optf, res1.minimizer)
res2 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 500)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Full L2-Loss for full prediction

optf2 = Optimization.OptimizationFunction((x,p)->loss(x), adtype)
optprob2 = Optimization.OptimizationProblem(optf2, res2.minimizer)
res3 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 10000)
println("Final training loss after $(length(losses)) iterations: $(losses[end])")

pl_losses = plot(1:101, losses[1:101], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "ADAM (Shooting)", color = :blue)
plot!(102:302, losses[102:302], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "BFGS (Shooting)", color = :red)
plot!(302:length(losses), losses[302:end], color = :black, label = "BFGS (L2)")
savefig(pl_losses, "plot_losses.png")

ghost commented 1 year ago

Oh, I found a working example in DiffEqFlux that works with Lux & Optimization. Is DiffEqFlux being kept separate, or does it just work there until it's moved over? https://diffeqflux.sciml.ai/stable/examples/multiple_shooting/

Despite this, I get a "no method matching..." error when I add DiffEqFlux and try to use ADAM: MethodError: no method matching (::Lux.Chain{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), Tuple{Lux.Dense{true, typeof(rbf), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(rbf), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(tanh_fast), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(identity), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}}}})(::Vector{Float32}, ::Vector{Float32})...

And even if I try to use PolyOpt, it's the same. This line of the model also appears in red: û = U(u, p[3:end]) # Network prediction

But I don't know why that would be an issue, so I will stop here.
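For reference, the MethodError is consistent with how Lux models are called: a Lux.Chain is stateless, takes (x, ps, st), and returns an (output, state) tuple, so a two-argument call like U(u, p[3:end]) has no matching method. A minimal sketch of the expected call, reusing p and st from Lux.setup:

û, st_new = U(u, p, st)   # Lux returns (prediction, updated state)
û = U(u, p, st)[1]        # equivalent when only the prediction is needed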

ghost commented 1 year ago

Well, I moved on to LV Scenario 2, and then thought it could be that I didn't add st, as in û = U(u, p[3:end], st), or the [1], as in û = U(u, p[3:end], st)[1], like what worked in LV Scenario 1, but then I get ERROR: type Array has no field layer_1

I have the exact same problem with Scenario 2: no method matching..., then type Array has no field layer_1 if I try to add st and [1].

ChrisRackauckas commented 1 year ago

@rajdandekar

ChrisRackauckas commented 1 year ago

Also @AlCap23

RajDandekar commented 1 year ago

> Oh, I found a working example in DiffEqFlux that works with Lux & Optimization. Is DiffEqFlux being kept separate, or does it just work there until it's moved over? https://diffeqflux.sciml.ai/stable/examples/multiple_shooting/
>
> Despite this, I get a "no method matching..." error when I add DiffEqFlux and try to use ADAM: MethodError: no method matching (::Lux.Chain{NamedTuple{(:layer_1, :layer_2, :layer_3, :layer_4), Tuple{Lux.Dense{true, typeof(rbf), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(rbf), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(tanh_fast), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}, Lux.Dense{true, typeof(identity), typeof(Lux.glorot_uniform), typeof(Lux.zeros32)}}}})(::Vector{Float32}, ::Vector{Float32})...
>
> And even if I try to use PolyOpt, it's the same. This line of the model also appears in red: û = U(u, p[3:end]) # Network prediction
>
> But I don't know why that would be an issue, so I will stop here.

@ccrnn You are getting this error because you are trying to access the parameters as p[1], p[3:end], etc. inside the ude_dynamics! function.

What I suggest you do is the following. This should work:

# Define the network 2->5->5->5->2
U = Lux.Chain(
    Lux.Dense(2,5,rbf), Lux.Dense(5,5, rbf), Lux.Dense(5,5, tanh), Lux.Dense(5,2)
)

rng = Random.default_rng()
p1, st = Lux.setup(rng, U)

#for birth, decay parameters -> initializing random values.
parameter_array = Float64[0.5, 0.5]
p = (layer_1 = p1, layer_2 = parameter_array)
p = Lux.ComponentArray(p)

# Define the hybrid model
function ude_dynamics!(du,u, p, t)
    û = U(u, p.layer_1, st)[1] # Network prediction
    # We assume a linear birth rate for the prey
    du[1] = p.layer_2[1]*u[1] + û[1]
    # We assume a linear decay rate for the predator
    du[2] = -p.layer_2[2]*u[2] + û[2]
end
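A quick sanity check of this setup (illustrative only) is to evaluate the right-hand side once and confirm the ComponentArray fields resolve:

du = similar(Xₙ[:, 1])
ude_dynamics!(du, Xₙ[:, 1], p, 0.0)   # should fill du without erroring
p.layer_1   # the network parameters
p.layer_2   # the [birth, decay] parameters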
ghost commented 1 year ago

@RajDandekar Thanks, that works! So the next problem here is at the end, beginning at "# Define the recovered, hybrid model with the rescaled dynamics". When I try to put the function in the û = nn_res(u, p.layer_1, st)[1] format, it gives an error when I run sys = modelingtoolkitize(estimation_prob);

If I don't put it in that format, it will run and seems to work, but the final long-term estimate is way off. There's also a problem with the rough estimate.
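I guess that's because nn_res is a DataDrivenDiffEq result (a symbolic basis), not a Lux model, so it takes plain parameter values rather than (p.layer_1, st), i.e. the original call form:

û = nn_res(u, p[3:end])   # recovered basis evaluated at the recovered parameters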

using OrdinaryDiffEq
using ModelingToolkit
using DataDrivenDiffEq
using LinearAlgebra 
using SciMLSensitivity
using Random
using Optimization, OptimizationOptimisers, OptimizationOptimJL #OptimizationOptimisers for ADAM and OptimizationOptimJL for BFGS
using Lux
using Statistics
using JLD2, FileIO
using DelimitedFiles
using Plots
using DiffEqFlux
gr()
using OptimizationPolyalgorithms
Random.seed!(5443)

#### NOTE
# Since the recent release of DataDrivenDiffEq v0.6.0, where a complete overhaul of the
# optimizers took place, SR3 had been used. STLSQ now performs better, so the optimizer
# has been changed. Additionally, the behaviour of the optimization has changed slightly;
# this has been adjusted for by decreasing the gradient tolerance.

svname = "HudsonBay"
## Data Preprocessing
# The data has been taken from https://jmahaffy.sdsu.edu/courses/f00/math122/labs/labj/q3v1.htm
# Originally published in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders
hudson_bay_data = readdlm("hudson_bay_data.dat", '\t', Float32, '\n')
# Measurements of prey and predator
Xₙ = Matrix(transpose(hudson_bay_data[:, 2:3]))
t = hudson_bay_data[:, 1] .- hudson_bay_data[1, 1]
# Normalize the data; since the data domain is strictly positive
# we just need to divide by the maximum
xscale = maximum(Xₙ, dims =2)
Xₙ .= 1f0 ./ xscale .* Xₙ
# Time from 0 -> n
tspan = (t[1], t[end])

# Plot the data
scatter(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")
plot!(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")

## Direct Identification via SINDy + Collocation

# Create the problem using a gaussian kernel for collocation
full_problem = ContinuousDataDrivenProblem(Xₙ, t, DataDrivenDiffEq.GaussianKernel())
# Look at the collocation
plot(full_problem.t, full_problem.X')
plot(full_problem.t, full_problem.DX')

# Create a Basis
@variables u[1:2]

# Generate the basis functions, multivariate polynomials up to deg 5
# and sine
b = [polynomial_basis(u, 5); sin.(u)]
basis = Basis(b, u)

# Create the thresholds which should be used in the search process
λ = Float32.(exp10.(-7:0.1:5))
# Create an optimizer for the SINDy problem
opt = STLSQ(λ)

# Best result so far
full_res = solve(full_problem, basis, opt, maxiter = 10000, progress = true, denoise = true, normalize = true)

println(full_res)
println(result(full_res))

## Define the network
# Gaussian RBF as activation
rbf(x) = exp.(-(x.^2))

# Define the network 2->5->5->5->2
U = Lux.Chain(
    Lux.Dense(2,5,rbf), Lux.Dense(5,5, rbf), Lux.Dense(5,5, tanh), Lux.Dense(5,2)
)

rng = Random.default_rng()
p1, st = Lux.setup(rng, U)

#for birth, decay parameters -> initializing random values.
parameter_array = Float64[0.5, 0.5]
p = (layer_1 = p1, layer_2 = parameter_array)
p = Lux.ComponentArray(p)

# Define the hybrid model
function ude_dynamics!(du,u, p, t)
    û = U(u, p.layer_1, st)[1] # Network prediction
    # We assume a linear birth rate for the prey
    du[1] = p.layer_2[1]*u[1] + û[1]
    # We assume a linear decay rate for the predator
    du[2] = -p.layer_2[2]*u[2] + û[2]
end

# Define the problem
prob_nn = ODEProblem(ude_dynamics!,Xₙ[:, 1], tspan, p)

## Function to train the network
# Define a predictor
function predict(θ)
    Array(solve(prob_nn, Vern7(), u0 = Xₙ[:, 1], p=θ,
                saveat = t,
                abstol=1e-6, reltol=1e-6,
                sensealg = ForwardDiffSensitivity()
                ))
end

# Define parameters for Multiple Shooting
group_size = 5
continuity_term = 200.0f0

function loss(data, pred)
    return sum(abs2, data - pred)
end

function shooting_loss(p)
    return multiple_shoot(p, Xₙ, t, prob_nn, loss, Vern7(),
                          group_size; continuity_term)
end

function loss(θ)
    X̂ = predict(θ)
    sum(abs2, Xₙ - X̂) / size(Xₙ, 2) + convert(eltype(θ), 1e-3)*sum(abs2, θ[3:end]) ./ length(θ[3:end])
end

# Container to track the losses
losses = Float32[]

# Callback to show the loss during training
callback(θ,args...) = begin
    l = loss(θ) # Equivalent L2 loss
    push!(losses, l)
    if length(losses)%5==0
        println("Current loss after $(length(losses)) iterations: $(losses[end])")
    end
    false
end

## Training -> First shooting / batching to get a rough estimate

# First train with ADAM for better convergence -> move the parameters into a
# favourable starting position for BFGS
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x,p)->shooting_loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, Lux.ComponentArray(p))
res1 = Optimization.solve(optprob, ADAM(0.1f0), callback=callback, maxiters = 100)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Train with BFGS to achieve partial fit of the data
optprob2 = remake(optprob,u0 = res1.u)
res2 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 500)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Full L2-Loss for full prediction

optf2 = Optimization.OptimizationFunction((x,p)->loss(x), adtype)
optprob2 = Optimization.OptimizationProblem(optf2, res2.minimizer)
res3 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 10000)
println("Final training loss after $(length(losses)) iterations: $(losses[end])")

pl_losses = plot(1:101, losses[1:101], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "ADAM (Shooting)", color = :blue)
plot!(102:302, losses[102:302], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "BFGS (Shooting)", color = :red)
plot!(302:length(losses), losses[302:end], color = :black, label = "BFGS (L2)")
savefig(pl_losses, "plot_losses.png")

# Rename the best candidate
p_trained = res3.minimizer

## Analysis of the trained network
# Interpolate the solution
tsample = t[1]:0.5:t[end]
X̂ = predict(p_trained, Xₙ[:,1], tsample)
# Trained on noisy data vs real solution
pl_trajectory = scatter(t, transpose(Xₙ), color = :black, label = ["Measurements" nothing], xlabel = "t", ylabel = "x(t), y(t)")
plot!(tsample, transpose(X̂), color = :red, label = ["UDE Approximation" nothing])
savefig(pl_trajectory, "plot_trajectory_reconstruction.png")

# Neural network guess
Ŷ = U(X̂, p.layer_1, st)[1]

pl_reconstruction = scatter(tsample, transpose(Ŷ), xlabel = "t", ylabel ="U(x,y)", color = :red, label = ["UDE Approximation" nothing])
plot!(tsample, transpose(Ŷ), color = :red, lw = 2, style = :dash, label = [nothing nothing])
savefig(pl_reconstruction, "plots_missingterm_reconstruction.png")
pl_missing = plot(pl_trajectory, pl_reconstruction, layout = (2,1))
savefig(pl_missing, "plot_reconstruction.png")
## Symbolic regression via sparse regression (SINDy based)
# We reuse the basis and optimizer defined at the beginning

t1 = Float32.(tsample)
nn_problem = ContinuousDataDrivenProblem(X̂, t1, DX = Ŷ)

nn_res = solve(nn_problem, basis, opt, maxiter = 10000, progress = true, normalize = false, denoise = true)
println(nn_res)
println(result(nn_res))
plot(nn_res)

# Define the recovered, hybrid model with the rescaled dynamics
function recovered_dynamics!(du,u, p, t)
    û = nn_res(u, p[3:end]) # Network prediction
    du[1] = p[1]*u[1] + û[1]
    du[2] = -p[2]*u[2] + û[2]
end

p_model = [p_trained[1:2];parameters(nn_res)]

estimation_prob = ODEProblem(recovered_dynamics!, Xₙ[:, 1], tspan, p_model)
# Convert for reuse
sys = modelingtoolkitize(estimation_prob);
dudt = ODEFunction(sys);
estimation_prob = ODEProblem(dudt,Xₙ[:, 1], tspan, p_model)
estimate = solve(estimation_prob, Tsit5(), saveat = t)

##  Fit the found model
function loss_fit(θ)
    X̂ = Array(solve(estimation_prob, Tsit5(), p = θ, saveat = t))
    sum(abs2, X̂ .- Xₙ)
end

# Post-fit the model
optlf = Optimization.OptimizationFunction((x,p)->loss_fit(x), adtype)
optprob4 = Optimization.OptimizationProblem(optlf, Lux.ComponentArray(p))
res_fit = Optimization.solve(optprob4, Optim.BFGS(initial_stepnorm=0.1f0), callback=callback, maxiters = 1000)
p_fitted = res_fit.minimizer

# Estimate
estimate_rough = solve(estimation_prob, Tsit5(), saveat = 0.1*mean(diff(t)), p = p_model)
estimate = solve(estimation_prob, Tsit5(), saveat = 0.1*mean(diff(t)), p = p_fitted)

# Plot
pl_fitted = plot(t, transpose(Xₙ), style = :dash, lw = 2,color = :black, label = ["Measurements" nothing], xlabel = "t", ylabel = "x(t), y(t)")
plot!(estimate_rough, color = :red, label = ["Recovered" nothing])
plot!(estimate, color = :blue, label = ["Recovered + Fitted" nothing])
savefig(pl_fitted, "plots_recovery_fitting.png")

## Simulation

# Look at long term prediction
t_long = (0.0f0, 50.0f0)
estimate_long = solve(estimation_prob, Tsit5(), saveat = 0.25f0, tspan = t_long,p = p_fitted)
plot(estimate_long.t, transpose(xscale .* estimate_long[:,:]), xlabel = "t", ylabel = "x(t),y(t)")

## Save the results
save(joinpath(pwd(),"results","Hudson_Bay_recovery.jld2"),
    "X", Xₙ, "t" , t, "neural_network" , U, "initial_parameters", p, "trained_parameters" , p_trained, # Training
    "losses", losses, "result", nn_res, "recovered_parameters", parameters(nn_res), # Recovery
    "model", recovered_dynamics!, "model_parameter", p_model, "fitted_parameter", p_fitted,
    "long_estimate", estimate_long) # Estimation

## Post Processing and Plots

c1 = 3 # RGBA(174/255,192/255,201/255,1) # Maroon
c2 = :orange # RGBA(132/255,159/255,173/255,1) # Red
c3 = :blue # RGBA(255/255,90/255,0,1) # Orange
c4 = :purple # RGBA(153/255,50/255,204/255,1) # Purple

p3 = scatter(t, transpose(Xₙ), color = [c1 c2], label = ["x data" "y data"],
             title = "Recovered Model from Hudson Bay Data",
             titlefont = "Helvetica", legendfont = "Helvetica",
             markersize = 5)

plot!(p3,estimate_long, color = [c3 c4], lw=1, label = ["Estimated x(t)" "Estimated y(t)"])
plot!(p3,[19.99,20.01],[0.0,maximum(Xₙ)*1.25],lw=1,color=:black, label = nothing)
annotate!([(10.0,maximum(Xₙ)*1.25,text("Training \nData",12 , :center, :top, :black, "Helvetica"))])
savefig(p3, "plots_full_plot.png")
RajDandekar commented 1 year ago

@ccrnn I will try to work this out on my end.

RajDandekar commented 1 year ago

@ccrnn I found some potential errors in your code. Can you try making these changes?

You have this:

# Neural network guess
Ŷ = U(X̂, p.layer_1, st)[1]

which should be

# Neural network guess
Ŷ = U(X̂, p_trained.layer_1, st)[1]

p_model definition should be like this:

p_model = [p_trained.layer_2[1]; p_trained.layer_2[2];parameters(nn_res)]

Also, this is not correct:

optprob4 = Optimization.OptimizationProblem(optlf, Lux.ComponentArray(p))

You should have p_model, instead of p.
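That is, the corrected line would presumably be:

optprob4 = Optimization.OptimizationProblem(optlf, p_model)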

ghost commented 1 year ago

Thanks, I went to try this, but now, even before I get to that point, it errors at the BFGS shooting loss. There is also a problem with the predict function now that it is defined without X = Xₙ[:, 1], T = t. I have no idea how this is happening, because I got it to run this morning?! I tried with a very small number of iterations (15) for optprob2, and it works, but the prediction for X̂ is completely wrong now.

In general, I think it would be good if the Lux documentation were updated with some examples of how to set this up too. It's not really clear enough for a beginner how to use it at the moment.

I will have a go at FENEP again next. Tbh I'm a little frustrated at this point, though. I don't understand why it doesn't work now.
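For reference, the earlier version of predict with positional defaults, which the later analysis code (predict(p_trained, Xₙ[:, 1], tsample)) expects, was:

function predict(θ, X = Xₙ[:, 1], T = t)
    Array(solve(prob_nn, Vern7(), u0 = X, p = θ,
                tspan = (T[1], T[end]), saveat = T,
                abstol = 1e-6, reltol = 1e-6,
                sensealg = ForwardDiffSensitivity()))
end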

ghost commented 1 year ago

@RajDandekar

ChrisRackauckas commented 1 year ago

You don't need to ping every time. GitHub will email everyone subscribed to an issue, and one is automatically subscribed after joining the thread.

RajDandekar commented 1 year ago

Try this code below. It should not give any errors:

using DiffEqFlux, Flux
using DifferentialEquations
using Plots, DelimitedFiles, Statistics
using Sundials
using OrdinaryDiffEq
using ModelingToolkit
using DataDrivenDiffEq
using LinearAlgebra
using Random
using Optimization, OptimizationOptimJL #OptimizationOptimisers for ADAM and OptimizationOptimJL for BFGS
using Lux
using Statistics

Random.seed!(5443)

#### NOTE
# Since the recent release of DataDrivenDiffEq v0.6.0, where a complete overhaul of the
# optimizers took place, SR3 had been used. STLSQ now performs better, so the optimizer
# has been changed. Additionally, the behaviour of the optimization has changed slightly;
# this has been adjusted for by decreasing the gradient tolerance.

svname = "HudsonBay"
## Data Preprocessing
# The data has been taken from https://jmahaffy.sdsu.edu/courses/f00/math122/labs/labj/q3v1.htm
# Originally published in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders
hudson_bay_data = readdlm("hudson_bay_data.dat", '\t', Float32, '\n')
# Measurements of prey and predator
Xₙ = Matrix(transpose(hudson_bay_data[:, 2:3]))
t = hudson_bay_data[:, 1] .- hudson_bay_data[1, 1]
# Normalize the data; since the data domain is strictly positive
# we just need to divide by the maximum
xscale = maximum(Xₙ, dims =2)
Xₙ .= 1f0 ./ xscale .* Xₙ
# Time from 0 -> n
tspan = (t[1], t[end])

# Plot the data
scatter(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")
plot!(t, transpose(Xₙ), xlabel = "t", ylabel = "x(t), y(t)")

## Direct Identification via SINDy + Collocation

# Create the problem using a gaussian kernel for collocation
full_problem = ContinuousDataDrivenProblem(Xₙ, t, DataDrivenDiffEq.GaussianKernel())
# Look at the collocation
plot(full_problem.t, full_problem.X')
plot(full_problem.t, full_problem.DX')

# Create a Basis
@variables u[1:2]

# Generate the basis functions, multivariate polynomials up to deg 5
# and sine
b = [polynomial_basis(u, 5); sin.(u)]
basis = Basis(b, u)

# Create the thresholds which should be used in the search process
λ = Float32.(exp10.(-7:0.1:5))
# Create an optimizer for the SINDy problem
opt = STLSQ(λ)

# Best result so far
full_res = solve(full_problem, basis, opt, maxiter = 10000, progress = true, denoise = true, normalize = true)

println(full_res)
println(result(full_res))

## Define the network
# Gaussian RBF as activation
rbf(x) = exp.(-(x.^2))

# Define the network 2->5->5->5->2
U = Lux.Chain(
    Lux.Dense(2,5,rbf), Lux.Dense(5,5, rbf), Lux.Dense(5,5, tanh), Lux.Dense(5,2)
)

rng = Random.default_rng()
p1, st = Lux.setup(rng, U)

#for birth, decay parameters -> initializing random values.
parameter_array = Float64[0.5, 0.5]
p = (layer_1 = p1, layer_2 = parameter_array)
p = Lux.ComponentArray(p)

# Define the hybrid model
function ude_dynamics!(du,u, p, t)
    û = U(u, p.layer_1, st)[1] # Network prediction
    # We assume a linear birth rate for the prey
    du[1] = p.layer_2[1]*u[1] + û[1]
    # We assume a linear decay rate for the predator
    du[2] = -p.layer_2[2]*u[2] + û[2]
end

# Define the problem
prob_nn = ODEProblem(ude_dynamics!,Xₙ[:, 1], tspan, p)

## Function to train the network
# Define a predictor
function predict(θ)
    Array(solve(prob_nn, Vern7(), u0 = Xₙ[:, 1], p=θ,
                saveat = t,
                abstol=1e-6, reltol=1e-6,
                sensealg = ForwardDiffSensitivity()
                ))
end

# Define parameters for Multiple Shooting
group_size = 5
continuity_term = 200.0f0

function loss(data, pred)
    return sum(abs2, data - pred)
end

function shooting_loss(p)
    return multiple_shoot(p, Xₙ, t, prob_nn, loss, Vern7(),
                          group_size; continuity_term)
end

function loss(θ)
    X̂ = predict(θ)
    sum(abs2, Xₙ - X̂) / size(Xₙ, 2) + convert(eltype(θ), 1e-3)*sum(abs2, θ[3:end]) ./ length(θ[3:end])
end

# Container to track the losses
losses = Float32[]

# Callback to show the loss during training
callback(θ,args...) = begin
    l = loss(θ) # Equivalent L2 loss
    push!(losses, l)
    if length(losses)%5==0
        println("Current loss after $(length(losses)) iterations: $(losses[end])")
    end
    false
end

## Training -> First shooting / batching to get a rough estimate

# First train with ADAM for better convergence -> move the parameters into a
# favourable starting position for BFGS
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x,p)->shooting_loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, Lux.ComponentArray(p))
res1 = Optimization.solve(optprob, ADAM(0.1f0), callback=callback, maxiters = 100)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Train with BFGS to achieve partial fit of the data
optprob2 = remake(optprob,u0 = res1.u)
res2 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 500)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
# Full L2-Loss for full prediction

optf2 = Optimization.OptimizationFunction((x,p)->loss(x), adtype)
optprob2 = Optimization.OptimizationProblem(optf2, res2.minimizer)
res3 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 10000)
println("Final training loss after $(length(losses)) iterations: $(losses[end])")

pl_losses = plot(1:101, losses[1:101], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "ADAM (Shooting)", color = :blue)
plot!(102:302, losses[102:302], yaxis = :log10, xaxis = :log10, xlabel = "Iterations", ylabel = "Loss", label = "BFGS (Shooting)", color = :red)
plot!(302:length(losses), losses[302:end], color = :black, label = "BFGS (L2)")
savefig(pl_losses, "plot_losses.png")

# Rename the best candidate
p_trained = res3.minimizer

## Analysis of the trained network
# Interpolate the solution
tsample = t[1]:0.5:t[end]
X̂ = Array(solve(prob_nn, Vern7(), u0 = Xₙ[:, 1], p=p_trained,
            saveat = tsample,
            abstol=1e-6, reltol=1e-6,
            sensealg = ForwardDiffSensitivity()
            ))
# Trained on noisy data vs real solution
pl_trajectory = scatter(t, transpose(Xₙ), color = :black, label = ["Measurements" nothing], xlabel = "t", ylabel = "x(t), y(t)")
plot!(tsample, transpose(X̂), color = :red, label = ["UDE Approximation" nothing])
savefig(pl_trajectory, "plot_trajectory_reconstruction.png")

# Neural network guess
Ŷ = U(X̂, p_trained.layer_1, st)[1]

pl_reconstruction = scatter(tsample, transpose(Ŷ), xlabel = "t", ylabel ="U(x,y)", color = :red, label = ["UDE Approximation" nothing])
plot!(tsample, transpose(Ŷ), color = :red, lw = 2, style = :dash, label = [nothing nothing])
savefig(pl_reconstruction, "plots_missingterm_reconstruction.png")
pl_missing = plot(pl_trajectory, pl_reconstruction, layout = (2,1))
savefig(pl_missing, "plot_reconstruction.png")
## Symbolic regression via sparse regression (SINDy based)
# We reuse the basis and optimizer defined at the beginning

t1 = Float32.(tsample)
nn_problem = ContinuousDataDrivenProblem(X̂, t1, DX = Ŷ)

nn_res = solve(nn_problem, basis, opt, maxiter = 10000, progress = true, normalize = false, denoise = true)
println(nn_res)
println(result(nn_res))
plot(nn_res)

# Define the recovered, hybrid model with the rescaled dynamics
function recovered_dynamics!(du,u, p, t)
    û = nn_res(u, p[3:end]) # Network prediction
    du[1] = p[1]*u[1] + û[1]
    du[2] = -p[2]*u[2] + û[2]
end

p_model = [p_trained.layer_2[1]; p_trained.layer_2[2];parameters(nn_res)]

estimation_prob = ODEProblem(recovered_dynamics!, Xₙ[:, 1], tspan, p_model)
# Convert for reuse
sys = modelingtoolkitize(estimation_prob);
dudt = ODEFunction(sys);
estimation_prob = ODEProblem(dudt,Xₙ[:, 1], tspan, p_model)
estimate = solve(estimation_prob, Tsit5(), saveat = t)

##  Fit the found model
ghost commented 1 year ago

I get the same error on BFGS with this as I was getting with mine this evening...

RajDandekar commented 1 year ago

I ran the above code on my end again and am getting no error.

Where exactly do you get the error, and what is the error?

ghost commented 1 year ago

I just opened up a new folder and started again. So maybe I need to delete .julia and start from nothing? I guess it means there's some conflict somewhere.
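Rather than deleting all of .julia, a fresh project environment is usually enough to rule out version conflicts. A sketch (the environment name is arbitrary):

using Pkg
Pkg.activate("hudson_bay_env")   # fresh, isolated environment
Pkg.add(["OrdinaryDiffEq", "Lux", "Optimization", "OptimizationOptimJL",
         "SciMLSensitivity", "DiffEqFlux", "DataDrivenDiffEq"])
Pkg.status()                     # prints the resolved versions for comparison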

The error appears when I run:

optprob2 = remake(optprob,u0 = res1.u)
res2 = Optimization.solve(optprob2, Optim.BFGS(initial_stepnorm=0.01f0), callback=callback, maxiters = 10000, g_tol = 1e-10)
println("Training loss after $(length(losses)) iterations: $(losses[end])")

Error:

OptimizationProblem. In-place: true
u0: ComponentVector{Float64}(layer_1 = (layer_1 = (weight = [-0.24772041390523952 1.5930269842767066; 1.4468453421253231 -0.25560103289443653; … ; 1.4510345926144106 0.3597015921762381; -0.8916708339364791 -0.15131018345691918], bias = [0.584705545781979; 0.5950046500415399; … ; 0.05439397879301729; -0.9176169869954031;;]), layer_2 = (weight = [0.25640126206130737 -0.3060469477110003 … 0.0666256229826155 -0.11863485689795789; 1.1395510754218598 0.6717503580918298 … 1.4375713543870927 0.7764497039190313; … ; 0.40670341708478464 0.9287096503298288 … 0.8801851698970032 1.265713208697712; -0.22153756370876834 -1.272067985745647 … -1.2827952131361973 -0.7903519283095344], bias = [-0.12406791904995043; 1.0873750905052009; … ; 0.8251442571608298; -0.8269018254313091;;]), layer_3 = (weight = [-1.0105734221945892 -0.030580921657496336 … -0.9918832650348256 0.04806644475768783; -0.7251584995166175 -0.3429721626672943 … 0.26247290180655536 -0.7362423740092084; … ; 0.8759786679411482 -0.46789888114076106 … 0.3909730825383274 0.7260443280593856; -0.09248583140058494 0.30920840435291713 … 0.013407678291042118 -0.08974604362033743], bias = [-0.5449527607336956; -0.2890426287702807; … ; 0.06224099376041165; -0.04742441157525139;;]), layer_4 = (weight = [-0.11548791677392826 -0.19890385102724592 … -0.41710974762784514 -0.004830400362544057; -0.13625351803509123 -0.13584390346099934 … 0.3724103512615741 0.0058752788761664634], bias = [0.20213157184398053; 0.04541601634252972;;])), layer_2 = [-0.5907198243155776, 1.7248052956285491])

Current loss after 180 iterations: 0.14546907
Current loss after 185 iterations: 0.1435748
Current loss after 190 iterations: 0.149434
Current loss after 195 iterations: 0.16056807
Current loss after 200 iterations: 0.17362846
Current loss after 205 iterations: 0.13301256
Current loss after 210 iterations: 0.13546903
Current loss after 215 iterations: 0.14567284
┌ Warning: dt(1.1920929e-7) <= dtmin(1.1920929e-7) at t=0.013189854. Aborting. There is either an error in your model specification or the true solution is unstable.
└ @ SciMLBase C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\integrator_interface.jl:422
┌ Warning: dt(4.7683716e-7) <= dtmin(4.7683716e-7) at t=4.0138335. Aborting. There is either an error in your model specification or the true solution is unstable.  
└ @ SciMLBase C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\integrator_interface.jl:422
┌ Warning: dt(9.536743e-7) <= dtmin(9.536743e-7) at t=8.013171. Aborting. There is either an error in your model specification or the true solution is unstable.     
└ @ SciMLBase C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\integrator_interface.jl:422
┌ Warning: dt(9.536743e-7) <= dtmin(9.536743e-7) at t=12.013128. Aborting. There is either an error in your model specification or the true solution is unstable.    
└ @ SciMLBase C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\integrator_interface.jl:422
┌ Warning: dt(1.9073486e-6) <= dtmin(1.9073486e-6) at t=16.013079. Aborting. There is either an error in your model specification or the true solution is unstable.  
└ @ SciMLBase C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\integrator_interface.jl:422
ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type Float64
Closest candidates are:
  convert(::Type{T}, ::Gray24) where T<:Real at C:\Users\44745\.julia\packages\ColorTypes\1dGw6\src\conversions.jl:114
  convert(::Type{T}, ::Gray) where T<:Real at C:\Users\44745\.julia\packages\ColorTypes\1dGw6\src\conversions.jl:113
  convert(::Type{T}, ::Unitful.Gain) where T<:Real at C:\Users\44745\.julia\packages\Unitful\SUQzL\src\logarithm.jl:62
  ...
Stacktrace:
  [1] fill!(A::ComponentVector{Float64}, x::Nothing)
    @ Base .\multidimensional.jl:1062
  [2] copyto!
    @ .\broadcast.jl:921 [inlined]
  [3] materialize!
    @ .\broadcast.jl:871 [inlined]
  [4] materialize!(dest::ComponentVector{Float64}, bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(identity), Tuple{Base.RefValue{Nothing}}})
    @ Base.Broadcast .\broadcast.jl:868
  [5] (::Optimization.var"#125#135"{…})(::ComponentVector{Float64}, ::ComponentVector{Float64})
    @ Optimization C:\Users\44745\.julia\packages\Optimization\k85Rf\src\function\zygote.jl:32
  [6] (::OptimizationOptimJL.var"#5#13"{…})(G::ComponentVector{Float64}, θ::ComponentVector{Float64})
    @ OptimizationOptimJL C:\Users\44745\.julia\packages\OptimizationOptimJL\iyLQi\src\OptimizationOptimJL.jl:106
  [7] value_gradient!!(obj::TwiceDifferentiable{…}, x::ComponentVector{Float64})
    @ NLSolversBase C:\Users\44745\.julia\packages\NLSolversBase\cfJrN\src\interface.jl:82
  [8] value_gradient!(obj::TwiceDifferentiable{…}, x::ComponentVector{Float64})
    @ NLSolversBase C:\Users\44745\.julia\packages\NLSolversBase\cfJrN\src\interface.jl:69
  [9] value_gradient!(obj::Optim.ManifoldObjective{…}, x::ComponentVector{Float64})
    @ Optim C:\Users\44745\.julia\packages\Optim\6Lpjy\src\Manifolds.jl:50
 [10] (::LineSearches.var"#ϕdϕ#6"{…})(α::Float64)
    @ LineSearches C:\Users\44745\.julia\packages\LineSearches\Ki4c5\src\LineSearches.jl:84
 [11] (::LineSearches.HagerZhang{Float64, Base.RefValue{Bool}})(ϕ::Function, ϕdϕ::LineSearches.var"#ϕdϕ#6"{…}, c::Float64, phi_0::Float64, dphi_0::Float64)
    @ LineSearches C:\Users\44745\.julia\packages\LineSearches\Ki4c5\src\hagerzhang.jl:139
 [12] HagerZhang
    @ C:\Users\44745\.julia\packages\LineSearches\Ki4c5\src\hagerzhang.jl:101 [inlined]
 [13] perform_linesearch!(state::Optim.BFGSState{…}, method::BFGS{…}, d::Optim.ManifoldObjective{…})
= ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}}})
    @ Optim C:\Users\44745\.julia\packages\Optim\6Lpjy\src\utilities\perform_linesearch.jl:59
 [14] update_state!(d::TwiceDifferentiable{Float64, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}}, state::Optim.BFGSState{ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, 
Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, Float64, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat})
    @ Optim C:\Users\44745\.julia\packages\Optim\6Lpjy\src\multivariate\solvers\first_order\bfgs.jl:139
 [15] optimize(d::TwiceDifferentiable{Float64, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}}, initial_x::ComponentVector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat}, options::Optim.Options{Float64, OptimizationOptimJL.var"#_cb#11"{typeof(callback), BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat}, Base.Iterators.Cycle{Tuple{Optimization.NullData}}}}, state::Optim.BFGSState{ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, 
ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, Float64, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}})
    @ Optim C:\Users\44745\.julia\packages\Optim\6Lpjy\src\multivariate\optimize\optimize.jl:54
 [16] optimize(d::TwiceDifferentiable{Float64, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentMatrix{Float64, Matrix{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}, ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}}, initial_x::ComponentVector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat}, options::Optim.Options{Float64, OptimizationOptimJL.var"#_cb#11"{typeof(callback), BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat}, Base.Iterators.Cycle{Tuple{Optimization.NullData}}}})
    @ Optim C:\Users\44745\.julia\packages\Optim\6Lpjy\src\multivariate\optimize\optimize.jl:36
 [17] ___solve(prob::OptimizationProblem{true, OptimizationFunction{true, Optimization.AutoZygote, var"#5#6", Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, ComponentArrays.ComponentVector{Float64, Vector{Float64}, Tuple{ComponentArrays.Axis{(layer_1 = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), layer_2 = 88:89)}}}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Float32, Flat}, data::Base.Iterators.Cycle{Tuple{Optimization.NullData}}; callback::Function, maxiters::Int64, maxtime::Nothing, abstol::Nothing, reltol::Nothing, progress::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ OptimizationOptimJL C:\Users\44745\.julia\packages\OptimizationOptimJL\iyLQi\src\OptimizationOptimJL.jl:143
 [18] #__solve#2
    @ C:\Users\44745\.julia\packages\OptimizationOptimJL\iyLQi\src\OptimizationOptimJL.jl:56 [inlined]
 [19] #solve#514
    @ C:\Users\44745\.julia\packages\SciMLBase\cGK2b\src\solve.jl:71 [inlined]
 [20] top-level scope
    @ Untitled-1:152

Training loss after 215 iterations: 0.14567284
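
The trace points at the BFGS refinement stage of the training loop. For reference, here is a minimal sketch of an ADAM -> BFGS loop of that shape, reusing the `loss` and `callback` defined earlier in the script; the Float64 promotion of the parameters and the BackTracking line search are assumptions (possible workarounds for the HagerZhang line-search failure), not a confirmed fix:

using Optimization, OptimizationOptimisers, OptimizationOptimJL
using ComponentArrays, LineSearches

# Promote the Lux parameters to Float64 so the optimizer state, the line
# search, and the parameter vector all share one precision.
pinit = ComponentVector{Float64}(p)

# `loss` and `callback` are assumed to be the functions defined above.
optf = OptimizationFunction((x, _) -> loss(x), Optimization.AutoZygote())
optprob = OptimizationProblem(optf, pinit)

# Cheap warm start with ADAM to get near a minimum...
res1 = solve(optprob, ADAM(0.1), callback = callback, maxiters = 200)

# ...then refine with BFGS, bounding the first step and swapping the
# HagerZhang line search for BackTracking.
optprob2 = OptimizationProblem(optf, res1.u)
res2 = solve(optprob2,
             BFGS(initial_stepnorm = 0.01, linesearch = LineSearches.BackTracking()),
             callback = callback, maxiters = 10000)

The two-stage setup is the usual pattern in these scripts: ADAM is robust far from a minimum, BFGS converges quickly once close, and a small bounded first step keeps the line search from overshooting.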
AlCap23 commented 1 year ago

On it (with a little delay, and probably a little more).

eliascrodrigues commented 1 year ago

@ccrnn, were you able to run the code? Is it working now?