Closed: ChrisRackauckas closed this issue 4 years ago.
For the Benchmarks, do you mean add more benchmarks or more text describing how it works/what it is doing? For the mass matrix ODE, is this LL-NN-Stiff.md?
> For the Benchmarks, do you mean add more benchmarks or more text describing how it works/what it is doing?
Yeah, some more benchmarks between adjoints, maybe getting a copy of some of the tutorials running in torchdiffeq to see the timing difference.
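Something like this could seed the adjoint-comparison benchmark. It is only a rough sketch: the Lotka-Volterra problem, the scalar loss, and the particular `sensealg` list are placeholders rather than the final benchmark set.

```julia
# Rough sketch of an adjoint-comparison benchmark; placeholder problem, not the final set
using OrdinaryDiffEq, DiffEqSensitivity, Zygote, BenchmarkTools

function lv!(du, u, p, t)
    du[1] =  p[1]*u[1] - p[2]*u[1]*u[2]
    du[2] = -p[3]*u[2] + p[4]*u[1]*u[2]
end

u0 = [1.0, 1.0]
p = [1.5, 1.0, 3.0, 1.0]
prob = ODEProblem(lv!, u0, (0.0, 10.0), p)

# Scalar loss so Zygote can return a gradient with respect to p
adjoint_loss(p, sensealg) = sum(Array(concrete_solve(prob, Tsit5(), u0, p,
                                                     saveat = 0.1, sensealg = sensealg)))

for sensealg in (InterpolatingAdjoint(), QuadratureAdjoint(), BacksolveAdjoint(),
                 InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)))
    println(nameof(typeof(sensealg)))
    @btime Zygote.gradient(p -> adjoint_loss(p, $sensealg), $p)
end
```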
> For the mass matrix ODE, is this LL-NN-Stiff.md?
yes
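For anyone picking this up later, the mass-matrix ODE machinery that tutorial leans on is set up roughly like this. This is a minimal Robertson-style sketch, not the actual contents of LL-NN-Stiff.md:

```julia
# Minimal mass-matrix ODE sketch (Robertson-style DAE); illustrative only
using OrdinaryDiffEq

function rober!(du, u, p, t)
    y1, y2, y3 = u
    k1, k2, k3 = p
    du[1] = -k1*y1 + k3*y2*y3
    du[2] =  k1*y1 - k3*y2*y3 - k2*y2^2
    du[3] =  y1 + y2 + y3 - 1.0      # algebraic constraint row
end

M = [1.0 0 0; 0 1.0 0; 0 0 0]        # singular mass matrix, so the third equation is algebraic
f = ODEFunction(rober!, mass_matrix = M)
prob = ODEProblem(f, [1.0, 0.0, 0.0], (0.0, 1e5), (0.04, 3e7, 1e4))
sol = solve(prob, Rodas5())          # mass-matrix-aware stiff solver
```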
Would be good to add a PDE-constrained optimization example:
```julia
using DelimitedFiles, Plots
using DiffEqSensitivity, OrdinaryDiffEq, Zygote, Flux, DiffEqFlux, Optim
# Problem setup parameters:
Lx = 10.0
x = 0.0:0.01:Lx
dx = x[2] - x[1]
Nx = length(x) # number of grid points
u0 = exp.(-(x.-3.0).^2) # initial condition: Gaussian centered at x = 3
## Problem Parameters
p = [1.0,1.0] # True solution parameters
xtrs = [dx,Nx] # Extra parameters
dt = 0.40*dx^2 # CFL condition
t0, tMax = 0.0 ,1000*dt
tspan = (t0,tMax)
t = t0:dt:tMax;
## Definition of Auxiliary functions
function ddx(u,dx)
"""
Second-order central difference for the first derivative
"""
return [[zero(eltype(u))] ; (u[3:end] - u[1:end-2]) ./ (2.0*dx) ; [zero(eltype(u))]]
end
function d2dx(u,dx)
"""
Second-order central difference for the second derivative
"""
return [[zero(eltype(u))]; (u[3:end] - 2.0.*u[2:end-1] + u[1:end-2]) ./ (dx^2); [zero(eltype(u))]]
end
## ODE description of the physics (method-of-lines right-hand side):
function burgers(u,p,t)
# Model parameters
a0, a1 = p
dx, Nx = xtrs
# Nonlinear Burgers alternative: a0 .* ddx(u.^2, dx) .+ a1 .* d2dx(u, dx)
return 2.0*a0 .* u + a1 .* d2dx(u, dx)
end
# Generate the reference (truth) data by solving the linear PDE with the true parameters
prob = ODEProblem(burgers,u0,tspan,p)
sol = concrete_solve(prob,Tsit5(), dt=dt,saveat=t);
#sol = solve(prob, AutoTsit5(Rosenbrock23()), dt=dt, tstops=[tMax]);
#sol = solve(prob, AutoVern9(Rodas4()), dt=dt, tstops=[tMax]);
#sol = concrete_solve(prob,Tsit5(),u0,p,saveat=t0:dt:tMax,abstol=1e-8, dt=dt,
# reltol=1e-6,sensealg=InterpolatingAdjoint(checkpointing=true));
plot(x, sol.u[1], lw=3, label="t0", size=(800,500))
plot!(x, sol.u[end],lw=3, ls=:dash, label="tMax")
ps = [0.1, 0.2]; # Initial guess for model parameters
# Prediction function: a forward solve with trial parameters θ
function predict(θ) # our "1-layer neural network" is just the PDE solver
Array(concrete_solve(prob,Tsit5(),u0,θ,dt=dt,saveat=t))
#Array(concrete_solve(prob,Tsit5(),u0,p,saveat=t0:dt:tMax,abstol=1e-8, dt=dt,
# reltol=1e-6,sensealg=InterpolatingAdjoint(checkpointing=true)))
end
## Defining the loss function
function loss(θ)
pred = predict(θ)
l = pred .- Array(sol)
return sum(abs2, l), pred # sum of squared errors against the reference solution
end
l,pred = loss(ps)
size(pred), size(sol), size(t) # Checking sizes
nIters = 100 # assumed iteration count; nIters was left undefined in the original snippet
Data = Iterators.repeated((), nIters) # iterator of empty tuples (unused by sciml_train below)
LOSS = [] # Loss accumulator
PRED = [] # prediction accumulator
PARS = [] # parameters accumulator
cb = function (θ,l,pred) #callback function to observe training
display(l)
append!(PRED, [pred])
append!(LOSS, l)
append!(PARS, [θ])
false
end
cb(ps,loss(ps)...) # Testing callback function
# Compare the initial prediction against the truth
scatter(sol[:,end], label="Truth", size=(800,500))
plot!(PRED[end][:,end], lw=2, label="Prediction")
res = DiffEqFlux.sciml_train(loss, ps, ADAM(0.01), cb = cb, maxiters = 100) # first pass with ADAM
ps = res.minimizer
res = DiffEqFlux.sciml_train(loss, ps, BFGS(), cb = cb, maxiters = 100) # polish with BFGS
@show res.minimizer # returns [0.999999999999975, 1.0000000000000213]
```
Is this code done or is there something more to add? Should I make a new markdown file?
Yeah it should just work. It just needs the comments cleaned up and some explanations
An example with complex numbers from @seadra:
```julia
using DiffEqSensitivity, OrdinaryDiffEq, Zygote, LinearAlgebra, FiniteDiff, Test, DiffEqFlux, Optim
const T = 10.0;
const ω = π/T;
function f(u, p, t) # Schrödinger-type RHS: du/dt = -i*A(t)*u with a parameterized control a(t)
a = p[1]*sin(ω*t) + p[2]*sin(2*ω*t) + p[3]*sin(3*ω*t) + p[4]*sin(4*ω*t);
A = [1.0 a; a -1.0];
return -im*A*u;
end
u0 = [Complex{Float64}(1) 0; 0 1]; # start from the identity
tspan = (0.0, T)
p = Complex{Float64}[0.7, 1.0, 3.0, 1.0]
prob_ode = ODEProblem{false}(f, u0, tspan, p);
sol_ode = solve(prob_ode, Tsit5());
utarget = [Complex{Float64}(0) im; im 0]; # target propagator
function predict_adjoint(p)
return solve(prob_ode, Tsit5(), p=p, abstol=1e-12, reltol=1e-12)
end
function loss_adjoint(p)
prediction = predict_adjoint(p)
usol = last(prediction)
loss = 1.0 - abs(tr(usol*utarget')/2)^2
return loss
end
grad1 = Zygote.gradient(loss_adjoint,Complex{Float64}[1.5, 1.0, 3.0, 1.0])[1]
grad2 = FiniteDiff.finite_difference_gradient(loss_adjoint,Complex{Float64}[1.5, 1.0, 3.0, 1.0])
@test grad1 ≈ grad2 # <- succeeds
res = DiffEqFlux.sciml_train(loss_adjoint, p, BFGS(initial_stepnorm = 0.0001)) # <- fails
res.minimizer # Complex{Float64}[0.3601285857150311 + 0.0im, 0.9525963326731084 + 0.0im, 3.2163299056173216 + 0.0im, 0.9696107069145927 + 0.0im]
loss_adjoint(res.minimizer) # 1.2534417948018017e-12
```
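Until complex parameters go through sciml_train cleanly, one possible direction is to keep the optimization variables real, since the controls only enter the Hamiltonian through real coefficients. This is an untested sketch and assumes the gradient path is happy with a real p driving complex states:

```julia
# Untested workaround sketch: keep p real, let the states stay complex
p_real = [1.5, 1.0, 3.0, 1.0]                    # same controls, stored as reals
prob_real = ODEProblem{false}(f, u0, tspan, p_real)

function loss_real(p)
    prediction = solve(prob_real, Tsit5(), p = p, abstol = 1e-12, reltol = 1e-12)
    usol = last(prediction)
    return 1.0 - abs(tr(usol * utarget') / 2)^2
end

res = DiffEqFlux.sciml_train(loss_real, p_real, BFGS(initial_stepnorm = 0.0001))
```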
I think in general this is done, and specifics should get an issue. Thanks everyone!
There are a lot of parts of the documentation that could be better described: