I think the problem is not strictly speaking with NeuralPDE but that depending on the initialization, the NN sometimes gets stuck in local minima. With networks this small, I think this is a real concern. Your code sometimes actually trains successfully for me using Adam, BFGS, or both. You could try making the network wider or deeper to increase chances of training success.
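For concreteness, a wider and deeper variant of the Lux chain used later in this thread could be written as below; the layer sizes are arbitrary and only illustrate the suggestion, not a tested recommendation.
import Lux
# Hypothetical wider/deeper chain: 1 input (t), two hidden layers of 32 units, 1 output (u)
luxchain = Lux.Chain(Lux.Dense(1, 32, Lux.σ), Lux.Dense(32, 32, Lux.σ), Lux.Dense(32, 1))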
I've tried wider and deeper networks (as well as both at once), and sometimes I get a sigmoidish curve, but still way off the true solution, and with many more parameters than a simple basis-function approach would require. I was hoping that this example wouldn't require anything large or complex on the neural net side. The kind of problems I'm looking at will have the same issue as this simple logistic curve.
I'm more confused after playing around with it more. Sometimes the loss is really low at the end (like 1e-10) but the solution is totally wrong. For successful runs the loss is more like 1e-7.
Sometimes I get NaN solutions / parameters without an error.
I also tried using Lux instead of Flux, since the docs recommend it for getting double precision. But with Lux I was never able to train successfully: the loss decreased more consistently, but the solutions just don't work.
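For reference, the usual way to get double-precision parameters with a Lux chain is to materialize them via Lux.setup and ComponentArrays and convert to Float64; whether NNODE actually uses them then depends on passing them through its optional init_params argument (an assumption about the version in use), so the sketch below is illustrative only.
import Lux, Random
using ComponentArrays
luxchain = Lux.Chain(Lux.Dense(1, 20, Lux.σ), Lux.Dense(20, 1))
ps, st = Lux.setup(Random.default_rng(), luxchain)  # initial parameters and state (Float32 by default)
ps64 = ComponentArray(ps) .|> Float64                # promote the parameters to double precision
# alg = NeuralPDE.NNODE(luxchain, opt, ps64)         # init_params as the optional third argument (assumed)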
Another thing I just realized is that the issue title references NNODE (https://docs.sciml.ai/NeuralPDE/stable/manual/ode/#NeuralPDE.NNODE), but the code doesn't actually use NNODE (or is it used internally?). Maybe the NNODE specialization would work better since this is an ODE, but I could not get NNODE to work based on the example code there. I always get "Optimization algorithm not found. Either the chosen algorithm is not a valid solver choice for the OptimizationProblem, or the Optimization solver library is not loaded." even though everything is indeed loaded.
"sometimes I get a sigmoidish curve, but still way off the true solution"
Fwiw, the successful runs do come very close to the sigmoid function for me. I was just able to get a successful solution using Lux, and it looks like this:
But I am not really sure about the loss values. Sometimes it works, sometimes it doesn't, sometimes that is reflected in the loss, other times it doesn't seem to be correlated. Maybe it becomes clearer when looking at the individual loss terms (data and physics loss).
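As a starting point, the total loss can at least be logged each iteration with an Optimization.jl-style callback when going through the PDESystem + PhysicsInformedNN route (sketch below; splitting out the separate data and physics terms needs access to the individual loss functions, which I'm not showing since it is version-dependent).
using Optimization
# Sketch: print the total loss at each optimizer iteration of the discretized PINN problem
callback = function (p, l)
    println("Current loss: ", l)
    return false  # returning true would stop the optimization early
end
# res = Optimization.solve(prob, opt; callback = callback, maxiters = 2000)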
Sorry, the issue title comes from a discussion of this problem on the Julia Discourse.
I got NNODE to run; I was missing the Optimisers package. But it doesn't really fit the function more reliably than using a PDESystem + PhysicsInformedNN on this problem, for me actually less so. I really don't understand these highly variable loss values of the PINNs and how they relate to the solution quality. With NNODE I get the same results as described by @sdwfrost, i.e. never really an exact fit such as I sometimes got with the PDESystem + PhysicsInformedNN code.
Here's my code using NNODE:
using Flux
using NeuralPDE
using OrdinaryDiffEq, Optimisers
using Plots
import Lux, OptimizationOptimisers, OptimizationOptimJL

# Logistic growth ODE: du/dt = u(1 - u)
eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)

# Reference solution from a classical solver
ode_sol = solve(prob, Tsit5(), saveat=0.1)

function run()
    # The Flux chain is unused below; NNODE trains the Lux chain
    chain = Flux.Chain(Flux.Dense(1, 20, σ), Flux.Dense(20, 1))
    luxchain = Lux.Chain(Lux.Dense(1, 20, Lux.σ), Lux.Dense(20, 1))
    opt = OptimizationOptimJL.BFGS()
    # opt = OptimizationOptimisers.Adam(0.1)
    nnode_sol = solve(prob, NeuralPDE.NNODE(luxchain, opt), dt=1 / 10.0, verbose=true, abstol=1.0e-10, maxiters=5000)
    ts = tspan[1]:0.01:tspan[2]
    xpred = [nnode_sol(t) for t in ts]
    plot(ode_sol, label="ODE solution")
    plot!(ts, xpred, label="NNODE solution")
end
run()
I've been playing with this and applying a new training strategy to it, and I am getting a mean absolute difference between the NNODE solution and the ODE solution of about 0.02. With some other strategies this difference is sometimes very large, which confuses me for such a simple problem. Here is the code; I am going to try to make this better.
using NeuralPDE, OrdinaryDiffEq, Optimisers
using Plots, Statistics
import Lux, OptimizationOptimisers, OptimizationOptimJL

# Logistic growth ODE: du/dt = u(1 - u)
eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.01)

# Small two-hidden-layer Lux network
N = 4
func = Lux.σ
luxchain = Lux.Chain(Lux.Dense(1, N, func), Lux.Dense(N, N, func), Lux.Dense(N, 1, func))

# opt = OptimizationOptimJL.BFGS()
opt = OptimizationOptimisers.Adam(0.01)

# WeightedIntervalTraining: split tspan into length(weights) intervals and draw
# `samples` training points according to the interval weights (which sum to 1)
weights = [1/14, 4/14, 4/14, 4/14, 1/14]
samples = 200
alg = NeuralPDE.NNODE(luxchain, opt, autodiff = false, strategy = NeuralPDE.WeightedIntervalTraining(weights, samples))
nnode_sol = solve(prob, alg, verbose=true, maxiters=50000, saveat=0.01)

ts = tspan[1]:0.01:tspan[2]
xpred = [nnode_sol(t) for t in ts]
# absolute value of the mean difference between the two solutions on the saved time points
println(abs(mean(ode_sol .- nnode_sol)))

plot(ode_sol, label="ODE solution")
plot!(ts, xpred, label="NNODE solution")
Here is the plot
Seems like QuadratureTraining works pretty well for this; I am getting a mean difference between the solutions of 0.01 or less fairly consistently. Here is the code and a graph:
using OrdinaryDiffEq, Optimisers, NeuralPDE
using Plots, Statistics
import Lux, OptimizationOptimisers, OptimizationOptimJL

# Logistic growth ODE: du/dt = u(1 - u)
eq = (u, p, t) -> u .* (1 .- u)
tspan = (0.0, 10.0)
u0 = 0.01
prob = ODEProblem(eq, u0, tspan)
ode_sol = solve(prob, Tsit5(), saveat=0.01)

# Small tapered Lux network (4 -> 3 -> 2 -> 1 units)
N = 4
func = Lux.σ
luxchain = Lux.Chain(Lux.Dense(1, N, func), Lux.Dense(N, N - 1, func), Lux.Dense(N - 1, N - 2, func), Lux.Dense(N - 2, 1, func))

opt = OptimizationOptimisers.Adam(0.001)

# QuadratureTraining chooses the training points adaptively via numerical quadrature of the residual
alg = NeuralPDE.NNODE(luxchain, opt, autodiff = false, strategy = NeuralPDE.QuadratureTraining())
nnode_sol = solve(prob, alg, verbose=true, abstol = 1e-10, maxiters=100000, saveat=0.01)

ts = tspan[1]:0.01:tspan[2]
xpred = [nnode_sol(t) for t in ts]
xreal = [ode_sol(t) for t in ts]
# mean difference evaluated on the dense grid and on the saved time points
println(abs(mean(xpred .- xreal)))
println(abs(mean(nnode_sol .- ode_sol)))

plot(ode_sol, label="ODE solution")
plot!(ts, xpred, label="NNODE solution")
👍. I think we should change the docs to use QuadratureTraining in the tutorials and then close this. GridTraining shouldn't be used (as the docs say in other places).
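In the tutorial code that would roughly amount to passing the strategy explicitly; a minimal sketch (names like luxchain and opt reused from the snippets above):
# PDESystem route: give the training strategy to the discretization
discretization = NeuralPDE.PhysicsInformedNN(luxchain, NeuralPDE.QuadratureTraining())
# ODE route: same idea via NNODE's strategy keyword
alg = NeuralPDE.NNODE(luxchain, opt, strategy = NeuralPDE.QuadratureTraining())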
@ChrisRackauckas we can close this? https://github.com/SciML/NeuralPDE.jl/pull/729 addresses using QuadratureTraining in the docs
NeuralPDE.jl fails on this simple example of a logistic curve. I'm not sure whether this would be helped by changing the neural net (as the output is between 0 and 1), though I've tried lots of structures, as well as trying to refine the optimization. An Adam+BFGS combination should work OK for this example.
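For reference, "Adam+BFGS" here means the usual two-stage training on the discretized problem: a first pass with Adam, then BFGS refinement warm-started from the Adam result. A minimal sketch, assuming prob is the OptimizationProblem returned by discretize and that the iteration counts are arbitrary:
using Optimization
res1 = Optimization.solve(prob, OptimizationOptimisers.Adam(0.01); maxiters = 1000)
prob2 = remake(prob, u0 = res1.u)  # warm-start BFGS from the Adam result
res2 = Optimization.solve(prob2, OptimizationOptimJL.BFGS(); maxiters = 500)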