Nicholaswogan opened this issue 3 years ago: "Slower than Julia for Lorenz system?" (#3)
Interesting, I can reproduce it. Following are my results. Even without interpolation, FLINT is taking more time than the Julia numbers you published. I need to investigate this.
```
--- FLINT Stat Results (100 loops) ---

A. Natural Step-size
Method      Time (ms)    FCalls   Accepted   Rejected
DOP54       0.328E+01    63617    10575      30
DOP853      0.172E+01    37349    2466       705
Verner65E   0.344E+01    53682    6640       78
Verner98R   0.281E+01    39628    2141       358

B. Interpolated Grid
Method      Time (ms)    FCalls   Accepted   Rejected
DOP54       0.594E+01    63617    10575      30
DOP853      0.344E+01    44747    2466       705
Verner65E   0.688E+01    60322    6640       78
Verner98R   0.578E+01    50333    2141       358
```
@Nicholaswogan So I tried to reproduce your Julia benchmark results without using the benchmarking function, using a simple loop instead, to make sure the FLINT test script and your script are identical. I am seeing much worse performance for the Julia code this way and I'm not entirely sure why. See your modified Julia script and the results on my machine below.
```julia
using OrdinaryDiffEq, StaticArrays, BenchmarkTools, Sundials, LSODA, ModelingToolkit

function lorenz_static(u,p,t)
    @inbounds begin
        dx = p[1]*(u[2]-u[1])
        dy = u[1]*(p[2]-u[3]) - u[2]
        dz = u[1]*u[2] - p[3]*u[3]
    end
    SA[dx,dy,dz]
end

u0 = SA[1.0,0.0,0.0]
p = SA[10.0,28.0,8/3]
tspan = (0.0,100.0)
prob = ODEProblem(lorenz_static,u0,tspan,p)

rtime = @elapsed for i in 1:100
    solve(prob,Tsit5(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8) # 2.612 ms (30 allocations: 59.22 KiB)
end
println("Tsit5: ", rtime/100*1000, " ms")

rtime = @elapsed for i in 1:100
    solve(prob,Vern8(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8); # 1.803 ms (31 allocations: 65.70 KiB)
end
println("Vern8: ", rtime/100*1000, " ms")
```
```
julia> include("TestLorenz.jl")
Lorenz
Tsit5: 15.803541000000001 ms
Vern8: 38.870095 ms

julia> include("TestLorenz.jl")
Lorenz
Tsit5: 13.455046 ms
Vern8: 38.820251999999996 ms
```
@princemahajan I get the same times as you when I run that exact Julia script. But for some reason I get faster times when I uncomment the `@btime` lines:
```julia
using OrdinaryDiffEq, StaticArrays, BenchmarkTools, Sundials, LSODA, ModelingToolkit

function lorenz_static(u,p,t)
    @inbounds begin
        dx = p[1]*(u[2]-u[1])
        dy = u[1]*(p[2]-u[3]) - u[2]
        dz = u[1]*u[2] - p[3]*u[3]
    end
    SA[dx,dy,dz]
end

u0 = SA[1.0,0.0,0.0]
p = SA[10.0,28.0,8/3]
tspan = (0.0,100.0)
prob = ODEProblem(lorenz_static,u0,tspan,p)

@btime solve(prob,Tsit5(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8) # 2.612 ms (30 allocations: 59.22 KiB)
@btime solve(prob,Vern8(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8); # 1.803 ms (31 allocations: 65.70 KiB)

rtime = @elapsed for i in 1:100
    solve(prob,Tsit5(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8)
end
println("Tsit5: ", rtime/100*1000, " ms")

rtime = @elapsed for i in 1:100
    solve(prob,Vern8(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8);
end
println("Vern8: ", rtime/100*1000, " ms")
```
The results are:

```
julia> include("TestLorenz.jl")
2.446 ms (31 allocations: 59.23 KiB)
1.399 ms (32 allocations: 65.72 KiB)
Tsit5: 2.60718273 ms
Vern8: 1.63626658 ms

julia> include("TestLorenz.jl")
2.450 ms (31 allocations: 59.23 KiB)
1.404 ms (32 allocations: 65.72 KiB)
Tsit5: 2.66984644 ms
Vern8: 1.7514147299999998 ms
```
Yes, I see that too. This is weird.
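One possible explanation is that the first `solve` call inside the `@elapsed` loop includes Julia's JIT compilation time, while `@btime` compiles everything before the loop runs, so the loop then measures runtime only. A minimal sketch (reusing `prob` from the script above) of a warm-up call that should make the plain loop agree with the `@btime` numbers:

```julia
# Warm-up call: compiles solve() for this problem/solver combination once,
# so the timed loop below measures solver runtime only.
solve(prob, Tsit5(), saveat=0.1, reltol=1.0e-8, abstol=1.0e-8)

rtime = @elapsed for i in 1:100
    solve(prob, Tsit5(), saveat=0.1, reltol=1.0e-8, abstol=1.0e-8)
end
println("Tsit5 (after warm-up): ", rtime/100*1000, " ms")
```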
I would also like to point out that in my benchmarking codes I generally perturb the initial conditions slightly before every integration in the loop. This is closer to reality, since in practice you are not going to integrate the exact same initial conditions over the same time span again and again. In Julia DiffEq this requires the ODEProblem() construction to be part of the timed loop, but in FLINT you pass the initial conditions directly to the Integrate() method, so FLINT is more efficient in that respect.
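For reference, a minimal sketch of what that would look like on the Julia side, reusing `lorenz_static`, `u0`, `tspan`, and `p` from the script above (the perturbation size and the use of `@SVector rand` are illustrative choices, not code from this thread):

```julia
# Perturb the initial state slightly before every integration, as in the FLINT
# benchmark loop; the ODEProblem construction is now inside the timed loop.
rtime = @elapsed for i in 1:100
    u0i = u0 .+ 1.0e-9 .* @SVector(rand(3))   # small random perturbation (arbitrary size)
    probi = ODEProblem(lorenz_static, u0i, tspan, p)
    solve(probi, Tsit5(), saveat=0.1, reltol=1.0e-8, abstol=1.0e-8)
end
println("Tsit5 (perturbed u0): ", rtime/100*1000, " ms")
```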
I thought I had found the issue, but no. With the `@cpuelapsed` macro, which is closer to the cpu_time() function I use in FLINT for timing the code, the results are always the same. FLINT's solvers do seem to be taking more time for this equation than Julia's Tsit5 and Vern8 solvers. If I can extract the number of function calls and the accepted/rejected steps from the Julia solvers, then we will get an idea of why that is.
```
julia> include("TestLorenz.jl")
Tsit5: 12.633589 ms
Vern8: 35.851274000000004 ms
Tsit5: 3.43 ms
Vern8: 1.72 ms
```
You can get Julia ODE solver stats with

```julia
sol = solve(prob,Tsit5(),saveat=0.1,reltol=1.0e-8,abstol=1.0e-8);
sol.destats
```
The result is:

```
DiffEqBase.DEStats
Number of function 1 evaluations:                  70647
Number of function 2 evaluations:                  0
Number of W matrix evaluations:                    0
Number of linear solves:                           0
Number of Jacobians created:                       0
Number of nonlinear solver iterations:             0
Number of nonlinear solver convergence failures:   0
Number of rootfind condition calls:                0
Number of accepted steps:                          11774
Number of rejected steps:                          0
```
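To line this up against FLINT's FCalls/Accepted/Rejected columns, the individual fields can also be printed directly (a sketch, assuming the `DEStats` field names `nf`, `naccept`, and `nreject`):

```julia
sol = solve(prob, Tsit5(), saveat=0.1, reltol=1.0e-8, abstol=1.0e-8);
# These fields roughly correspond to FLINT's FCalls, Accepted, and Rejected.
println("FCalls: ", sol.destats.nf,
        "  Accepted: ", sol.destats.naccept,
        "  Rejected: ", sol.destats.nreject)
```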
Ok thanks, I had seen this but had forgotten. It's amazing that Tsit5() is not rejecting any steps. It makes more function calls than FLINT but still takes less time.
Following are the test results from the latest FLINT code and the Julia DiffEq package. In these results, Julia's Tsit5 is slower, DP5 and DP8 take almost the same time, and Vern6 and Vern9 are faster than FLINT's corresponding solvers.
```
--- FLINT Stats ---

A. Internal Step-size
Method      Time(s)     FCalls   Accepted   Rejected   Total Steps
DOP54       0.127E+01   253225   42078      151        42229
DOP853      0.297E+00   74888    5839       438        6277
Verner65E   0.105E+01   177861   22123      125        22248
Verner98R   0.453E+00   73507    4385       223        4608

B. Interpolated Grid
Method      Time(s)     FCalls   Accepted   Rejected   Total Steps
DOP54       0.230E+01   253225   42078      151        42229
DOP853      0.672E+00   92405    5839       438        6277
Verner65E   0.266E+01   199984   22123      125        22248
Verner98R   0.123E+01   95432    4385       223        4608
```
```
julia> sol = main();
Tsit5: 1.4893084 s
DP5: 1.2646739 s
DP8: 0.2551725 s
Vern6: 0.8604503 s
Vern9: 0.2845905 s
```
Julia script:

```julia
##
using OrdinaryDiffEq, StaticArrays, BenchmarkTools, Sundials, LSODA, ModelingToolkit

function lorenz_static(u,p,t)
    @inbounds begin
        dx = p[1]*(u[2]-u[1])
        dy = u[1]*(p[2]-u[3]) - u[2]
        dz = u[1]*u[2] - p[3]*u[3]
    end
    SA[dx,dy,dz]
end

##
function main()
    u0 = SA[1.0,0.0,0.0]
    p = SA[10.0,28.0,8/3]
    tspan = (0.0,100.0)
    prob = ODEProblem(lorenz_static,u0,tspan,p)

    #@btime solve(prob,Tsit5(),saveat=0.1,reltol=1.0e-11,abstol=1.0e-14) # 2.612 ms (30 allocations: 59.22 KiB)
    #@btime solve(prob,Vern9(),saveat=0.1,reltol=1.0e-11,abstol=1.0e-14); # 1.803 ms (31 allocations: 65.70 KiB)

    sol = 0
    rtime = @elapsed for i in 1:100
        sol = solve(prob,Tsit5(),reltol=1.0e-11,abstol=1.0e-14,dense=false)
    end
    println("Tsit5: ", rtime, " s")

    rtime = @elapsed for i in 1:100
        sol = solve(prob,DP5(),reltol=1.0e-11,abstol=1.0e-14,dense=false);
    end
    println("DP5: ", rtime, " s")

    rtime = @elapsed for i in 1:100
        sol = solve(prob,DP8(),reltol=1.0e-11,abstol=1.0e-14,dense=false);
    end
    println("DP8: ", rtime, " s")

    rtime = @elapsed for i in 1:100
        sol = solve(prob,Vern6(),reltol=1.0e-11,abstol=1.0e-14,dense=false);
    end
    println("Vern6: ", rtime, " s")

    rtime = @elapsed for i in 1:100
        sol = solve(prob,Vern9(),reltol=1.0e-11,abstol=1.0e-14,dense=false);
    end
    println("Vern9: ", rtime, " s")

    return sol
end
```
Here are some micro Julia benchmarks for the Lorenz equations: https://github.com/Nicholaswogan/NumbaLSODA/tree/main/benchmark
With the best Julia methods, integrating to 100 seconds and interpolating takes a few milliseconds. I compared this to FLINT (see attached), which is slower by a factor of at least 5.
Does this look right?
I had to adjust some of the FLINT compiler options in the CMakeLists.txt because I was getting errors. So maybe these adjustments are responsible for the speed difference.
I used gfortran 11 and compiled in Release mode.
FLINT-benchmark.zip