oscardssmith closed this 3 days ago
This should get tested
Agreed. As I said, I still need to do some performance testing.
Benchmarks:
using OrdinaryDiffEq, BenchmarkTools  # BenchmarkTools provides @btime
f(du, u, p, t) = du .= u
prob = ODEProblem(f, rand(1), (0.0, 1.0))
@btime solve(prob, Rosenbrock23())
# before
12.360 μs (126 allocations: 9.62 KiB)
# after
12.520 μs (126 allocations: 9.62 KiB)
@btime solve(prob, Vern7(lazy=false), abstol=1e-9, reltol=1e-9)
# before
12.200 μs (220 allocations: 16.52 KiB)
# after
12.330 μs (220 allocations: 16.52 KiB)
Seems to be well within the noise (especially considering that a single-element array is literally the worst case, since any overhead would be magnified).
That's using @btime though. What's the compilation performance?
Purely noise. Each run below was done in a fresh REPL.
Before:
julia> @time solve(prob, Rosenbrock23());
1.691462 seconds (1.87 M allocations: 128.175 MiB, 1.51% gc time, 99.98% compilation time)
julia> @time solve(prob, Rosenbrock23());
1.747171 seconds (1.87 M allocations: 128.154 MiB, 1.83% gc time, 99.98% compilation time)
julia> @time solve(prob, Rosenbrock23());
1.843381 seconds (1.87 M allocations: 128.154 MiB, 4.54% gc time, 99.99% compilation time)
After:
julia> @time solve(prob, Rosenbrock23());
1.805668 seconds (2.35 M allocations: 159.500 MiB, 2.87% gc time, 99.99% compilation time)
julia> @time solve(prob, Rosenbrock23());
1.797002 seconds (2.35 M allocations: 159.529 MiB, 2.47% gc time, 99.99% compilation time)
julia> @time solve(prob, Rosenbrock23());
1.837029 seconds (2.35 M allocations: 159.529 MiB, 2.13% gc time, 99.98% compilation time)
Looks like that's actually measurable? But it's small enough now that we can do this. It used to be something like a 10-second difference; I'm happy that's solved.
Also, the other reason this PR is clearly needed is that it found a bug in Rosenbrock23 due to a previously untested path :)
With FastBroadcast.jl, we expect the only benefit of for loops over broadcasting to be macroexpand/lowering time, which makes this extra code a pure load-time pessimization: lowering doesn't care about types, so having both versions is strictly worse than having only one. I need to do some tests to make sure this doesn't regress performance, but I'm pretty sure it won't.
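To illustrate the point, here is a minimal sketch (pure Base Julia, no FastBroadcast.jl dependency; the function names are made up for illustration) showing that a fused in-place broadcast and a hand-written loop compute the same update — so keeping both code paths only duplicates lowering and compile work, not runtime behavior:

```julia
# In-place update via fused broadcast: no temporaries allocated.
function rhs_broadcast!(du, u)
    du .= 2 .* u .+ 1
    return du
end

# The equivalent explicit loop.
function rhs_loop!(du, u)
    @inbounds for i in eachindex(du, u)
        du[i] = 2 * u[i] + 1
    end
    return du
end

u = rand(4)
du1, du2 = similar(u), similar(u)
rhs_broadcast!(du1, u)
rhs_loop!(du2, u)
du1 == du2  # identical results from both paths
```

Since both paths lower to the same elementwise computation, any observed difference should be confined to compile time, which is what the fresh-REPL timings above probe.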