Closed ChrisRackauckas closed 5 years ago
Dear @ChrisRackauckas
Cool! Nice to see a systematic comparison. Crazy teaching week this week, but next week during my protected time at the ATI I'll work on formatting some of my problems into a PR for https://github.com/JuliaDiffEq/DiffEqBenchmarks.jl.
Don't worry about making the full notebook. As long as there's some reasonably well-contained way to create the ODEProblem (though it can be quite involved, like in the Filament problem), then I can take it from there. At this point I have quite a system for putting it together, so I can churn that part out pretty easily, and these are all run on the same computer (I should probably put the versioninfo() at the bottom, but it's a 12-core Xeon, so it should be pretty much between PC and cluster performance).
Excellent review, thank you @ChrisRackauckas. Also, I'll mention issue #14, for which I have no clue. This would save one from declaring a context at every call to lsoda, as can be done in the C version.
Oh, we've been doing that in the common interface for a while. It's just about setting itask, just like CV_NORMAL vs CV_ONE_STEP in Sundials. Of course, if you modify the context data at all you have to reset it, because it's a multistep method, so it has to know to "go back to Euler"; and yes, that is "expensive" in the sense that it'll take more steps, but it doesn't (shouldn't? I haven't checked for LSODA yet, but I know it holds for Sundials) require any allocations.
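For reference, the common-interface analogue of LSODA's one-step mode is the integrator interface. A minimal sketch (using OrdinaryDiffEq's Tsit5 here purely for illustration; the same init/step!/u_modified! calls are what the common interface exposes for any solver):

```julia
using OrdinaryDiffEq

# A small test ODE: exponential decay u' = -u.
f(u, p, t) = -u
prob = ODEProblem(f, 1.0, (0.0, 1.0))

# init + step! is the common-interface equivalent of CV_ONE_STEP /
# calling lsoda() repeatedly with the same context: the integrator
# keeps its multistep history between calls.
integrator = init(prob, Tsit5())
step!(integrator)             # advance one internal step
step!(integrator, 0.5, true)  # advance exactly 0.5 time units

# Modifying the state invalidates the history, so tell the
# integrator to reset ("go back to Euler" in multistep terms):
integrator.u = 2.0
u_modified!(integrator, true)
```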
I know that in the C lib of lsoda you can keep the context and take one more step with the same context; this is int lsoda(struct lsoda_context_t * ctx, double *y, double *t, double tout); which is called repeatedly here.
Yes, look at the common interface.
@JuliaRegistrator register()
Registration pull request created: JuliaRegistries/General/3358
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if Julia TagBot is installed, or can be done manually through the github interface, or via:
git tag -a v0.6.0 -m "<description of version>" 5ea35ea80844b9a2c40bac3612cb8d9ccc5bb432
git push origin v0.6.0
Hey, I have some nice updates on this topic. The wrappers seem pretty optimal now: I've made sure as much inlines as possible and the allocations are all gone. GC handling is all fixed even with the pointer handling (hence the GC preserve statements). So I think LSODA.jl from the common interface is, for the benchmarks at least, at its "good place". Here is a summary of the following benchmarks:
Non-stiff: http://benchmarks.juliadiffeq.org/html/NonStiffODE/linear_wpd.html http://benchmarks.juliadiffeq.org/html/NonStiffODE/ThreeBody_wpd.html http://benchmarks.juliadiffeq.org/html/NonStiffODE/Pleiades_wpd.html http://benchmarks.juliadiffeq.org/html/NonStiffODE/RigidBody_wpd.html http://benchmarks.juliadiffeq.org/html/NonStiffODE/FitzhughNagumo_wpd.html http://benchmarks.juliadiffeq.org/html/NonStiffODE/LotkaVolterra_wpd.html
Stiff: http://benchmarks.juliadiffeq.org/html/StiffODE/VanDerPol.html http://benchmarks.juliadiffeq.org/html/StiffODE/ROBER.html http://benchmarks.juliadiffeq.org/html/StiffODE/Orego.html http://benchmarks.juliadiffeq.org/html/StiffODE/Hires.html http://benchmarks.juliadiffeq.org/html/StiffODE/Pollution.html
We thoroughly destroy LSODA in the non-stiff benchmarks, so there's not much to comment on there. The interesting ones are the stiff benchmarks. If the problem is small (so not Pollution; i.e., <20 ODEs or so) and stiff, and you use high tolerances (reltol >= 1e-3), then LSODA does very well. It's currently the best in that category if you are measuring only the error of the final point. If you're using the timeseries error, then sometimes Rosenbrock23 is able to match it in this domain. It hits a cliff the moment you want more than 3 digits of accuracy, and I think the OREGO results show it well:
But that does mean that LSODA does best on many of the small stiff example problems people use with default tolerances. It's the only case where it comes out on top, but it's a very important case 😄.
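For anyone wanting to reproduce this kind of comparison, a work-precision sweep can be sketched with DiffEqDevTools. The setup below is illustrative (the tolerance ranges and setups are my choices, not the exact benchmark configuration; the real scripts live in DiffEqBenchmarks.jl):

```julia
using OrdinaryDiffEq, LSODA, Sundials, DiffEqDevTools

# ROBER: the classic small stiff test problem (3 ODEs).
function rober!(du, u, p, t)
    y1, y2, y3 = u
    k1, k2, k3 = p
    du[1] = -k1*y1 + k3*y2*y3
    du[2] =  k1*y1 - k2*y2^2 - k3*y2*y3
    du[3] =  k2*y2^2
end
prob = ODEProblem(rober!, [1.0, 0.0, 0.0], (0.0, 1e5), (0.04, 3e7, 1e4))

abstols = 1.0 ./ 10.0 .^ (5:8)
reltols = 1.0 ./ 10.0 .^ (3:6)
setups = [Dict(:alg => lsoda()),
          Dict(:alg => Rosenbrock23()),
          Dict(:alg => CVODE_BDF())]

# A high-accuracy reference solution, then a sweep measuring only
# the endpoint error (the regime where lsoda shines at loose tolerances).
test_sol = solve(prob, Rodas5(), abstol=1e-12, reltol=1e-12)
wp = WorkPrecisionSet(prob, abstols, reltols, setups;
                      appxsol=test_sol, error_estimate=:final)
```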
We have narrowed down that a lot of this behavior may be because of how OpenBLAS is handling small matrix factorizations. One of the big improvements on the DiffEq side in these small stiff benchmarks required stating:
using LinearAlgebra
LinearAlgebra.BLAS.set_num_threads(1)
since allowing OpenBLAS to use threads (the default) when factorizations are small is really, really awful for performance. I wonder if we can skip OpenBLAS for small matrices and have a pure-Julia base case (since DiffEq is now using https://github.com/YingboMa/RecursiveFactorization.jl, we have a lot more control). @YingboMa let's look into the choice at https://github.com/YingboMa/RecursiveFactorization.jl/blob/master/src/lu.jl#L20 .
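To see the effect for yourself, one can time a small LU factorization with and without BLAS threading; a rough sketch (the matrix size and thread counts are illustrative, and results vary by machine):

```julia
using LinearAlgebra, BenchmarkTools

# A Jacobian-sized matrix for a small stiff ODE system.
A = rand(8, 8)

# Multithreaded OpenBLAS (the default behavior).
LinearAlgebra.BLAS.set_num_threads(Sys.CPU_THREADS)
t_threaded = @belapsed lu($A)

# Single-threaded OpenBLAS, as used in the benchmarks.
LinearAlgebra.BLAS.set_num_threads(1)
t_serial = @belapsed lu($A)

# On many machines the single-threaded factorization wins for
# matrices this small: thread synchronization overhead dominates
# the actual flops.
```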
Anyways, both this BLAS issue and the algorithm itself (it doesn't scale because of its factorization choices, but that's a bigger issue) are the reasons why LSODA doesn't seem to scale so well. At the same time, this BLAS behavior interacting with the algorithm makes LSODA.jl perfect for these small cases, so we'll work to overcome this case 😅.
That's a great summary! Thank you!
I am looking forward to seeing an algo beating LSODA. I can only benefit from this :D
Thank you a lot for this.
Hey @rveltz @sdwfrost
I wanted to let you know that this library got to a good enough point that I could run it through the suite of benchmarks. You can find the results uploaded at https://github.com/JuliaDiffEq/DiffEqBenchmarks.jl
A quick summary of how lsoda performs. It does not do very well on non-stiff problems; in that case Runge-Kutta methods consistently perform better. However, at high tolerances on stiff problems it performs very well, among the best, and it is usually more efficient than Sundials' CVODE_BDF for small problems (for large problems, or with an expensive f, CVODE_BDF is the most efficient). When tolerances get lower, though, lsoda tends to have weird behavior and its time diverges to infinity. The sweet spot on stiff problems seems to be about:
where lsoda does quite well. Just thought I'd share it with you. @sdwfrost mentioned that he has some problems where it does extremely well, so I was wondering if you'd be willing to share that problem to add it to the set of benchmarks. Looking at these, it seems stiffness detection and switching is not worthwhile when problems are non-stiff, but very much worthwhile when they are stiff!
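In common-interface terms, hitting that sweet spot just means requesting loose tolerances on a small stiff system; a minimal sketch (the problem and tolerance values are my choices for illustration):

```julia
using DiffEqBase, LSODA

# Van der Pol with a large stiffness parameter: small (2 ODEs) and stiff.
function vanderpol!(du, u, p, t)
    μ = p
    du[1] = u[2]
    du[2] = μ * ((1 - u[1]^2) * u[2] - u[1])
end
prob = ODEProblem(vanderpol!, [2.0, 0.0], (0.0, 6.3), 1e6)

# Loose tolerances (reltol >= 1e-3) are where lsoda is competitive
# on small stiff systems.
sol = solve(prob, lsoda(); reltol=1e-3, abstol=1e-6)
```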
Because of these benchmarks, I will be taking the time to add the integrator interface and callbacks to lsoda. I wouldn't expect it to do that well if there are lots of events (since it's a multistep method, of course), but for low numbers of events it's likely quite good. I hope LSODA.jl can join the standard DiffEq grouping sometime in the future, though I understand if it cannot (though I will give you JuliaDiffEq membership and allow you to retain admin status of the repo; it would mostly be for maintenance and QC reasons).
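For the events case, here is a rough idea of what callback support looks like from the common interface. ContinuousCallback is the standard DiffEq API; Tsit5 stands in as the solver since the lsoda-side support is what's being added:

```julia
using OrdinaryDiffEq

# Bouncing ball: u[1] is height, u[2] is velocity.
f!(du, u, p, t) = (du[1] = u[2]; du[2] = -9.81)

condition(u, t, integrator) = u[1]               # event fires when height hits 0
affect!(integrator) = (integrator.u[2] *= -0.9)  # reflect velocity with damping
cb = ContinuousCallback(condition, affect!)

prob = ODEProblem(f!, [1.0, 0.0], (0.0, 5.0))
sol = solve(prob, Tsit5(); callback=cb)
```

Each event forces a multistep method to restart from low order, which is why frequent events hurt lsoda more than they hurt one-step Runge-Kutta methods.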