Closed: oscarbenjamin closed this 4 years ago
Can you paste the timings for these? We should try to keep each benchmark under a second at least.
These benchmarks are for things that are made much faster by sympy/sympy#18844.
With that PR they are fast (or will be when it is finished) so I guess I'm adding them as aspirational benchmarks.
If things are too slow for previous versions, I think that they should be blacklisted for previous versions and we should only test for regressions after.
I guess that the benchmarks need to work on current master when they get merged. I've scaled them back a bit. These are the new timings:
· Creating environments
· Discovering benchmarks
· Running 10 total benchmarks (1 commits * 1 environments * 10 benchmarks)
[ 0.00%] · For sympy commit ce1e53f0:
[ 0.00%] ·· Benchmarking virtualenv-py3.8-fastcache-mpmath-numpy
[ 5.00%] ··· Running (solve.TimeRationalSystem.time_linsolve--)..........
[ 55.00%] ··· solve.TimeRationalSystem.time_linsolve ok
[ 55.00%] ··· ======== =============
param1
-------- -------------
1 793±40μs
3 1.78±0.03ms
5 3.56±0.2ms
10 28.1±0.5ms
======== =============
[ 60.00%] ··· solve.TimeRationalSystem.time_solve ok
[ 60.00%] ··· ======== =============
param1
-------- -------------
1 1.18±0.06ms
3 5.93±0.2ms
5 14.5±0.04ms
10 76.5±0.4ms
======== =============
[ 65.00%] ··· solve.TimeRationalSystemSymbol.time_linsolve ok
[ 65.00%] ··· ======== =============
param1
-------- -------------
1 6.03±0.04ms
3 43.7±0.9ms
5 112±4ms
======== =============
[ 70.00%] ··· solve.TimeRationalSystemSymbol.time_solve ok
[ 70.00%] ··· ======== =============
param1
-------- -------------
1 1.28±0.01ms
3 6.66±0.2ms
5 15.1±0.1ms
======== =============
[ 75.00%] ··· solve.TimeSolveSparsePolySystem.time_solve ok
[ 75.00%] ··· ======== ============
param1
-------- ------------
1 16.0±0.2ms
2 186±2ms
3 567±3ms
======== ============
[ 80.00%] ··· solve.TimeSparseSystem.time_linear_eq_to_matrix ok
[ 80.00%] ··· ======== ============
param1
-------- ------------
10 4.15±0.1ms
20 15.3±0.6ms
30 33.3±0.4ms
======== ============
[ 85.00%] ··· solve.TimeSparseSystem.time_linsolve_Aaug ok
[ 85.00%] ··· ======== ============
param1
-------- ------------
10 15.7±0.4ms
20 54.0±0.2ms
30 116±0.5ms
======== ============
[ 90.00%] ··· solve.TimeSparseSystem.time_linsolve_Ab ok
[ 90.00%] ··· ======== ============
param1
-------- ------------
10 15.8±0.5ms
20 53.1±0.4ms
30 114±0.6ms
======== ============
[ 95.00%] ··· solve.TimeSparseSystem.time_linsolve_eqs ok
[ 95.00%] ··· ======== ============
param1
-------- ------------
10 20.6±0.2ms
20 70.1±6ms
30 153±0.7ms
======== ============
[100.00%] ··· solve.TimeSparseSystem.time_solve ok
[100.00%] ··· ======== ============
param1
-------- ------------
10 58.8±0.8ms
20 315±3ms
30 707±8ms
======== ============
The reason some of these examples fail with larger benchmarks is because of hidden quadratic behaviour that only comes into play with larger systems of equations. My PR eliminates those but it's hard to see that effect if you only look at smaller systems.
I think that they should be blacklisted for previous versions
Is there a mechanism for blacklisting benchmarks?
https://asv.readthedocs.io/en/stable/writing_benchmarks.html#setup-and-teardown-functions — asv uses `NotImplementedError` raised in `setup` for skipping a benchmark.
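As a rough sketch of that mechanism (the class name, parameters, and version cutoff here are illustrative, not from this PR): asv skips a benchmark for a given environment when its `setup` raises `NotImplementedError`, so the blacklisting could look something like:

```python
# Stand-in for the installed version; a real benchmark would inspect
# sympy.__version__ instead of hard-coding a tuple.
SYMPY_VERSION = (1, 5)

class TimeLargeLinearSystem:
    # Parameter values are illustrative only.
    params = [10, 20, 30]

    def setup(self, n):
        # Blacklist this benchmark on versions known to be too slow;
        # asv treats NotImplementedError in setup as "skip".
        if SYMPY_VERSION < (1, 6):
            raise NotImplementedError("too slow before sympy/sympy#18844")
        self.eqs = list(range(n))

    def time_linsolve(self, n):
        # Placeholder body standing in for the real solve call.
        sum(self.eqs)
```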
That's for making the timings faster. LGTM.
I think we should have a range of time lengths. Many benchmarks naturally have a parameter that makes them take longer so this isn't difficult.
The problem is that if a benchmark is too fast, then it can end up timing something else, like the overhead of the core or some unrelated helper functions. It can also be harder to detect speedups or slowdowns. For instance, if a function has 500 ms of overhead and spends 500 ms on the actual algorithm, then if the algorithm becomes 50% faster, it will only show up as a 25% improvement (500 + 250 = 750 ms vs. 1000 ms). But if you instead take a version that spends 4 s on the main algorithm, you will see something much closer to 50% (2.5 s vs. 4.5 s).
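Just to double-check the arithmetic above (the numbers are the ones from the example, not measurements):

```python
# With fixed overhead, a 50% algorithm speedup is diluted in the total
# measured time; the dilution shrinks as the algorithm dominates.
OVERHEAD = 0.5  # seconds of fixed overhead

def observed_speedup(algo_seconds):
    """Fractional reduction in total time when the algorithm halves."""
    before = OVERHEAD + algo_seconds
    after = OVERHEAD + algo_seconds / 2  # algorithm becomes 50% faster
    return 1 - after / before

small = observed_speedup(0.5)  # 1.0 s -> 0.75 s: only 25% observed
big = observed_speedup(4.0)    # 4.5 s -> 2.5 s: about 44% observed
```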
But I also sympathize with the fact that this makes the benchmarks much slower to run, so there needs to be a balance, and a good way to run such things optionally (I don't know if there is anything like that right now). Plus sometimes you do want to detect slowdowns in overhead stuff as well, so having both types of benchmarks is helpful.
I'll merge this now. Once the behaviour on sympy master is actually improved I'll scale up the benchmarks and add blacklisting.
Adds benchmarks for solve and linsolve for a few cases that came up while working on https://github.com/sympy/sympy/pull/18844: