Open esc opened 4 years ago
We do want to measure compile time. The benchmark can make a fresh dispatcher and run the .compile
method explicitly
I have updated the tests to seed the RNG and to time compilation. Here is a snapshot of a run on my machine:
[ 56.25%] ··· bench_typed_list.ConstructionSuite.time_construct_from_python_list 96.9±2ms
[ 62.50%] ··· bench_typed_list.ConstructionSuite.time_construct_in_njit_function 2.17±0.09ms
[ 68.75%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath 21.1±0.4μs
[ 75.00%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath 21.8±0.1μs
[ 81.25%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath 195±6μs
[ 87.50%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath 201±2μs
[ 93.75%] ··· bench_typed_list.SortSuite.time_compile_sort 216±3ms
[100.00%] ··· bench_typed_list.SortSuite.time_execute_sort 116±3ms
With recent updates, these are the current benchmarks for the changes introduced by: https://github.com/numba/numba/pull/6278
All benchmarks:
before after ratio
[3b3eab89] [05ce51c6]
<pull/5543/merge~1> <pull/6278/head~1>
103±3ms 103±3ms 1.00 bench_typed_list.ConstructionSuite.time_construct_from_python_list
2.37±0.09ms 2.39±0.07ms 1.01 bench_typed_list.ConstructionSuite.time_construct_in_njit_function
3.603527119826277e-05±4e-06 2.7983062694041507e-05±9.5e-07 ~0.78 bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_fastmath
4.161973561003964e-05±9.1e-06 2.9423185928547246e-05±3.7e-06 ~0.71 bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_no_fastmath
0.0035349028767086565±0.00034 0.0025243946991395207±2e-05 ~0.71 bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_fastmath
0.0031963562505552545±0.00021 0.0027488698993693105±0.0002 ~0.86 bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_no_fastmath
62.6±2ms 62.7±2ms 1.00 bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_fastmath
61.9±1ms 61.2±0.7ms 0.99 bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
2.59±0.03ms 2.52±0.07ms 0.97 bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_fastmath
2.58±0.04ms 2.50±0.04ms 0.97 bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
29.1±0.8μs 28.0±0.6μs 0.96 bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_fastmath
29.3±0.5μs 31.6±0.4μs 1.08 bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_no_fastmath
2.64±0.1ms 2.55±0.04ms 0.97 bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_fastmath
2.69±0.1ms 2.47±0.07ms 0.92 bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_no_fastmath
2.6585843983129173e-05±2.1e-06 2.2718447455970037e-05±2.8e-07 ~0.85 bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_fastmath
2.4821427593867003e-05±2.2e-06 2.3370140742376312e-05±3.2e-07 0.94 bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_no_fastmath
0.00023403224115879442±5.1e-06 0.00021722948935967772±3.8e-06 0.93 bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_fastmath
0.0002068479817932133±1.5e-05 0.00022312129993224517±7e-06 1.08 bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_no_fastmath
+ 23.3±0.8μs 26.7±0.9μs 1.14 bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_fastmath
- 27.0±0.8μs 24.4±0.3μs 0.90 bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
227±20μs 271±4μs ~1.19 bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_fastmath
231±20μs 246±10μs 1.07 bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
22.3±0.4μs 26.1±0.7μs ~1.17 bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_fastmath
+ 22.8±0.5μs 27.2±0.2μs 1.19 bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_no_fastmath
212±7μs 235±4μs ~1.11 bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_fastmath
198±5μs 231±4μs ~1.17 bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_no_fastmath
n/a n/a n/a bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath
n/a n/a n/a bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath
n/a n/a n/a bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath
n/a n/a n/a bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath
235±6ms 233±5ms 0.99 bench_typed_list.SortSuite.time_compile_sort
120±2ms 113±2ms 0.94 bench_typed_list.SortSuite.time_execute_sort
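As a reading aid for the ratio column: it is after divided by before, so values below 1 mean the PR branch was faster. The sketch below recomputes the ratio for three rows copied from the table and compares it against a threshold factor of 1.1. That factor is an assumption on my part (the rows above look consistent with it), and the statistical check ASV applies before printing + or - is not reproduced here.

```python
# (before, after) timings copied from three rows of the table above.
rows = {
    "SortSuite.time_execute_sort": (120.0, 113.0),  # ms
    "IteratorReductionSuiteFloat.time_compile_reduction_sum_no_fastmath": (27.0, 24.4),  # μs
    "IteratorReductionSuiteInt.time_compile_reduction_sum_no_fastmath": (22.8, 27.2),  # μs
}

FACTOR = 1.1  # assumed threshold, not read from the ASV configuration


def classify(before, after, factor=FACTOR):
    # ratio > 1 means slower after the change, ratio < 1 means faster.
    ratio = after / before
    if ratio > factor:
        verdict = "slower beyond threshold"
    elif ratio < 1.0 / factor:
        verdict = "faster beyond threshold"
    else:
        verdict = "within threshold"
    return round(ratio, 2), verdict


results = {name: classify(b, a) for name, (b, a) in rows.items()}
for name, (ratio, verdict) in results.items():
    print(f"{ratio:.2f}  {verdict:<24}  {name}")
```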
This is an initial stab at the ASV tests for the numba.typed.List.
Things that still need to be decided on:
Lastly, here is a snapshot of what it looks like when run on my laptop: