JuliaFolds / Transducers.jl

Efficient transducers for Julia
https://juliafolds.github.io/Transducers.jl/dev/
MIT License
432 stars 24 forks source link

Tail-call function-barrier for arrays #403

Closed tkf closed 4 years ago

tkf commented 4 years ago

Commit Message

Tail-call function-barrier for arrays (#403)

This PR implements the "tail-call function-barrier" pattern for arrays with linear style indexing. This gives us a better performance for type-changing reduction where the iteration at which the type changes is unknown. A good example is filtered sum with no initial value (#404).

Interestingly, this version is a bit worse for sum of filtered floats on "short" arrays. Compare the result of ["filter_sum", "1000", "RandomFloats", "noinit", "xf"]. In my laptop, baseline is ~1.7 μs while this branch is ~2.1 μs. However, for 10x longer input, this branch is faster (> 2x). The benchmarks on CI also show similar results.

codecov[bot] commented 4 years ago

Codecov Report

Merging #403 into master will increase coverage by 0.06%. The diff coverage is 96.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #403      +/-   ##
==========================================
+ Coverage   90.76%   90.82%   +0.06%     
==========================================
  Files          25       25              
  Lines        1559     1581      +22     
==========================================
+ Hits         1415     1436      +21     
- Misses        144      145       +1     
Impacted Files Coverage Δ
src/processes.jl 92.85% <94.73%> (-0.12%) :arrow_down:
src/basics.jl 90.62% <100.00%> (+2.62%) :arrow_up:
src/core.jl 85.94% <100.00%> (+0.23%) :arrow_up:
src/library.jl 93.15% <100.00%> (+0.02%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 65db24b...ad7de66. Read the comment docs.

tkf commented 4 years ago
ID time ratio memory ratio
["collect", "identity-float"] 1.11 (5%) :x: 1.00 (1%)
["collect", "identity-union"] 0.48 (5%) :white_check_mark: 0.85 (1%) :white_check_mark:
["filter_sum", "1000", "RandomFloats", "noinit", "xf"] 1.33 (5%) :x: 1.00 (1%)
["filter_sum", "1000", "UnitRange", "noinit", "xf"] 0.11 (5%) :white_check_mark: 1.00 (1%)
["filter_sum", "10000", "RandomFloats", "noinit", "xf"] 0.37 (5%) :white_check_mark: 1.00 (1%)
["filter_sum", "10000", "UnitRange", "noinit", "xf"] 0.11 (5%) :white_check_mark: 1.00 (1%)
["gemm", "mul", "man", "false", "8"] 1.30 (5%) :x: 1.00 (1%)
["gemm", "mul", "man", "ivdep", "8"] 1.29 (5%) :x: 1.00 (1%)
["gemm", "mul", "man", "true", "8"] 1.35 (5%) :x: 1.00 (1%)
["gemm", "mul", "xf", "false", "8"] 1.33 (5%) :x: 1.00 (1%)
["gemm", "mul", "xf", "ivdep", "8"] 1.31 (5%) :x: 1.00 (1%)
["gemm", "mul", "xf", "true", "8"] 1.33 (5%) :x: 1.00 (1%)
["missing_argmax", "rf"] 0.60 (5%) :white_check_mark: 1.00 (1%)
["missing_argmax", "xf"] 0.58 (5%) :white_check_mark: 1.00 (1%)
["missing_dot", "rf"] 0.86 (5%) :white_check_mark: 1.00 (1%)
["overhead", "foldl"] 0.93 (5%) :white_check_mark: 1.00 (1%)

--- https://github.com/JuliaFolds/Transducers-data/blob/benchmark-results/2020/08/09/065111/result.md#results