YingboMa / FastBroadcast.jl

MIT License

add thread=(true/false) kwarg #19

Closed chriselrod closed 3 years ago

chriselrod commented 3 years ago

The threaded version allocates for some reason (1 allocation, 48 bytes in the last benchmark), while the serial versions do not:

julia> x = rand(8192*4); y = similar(x);

julia> @benchmark @. $y = log($x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  217.880 μs … 249.976 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     228.026 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   227.896 μs ±   1.139 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                     ▄▄           ▁█▅     ▂▂    ▁
  ▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅██▄▁▁▁▃▇█▅▃▄▁████▁▁▁▆███▇▇ █
  218 μs        Histogram: log(frequency) by time        230 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark @.. $y = log($x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  212.818 μs … 297.649 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     227.193 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   227.270 μs ±   1.778 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                            ▃         ▅█▂  ▁▃▂  ▁
  ▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇█▃▁▁▁▆▅▃▃▃███▃▁███▇ █
  213 μs        Histogram: log(frequency) by time        229 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark @.. thread=true $y = log($x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  13.383 μs … 162.282 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.693 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   13.968 μs ±   2.754 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃██▅▁                                                        ▂
  █████▄▃▁▁▁▁▁▁▃▄▄▆▆▆▄▃▃▁▁▁▃▁▄▃▁▁▁▁▁▁▁▁▃▃▁▅▁▃▃▃▁▄▁▁▃▃▅▅▆▆▅▃▅▇█ █
  13.4 μs       Histogram: log(frequency) by time      22.3 μs <

 Memory estimate: 48 bytes, allocs estimate: 1.

I'm taking the approach of batching over fast_broadcast! calls to avoid closures inside @generated functions. I'm making the ::True/::False the first argument of fast_broadcast! to avoid a second branch in the call constructor within the macro. It's an internal method (I assume), so I figured it isn't too important, but I can change it if it's considered poor form.
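For readers unfamiliar with the pattern, here is a minimal sketch of dispatching on a static Boolean passed as the first argument, with the threaded method batching over calls to the serial one. Everything in it is an assumption made for illustration: `True`, `False`, `toy_broadcast!`, and `@toy` are hypothetical stand-ins, not the definitions in this PR, and `Threads.@spawn` is swapped in for whatever batching mechanism the PR actually uses (spawning tasks allocates, so this sketch does not reproduce the benchmark numbers above). It also assumes 1-based vectors and a single-argument function.

```julia
# Hypothetical static Boolean types; stand-ins, not FastBroadcast's own definitions.
struct True end
struct False end

# Serial kernel: a plain loop over the shared indices.
function toy_broadcast!(::False, f, dst::AbstractVector, src::AbstractVector)
    @inbounds for i in eachindex(dst, src)
        dst[i] = f(src[i])
    end
    return dst
end

# Threaded method: split the index range into chunks and run the serial
# kernel on a view of each chunk, so the loop body lives in one place only.
function toy_broadcast!(::True, f, dst::AbstractVector, src::AbstractVector)
    Base.require_one_based_indexing(dst, src)
    n = length(dst)
    n == length(src) || throw(DimensionMismatch("dst and src lengths differ"))
    n == 0 && return dst
    len = cld(n, min(Threads.nthreads(), n))  # chunk length
    @sync for start in 1:len:n
        r = start:min(start + len - 1, n)
        Threads.@spawn toy_broadcast!(False(), f, view(dst, r), view(src, r))
    end
    return dst
end

# Toy macro: `@toy thread=true dst = f(src)` or `@toy dst = f(src)`.
# The option only selects the value of the first argument; the emitted
# call has the same shape in both cases.
macro toy(args...)
    ex = args[end]
    flag = False()
    for opt in args[1:end-1]
        if opt isa Expr && opt.head === :(=) && opt.args[1] === :thread && opt.args[2] isa Bool
            flag = opt.args[2] ? True() : False()
        else
            throw(ArgumentError("unsupported option: $opt"))
        end
    end
    ok = ex isa Expr && ex.head === :(=) && ex.args[2] isa Expr &&
         ex.args[2].head === :call && length(ex.args[2].args) == 2
    ok || throw(ArgumentError("expected an expression of the form `dst = f(src)`"))
    dst = ex.args[1]
    f, src = ex.args[2].args
    return :(toy_broadcast!($flag, $(esc(f)), $(esc(dst)), $(esc(src))))
end

# Usage: both forms fill `y` with log.(x).
x = rand(8192 * 4); y = similar(x)
@toy y = log(x)
@toy thread=true y = log(x)
```

Because the flag is a type rather than a runtime `Bool`, the choice between the serial and threaded methods is resolved by dispatch at compile time, and the macro only has to prepend one value to an otherwise identical call.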

codecov-commenter commented 3 years ago

Codecov Report

Merging #19 (a6761c4) into master (728dba6) will decrease coverage by 0.54%. The diff coverage is 84.61%.


@@            Coverage Diff             @@
##           master      #19      +/-   ##
==========================================
- Coverage   90.29%   89.74%   -0.55%     
==========================================
  Files           1        1              
  Lines         206      234      +28     
==========================================
+ Hits          186      210      +24     
- Misses         20       24       +4     
Impacted Files          Coverage Δ
src/FastBroadcast.jl    89.74% <84.61%> (-0.55%) ↓
