JuliaDynamics / ResumableFunctions.jl

C# style generators a.k.a. semi-coroutines for Julia.

Benchmarking versus Tasks/Channels in Base #35

Closed non-Jedi closed 5 years ago

non-Jedi commented 5 years ago

I didn't trust the assertion in the README about the relative performance of resumable functions and Tasks/Channels, so I ran some simple benchmarks. I thought it might be a good idea to include them in the README or documentation:

julia> using BenchmarkTools, ResumableFunctions

julia> @resumable function resumable_fib(n::Int)::Int
         a = 0
         b = 1
         for i in 1:n
           @yield a
           a, b = b, a+b
         end
       end
resumable_fib (generic function with 1 method)

julia> function fib(c, n)
         a = 0
         b = 1
         for i in 1:n
           put!(c, a)
           a, b = b, a+b
         end end
fib (generic function with 1 method)

julia> function fib_wrapper(n)
         # This means we're doing equivalent work to resumable in constructing the iterator
         Channel(c -> fib(c, n); ctype=Int, csize=0)
       end
fib_wrapper (generic function with 1 method)

julia> struct FibN
         n::Int
       end

julia> function Base.iterate(f::FibN, state=(0,1,1))
         @inbounds last(state) === f.n && return nothing
         @inbounds state[2], (state[2], state[1] + state[2], state[3]+1)
       end

julia> sum(resumable_fib(1000))
9079565065540428012

julia> sum(fib_wrapper(1000))
9079565065540428012

julia> sum(FibN(1000))
9079565065540428012

julia> @benchmark sum(resumable_fib($10000))
BenchmarkTools.Trial: 
  memory estimate:  937.83 KiB
  allocs estimate:  30007
  --------------
  minimum time:     11.310 ms (0.00% GC)
  median time:      11.472 ms (0.00% GC)
  mean time:        11.970 ms (2.06% GC)
  maximum time:     63.081 ms (81.33% GC)
  --------------
  samples:          418
  evals/sample:     1

julia> @benchmark sum(fib_wrapper($10000))
BenchmarkTools.Trial: 
  memory estimate:  470.58 KiB
  allocs estimate:  30016
  --------------
  minimum time:     25.384 ms (0.00% GC)
  median time:      25.763 ms (0.00% GC)
  mean time:        26.636 ms (1.68% GC)
  maximum time:     86.521 ms (69.75% GC)
  --------------
  samples:          188
  evals/sample:     1

julia> @benchmark sum(FibN($10000))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     5.649 μs (0.00% GC)
  median time:      5.811 μs (0.00% GC)
  mean time:        5.953 μs (0.00% GC)
  maximum time:     16.782 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     6

julia> r = resumable_fib(200000)
getfield(Main, Symbol("##364"))(0x00, 1, 102404, 200000, 1, 140027874456368:140027767251840, nothing, 0)

julia> c = fib_wrapper(200000)
Channel{Int64}(sz_max:0,sz_curr:1)

julia> f = FibN(200000)
FibN(200000)

julia> @time sum(r)
  0.235206 seconds (600.01 k allocations: 18.311 MiB, 1.58% gc time)
-7071068302198173471

julia> @time sum(c)
  0.587648 seconds (600.00 k allocations: 9.155 MiB, 9.07% gc time)
-7071068302198173471

julia> @time sum(f)
  0.000422 seconds (5 allocations: 176 bytes)
-7071068302198173471

In general, ResumableFunctions.jl seems to be about twice as fast as the equivalent code using Channels. Interestingly, if I change the Channel csize to 1 instead of 0, the gap widens to roughly 4-5x. And increasing csize to 10 adds several orders of magnitude to the runtime.
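The csize effect is easy to reproduce by parameterizing the buffer size. A minimal sketch, using the Julia ≥ 1.3 spelling where the buffer size is a positional argument rather than the `csize` keyword used above (`fib_channel` is an illustrative name; a single `@elapsed` shot is crude compared to BenchmarkTools, but enough to see the trend):

```julia
# Hypothetical helper: same fib producer as above, but with a
# configurable channel buffer size.
function fib_channel(n, bufsize)
    Channel{Int}(bufsize) do c
        a, b = 0, 1
        for _ in 1:n
            put!(c, a)
            a, b = b, a + b
        end
    end
end

# Compare unbuffered vs. small buffers (single-shot timing after a
# warm-up run; BenchmarkTools' @belapsed would give steadier numbers).
for sz in (0, 1, 10)
    sum(fib_channel(10_000, sz))          # warm up / compile
    t = @elapsed sum(fib_channel(10_000, sz))
    println("bufsize = ", sz, ": ", round(t * 1e3; digits = 2), " ms")
end
```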

Also, obviously, neither holds a candle to just using the vanilla iteration interface with a custom type. Both are 3 orders of magnitude slower than that option.

❯ julia --version
julia version 1.1.1

I'm curious: has anyone looked into why Channels are so slow as iterators, and whether they can be improved to be at least on par with ResumableFunctions?
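My understanding (an assumption, not something I've profiled in depth) is that every `put!`/`take!` pair on an unbuffered Channel parks one task and wakes the other, so each element pays for a full task switch plus synchronization. A rough sketch of the per-item cost (`channel_item_cost` is an illustrative name):

```julia
# Estimate the rough per-item cost of pulling n items through an
# unbuffered Channel: each take! forces a task switch.
function channel_item_cost(n)
    ch = Channel{Int}(0) do c
        for i in 1:n
            put!(c, i)
        end
    end
    t = @elapsed foreach(_ -> nothing, ch)
    return t / n
end

channel_item_cost(10)                 # warm up / compile
cost = channel_item_cost(100_000)
println("≈ ", round(cost * 1e9; digits = 1), " ns per item")
```

At roughly a microsecond per item, 10_000 items would account for most of the millisecond-scale timings above, while the custom iterator pays none of that cost.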

non-Jedi commented 5 years ago

Looking over the cases in the benchmark folder (which I hadn't spotted before): not all of them do the same thing, and because nothing is done with the output, both test_resumable and test_closure_stm are being completely optimized away (maybe others as well; I didn't look that closely).


julia> @code_native test_resumable()
        .text
; ┌ @ benchmarks.jl:31 within `test_resumable'
        movq    %rsi, -8(%rsp)
; │ @ benchmarks.jl:33 within `test_resumable'
        movabsq $140230517973000, %rax  # imm = 0x7F89F635E008
        retq
; └

julia> @code_native test_closure_stm()
        .text
; ┌ @ benchmarks.jl:128 within `test_closure_stm'
        movq    %rsi, -8(%rsp)
; │ @ benchmarks.jl:130 within `test_closure_stm'
        movabsq $140230517973000, %rax  # imm = 0x7F89F635E008
        retq
; └
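One way to keep these benchmarks honest is to accumulate and return the result, so the compiler sees an observable output and cannot eliminate the loop. A sketch using the custom-iterator version from my earlier comment (`test_fib_sum` is a hypothetical name; the same fix would apply to test_resumable and friends):

```julia
# Same iterator as in the earlier benchmark.
struct FibN
    n::Int
end

function Base.iterate(f::FibN, state = (0, 1, 1))
    last(state) === f.n && return nothing
    state[2], (state[2], state[1] + state[2], state[3] + 1)
end

# Returning the accumulated sum gives the compiler an observable
# result, so the loop body can no longer be removed as dead code.
function test_fib_sum(n)
    s = 0
    for v in FibN(n)
        s += v
    end
    return s
end
```

With the result returned (and the argument interpolated, as in `@benchmark test_fib_sum($10_000)`), the generated native code actually contains the loop instead of a single constant `movabsq`/`retq`.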