Jutho / Strided.jl

A Julia package for strided array views and efficient manipulations thereof

Cannot use multithreading when accessing fields in a struct #10

Closed shipengcheng1230 closed 4 years ago

shipengcheng1230 commented 4 years ago

Hi, consider the following MWE:

julia> Threads.nthreads()
4

julia> using Strided, BenchmarkTools

julia> A = randn(100, 100);

julia> struct BB{T}
           b::T
       end

julia> bb = BB(rand(100));

julia> ff(a::T, b::T) where T = @fastmath a * log(abs(b))
ff (generic function with 1 method)

julia> @btime @strided ff.($A, $bb.b); # cpu usage always 100%
  117.046 μs (5 allocations: 78.34 KiB)

julia> @btime let b = bb.b # cpu usage can exceed 100%
           @strided ff.($A, b)
       end;
  50.644 μs (46 allocations: 82.14 KiB)

Also:

julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

So when one of the arguments is a field of a user-defined struct, @strided stops using multithreading. The same thing happens on a Linux machine. Do you have any idea why this happens? Thank you!

Jutho commented 4 years ago

Yes, apparently field access is not handled correctly by the @strided macro. I will try to fix this as soon as possible.

Jutho commented 4 years ago

Can you test with the latest version of Strided.jl (0.3.4)? This problem should be resolved there.

shipengcheng1230 commented 4 years ago

The latest version works! Thank you!

julia> @btime @strided ff.($A, $bb.b);
  47.551 μs (40 allocations: 81.89 KiB)

However, I found that the fused-broadcast macro @. isn't supported (should this be a separate issue?). Following the example above:

julia> c = randn(100, 100);

julia> @btime @strided c .= ff.(A, bb.b); # works!
  47.806 μs (42 allocations: 3.84 KiB)

julia> @btime @strided @. c = ff(A, bb.b); # should work as above?
  94.127 μs (8 allocations: 144 bytes)

julia> @btime @. c = ff(A, bb.b); # without multithreading
  97.249 μs (2 allocations: 48 bytes)

Jutho commented 4 years ago

Yes, these two expressions expand differently; I will have to investigate why:

julia> ex=:(@strided @. c = ff(A, bb.b))
:(#= REPL[11]:1 =# @strided #= REPL[11]:1 =# @__dot__(c = ff(A, bb.b)))

julia> macroexpand(Main, ex)
:(c .= Strided.maybeunstrided.(ff.(Strided.maybestrided.(A), Strided.maybestrided.(bb.b))))

julia> ex = :(@strided c .= ff.(A, bb.b))
:(#= REPL[13]:1 =# @strided c .= ff.(A, bb.b))

julia> macroexpand(Main, ex)
:(Strided.maybestrided(c) .= (Strided.maybestrided(ff)).(Strided.maybestrided(A), Strided.maybestrided(bb.b)))

Jutho commented 4 years ago

This makes sense: I guess @strided receives the expression with the inner @. still unexpanded, so it would need to expand it itself. I will have to refresh my understanding of macros to know how to do that correctly.

mcabbott commented 4 years ago

I just ran into this @. issue, and came here to ask about fixing it. I think it's as simple as inserting ex = macroexpand(__module__, ex) as the first line of each macro.
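
To illustrate the idea without depending on Strided.jl, here is a minimal sketch of the difference that line makes. The names has_macrocall, @sees_raw and @sees_expanded are hypothetical helpers made up for this example; they are not part of Strided.jl and do not reproduce what @strided does internally. The only real API relied on is Base's macroexpand(module, ex) and the implicit __module__ variable available inside every macro body.

```julia
# Hypothetical helper: does the expression tree still contain a macro call?
has_macrocall(ex) = false                      # symbols, literals, line nodes
has_macrocall(ex::Expr) = ex.head === :macrocall || any(has_macrocall, ex.args)

# An outer macro receives its argument unexpanded, so a nested `@.` shows up
# as a :macrocall node instead of dotted broadcast syntax.
macro sees_raw(ex)
    return has_macrocall(ex)
end

# Expanding inner macros first (the proposed one-line fix) hands the rest of
# the macro body ordinary broadcast syntax such as `c .= f.(a)`.
macro sees_expanded(ex)
    ex = macroexpand(__module__, ex)
    return has_macrocall(ex)
end

# c, f and a need not exist: both macros only inspect syntax and return a Bool.
raw = @sees_raw @. c = f(a)            # true: the inner @. is still a macro call
expanded = @sees_expanded @. c = f(a)  # false: @. was expanded before inspection
```

A macro that rewrites broadcast expressions can only recognize the dotted calls after this expansion step, which would explain why the @strided @. form above ran without multithreading.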

Jutho commented 4 years ago

Thanks for the fix, @mcabbott.