Closed giordano closed 2 years ago
Thanks for filing the issue. That's an interesting structure. I'll try to submit a fix later today. Would be great if you could then test it (I don't have access to Fugaku).
@giordano What do you get for Sys.CPU_THREADS on the system?
I seem to remember it's 48, but can tell you for sure later today.
Yeah, I wonder whether it's 50 or 48. (I would expect 50.)
I've updated gather_sysinfo_lscpu on ThreadPinning.jl#main to support Fugaku. Feel free to check it out and confirm that threadinfo works. Of course, it would be great if you could test the other functionality (like pinning) as well. Perhaps even run ] test on Fugaku.
Ok, I was wrong:
julia> Sys.CPU_THREADS
50
julia> Threads.nthreads()
48
julia> threadinfo(; color=false)
| _,_,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
42,43,_,45,46,47,48,49,50,51,52,53,54,55,56,57,
58,59 |
# = Julia thread, | = Socket seperator
Julia threads: 48
├ Occupied CPU-threads: 47(!)
└ Mapping (Thread => CPUID): 1 => 14, 2 => 57, 3 => 18, 4 => 20, 5 => 16, ...
However:
julia> pinthreads(:compact)
ERROR: BoundsError: attempt to access 50-element Vector{Bool} at index [51]
Stacktrace:
[1] getindex
@ ./array.jl:861 [inlined]
[2] ishyperthread
@ /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/querying.jl:45 [inlined]
[3] #84
@ ./operators.jl:1117 [inlined]
[4] filter(f::Base.var"#84#85"{typeof(ishyperthread)}, a::Vector{Int64})
@ Base ./array.jl:2484
[5] _broadcast_getindex_evalf
@ ./broadcast.jl:670 [inlined]
[6] _broadcast_getindex
@ ./broadcast.jl:643 [inlined]
[7] getindex
@ ./broadcast.jl:597 [inlined]
[8] macro expansion
@ ./broadcast.jl:961 [inlined]
[9] macro expansion
@ ./simdloop.jl:77 [inlined]
[10] copyto!
@ ./broadcast.jl:960 [inlined]
[11] copyto!
@ ./broadcast.jl:913 [inlined]
[12] copy
@ ./broadcast.jl:885 [inlined]
[13] materialize
@ ./broadcast.jl:860 [inlined]
[14] _pin_compact(nthreads::Int64; hyperthreads::Bool)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:93
[15] _pin_compact
@ /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:91 [inlined]
[16] pinthreads(strategy::Symbol; nthreads::Int64, warn::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:64
[17] pinthreads(strategy::Symbol)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:61
[18] top-level scope
@ REPL[9]:1
julia> pinthreads(:spread)
ERROR: ArgumentError: All cpuids must be ≤ Sys.CPU_THREADS-1 and ≥ 0.
Stacktrace:
[1] pinthreads(cpuids::SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true}; warn::Bool)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:38
[2] _pin_scatter
@ /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:103 [inlined]
[3] pinthreads(strategy::Symbol; nthreads::Int64, warn::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:66
[4] pinthreads(strategy::Symbol)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/o0INJ/src/pinning.jl:61
[5] top-level scope
@ REPL[10]:1
Side note, I think here I want to pin to the last threads, not the first ones, which should be reserved to the OS.
Ok, thanks for the update, I'll try my best to fix this from afar 😄
> Side note, I think here I want to pin to the last threads, not the first ones, which should be reserved to the OS.
You can always provide an AbstractVector with CPU IDs to pinthreads, e.g. pinthreads(2:49) (CPU IDs start with 0). If we want to make the automatic pinthreads(:compact) (and similarly pinthreads(:spread)) work, we would need to special-case Fugaku (by hostname) or, probably better, the CPU (by name).
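A minimal sketch of the explicit approach, for illustration only. Since the threadinfo output above shows the online CPU IDs as 0, 1, 12, ..., 59, it is probably safer to derive the list from the reported IDs than to hard-code a range. That the first two IDs are the OS-reserved cores is an assumption, and n_reserved, compute_cpuids and pin_targets are hypothetical names. The pinthreads call is commented out so the snippet runs without ThreadPinning loaded.

```julia
# CPU IDs as shown in the threadinfo output above: 0, 1, then 12 through 59.
# With ThreadPinning loaded, cpuids_all() (mentioned later in this thread)
# should provide the same list.
cpuids = vcat(0:1, 12:59)

# Assumption: the first two IDs belong to the OS-reserved cores.
n_reserved = 2
compute_cpuids = cpuids[(n_reserved + 1):end]

# Take as many IDs as there are Julia threads (48 on this node).
nthreads_wanted = 48
pin_targets = compute_cpuids[1:min(nthreads_wanted, length(compute_cpuids))]

println("pinning targets: ", first(pin_targets), " to ", last(pin_targets))
# pinthreads(pin_targets)   # requires `using ThreadPinning`
```

Slicing by position rather than by ID value keeps this working even though the IDs have a gap between 1 and 12.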
Alright, I've pushed fixes to the main branch. Please try them out. (I haven't special-cased Fugaku / the CPU yet though, so :compact will still use the first two cores.)
Ugh, now I get:
julia> threadinfo(; color=false)
ERROR: BoundsError: attempt to access 6-element Vector{Vector{Int64}} at index [7]
Stacktrace:
[1] getindex(A::Vector{Vector{Int64}}, i1::Int64)
@ Base ./array.jl:861
[2] gather_sysinfo_lscpu(lscpustr::Nothing; verbose::Bool)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/Mt6Gv/src/utility.jl:156
[3] maybe_gather_sysinfo(lscpustr::Nothing; force::Bool, verbose::Bool)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/Mt6Gv/src/utility.jl:73
[4] maybe_gather_sysinfo (repeats 2 times)
@ /data/ra000019/a04463/julia-depot/packages/ThreadPinning/Mt6Gv/src/utility.jl:71 [inlined]
[5] threadinfo(; blas::Bool, hints::Bool, color::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/Mt6Gv/src/threadinfo.jl:13
[6] top-level scope
@ REPL[6]:1
BTW, I think pinthreads(2:49) would be fine here. This is a somewhat special CPU, so it may not be worth special-casing it.
Hm, that's weird, I didn't even change that part (at least not intentionally). Can you post the output of ThreadPinning.gather_sysinfo_lscpu(;verbose=true)?
> BTW, I think pinthreads(2:49) would be fine here. This is a somewhat special CPU, so it may not be worth special-casing it.
It would be pinthreads(12:59) since, for whatever reason, the CPU threads are numbered 0,1,12,13,...,59. See your lscpu output or cpuids_all().
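To illustrate why this non-contiguous numbering is consistent with both errors above, here is a small sketch. It assumes, without having checked the package source, that the per-CPU flags were indexed by cpuid + 1. The upper-bound check against Sys.CPU_THREADS - 1 is taken from the ArgumentError message. The names position_of, flag_by_position and is_valid_cpuid are hypothetical and not part of ThreadPinning.

```julia
cpuids = vcat(0:1, 12:59)            # 50 online CPU IDs, highest is 59
ncputhreads = length(cpuids)         # 50, matching Sys.CPU_THREADS on this node
ishyperthread = falses(ncputhreads)  # one flag per online CPU thread

# Indexing by cpuid + 1 only works for IDs 0:49; ID 50 would hit index 51.
@show checkbounds(Bool, ishyperthread, 50 + 1)

# An upper-bound check rejects valid IDs such as 59 on this node.
@show 59 <= ncputhreads - 1

# Hypothetical alternative: map each CPU ID to its position in `cpuids`.
position_of = Dict(id => i for (i, id) in enumerate(cpuids))
flag_by_position(id) = ishyperthread[position_of[id]]

# Hypothetical validity check: membership instead of an upper bound.
is_valid_cpuid(id) = haskey(position_of, id)

@show flag_by_position(59)
@show is_valid_cpuid(59), is_valid_cpuid(5)
```

Looking IDs up by position avoids any assumption that they run contiguously from 0.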
julia> ThreadPinning.gather_sysinfo_lscpu(;verbose=true)
online_cpu_tblidcs = [2, 3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61]
cpuids = Any[0, 1, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
hyperthreading = false
nsockets = 1
nnuma = 6
ThreadPinning.SysInfo
├ nsockets: 1
├ nnuma: 6
├ hyperthreading: false
├ cpuids: [0, 1, 12, 13, 14, 15, 16, 17, 18, 19 … 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
├ cpuids_sockets: [[0, 1, 12, 13, 14, 15, 16, 17, 18, 19 … 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]]
├ cpuids_numa: [[0], [1], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]]
└ ishyperthread: Bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Ok, so ThreadPinning.gather_sysinfo_lscpu(;verbose=true) works just fine but threadinfo(; color=false) fails? I'm asking once more because the stacktrace suggests that the error is caused by
[2] gather_sysinfo_lscpu(lscpustr::Nothing; verbose::Bool)
@ ThreadPinning /data/ra000019/a04463/julia-depot/packages/ThreadPinning/Mt6Gv/src/utility.jl:156
@giordano I found a typo which could have caused this (though I still don't understand why a direct call succeeds). Please try the new main.
It seems to be working now with the latest changes:
julia> ThreadPinning.gather_sysinfo_lscpu(;verbose=true)
online_cpu_tblidcs = [2, 3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61]
cpuids = Any[0, 1, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
hyperthreading = false
nsockets = 1
nnuma = 6
ThreadPinning.SysInfo
├ nsockets: 1
├ nnuma: 6
├ hyperthreading: false
├ cpuids: [0, 1, 12, 13, 14, 15, 16, 17, 18, 19 … 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]
├ cpuids_sockets: [[0, 1, 12, 13, 14, 15, 16, 17, 18, 19 … 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]]
├ cpuids_numa: [[0], [1], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35], [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59]]
└ ishyperthread: Bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
julia> threadinfo(; color=false)
| _,_,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,
58,59 |
# = Julia thread, | = Socket seperator
Julia threads: 48
├ Occupied CPU-threads: 48
└ Mapping (Thread => CPUID): 1 => 12, 2 => 33, 3 => 13, 4 => 14, 5 => 16, ...
julia> pinthreads(:spread)
julia> threadinfo(; color=false)
| _,_,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,
_,_ |
# = Julia thread, | = Socket seperator
Julia threads: 48
├ Occupied CPU-threads: 46(!)
└ Mapping (Thread => CPUID): 1 => 14, 2 => 33, 3 => 12, 4 => 13, 5 => 14, ...
julia> pinthreads(:compact)
julia> threadinfo(; color=false)
| _,_,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,
_,_ |
# = Julia thread, | = Socket seperator
Julia threads: 48
├ Occupied CPU-threads: 46(!)
└ Mapping (Thread => CPUID): 1 => 14, 2 => 33, 3 => 12, 4 => 13, 5 => 14, ...
julia> threadinfo(; color=false)
| _,_,12,13,14,15,16,17,18,19,20,21,22,23,24,25,
26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,
58,59 |
# = Julia thread, | = Socket seperator
Julia threads: 48
├ Occupied CPU-threads: 48
└ Mapping (Thread => CPUID): 1 => 12, 2 => 13, 3 => 14, 4 => 15, 5 => 16, ...
Thanks!
FYI: I tagged a new release (0.4.3) with these changes.
For the record, according to the datasheet, each node has 48 compute cores + 2 or 4 assistant cores for the operating system. My guess is that here 0-1 are the assistant cores and 2-49 are the compute ones.
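The NUMA layout printed by gather_sysinfo_lscpu earlier looks consistent with that guess: CPU IDs 0 and 1 each sit in their own NUMA domain, while the other four domains hold 12 IDs each (in this node's numbering, 12 through 59). A rough sketch of separating the two groups on that basis follows. Treating single-CPU domains as assistant cores is a heuristic assumed here, not a documented rule, and the variable names are hypothetical.

```julia
# NUMA layout as printed by gather_sysinfo_lscpu above.
cpuids_numa = [[0], [1], collect(12:23), collect(24:35), collect(36:47), collect(48:59)]

# Heuristic (an assumption, not a documented rule): single-CPU NUMA domains
# hold the assistant cores; the larger domains hold the compute cores.
assistant_cpuids = reduce(vcat, filter(d -> length(d) == 1, cpuids_numa))
compute_cpuids   = reduce(vcat, filter(d -> length(d) > 1, cpuids_numa))

println("assistant: ", assistant_cpuids)
println("compute:   ", length(compute_cpuids), " CPU IDs")
```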