Open hsgg opened 3 years ago
Just to follow up, this problem does not occur if I pin VectorizationBase to version 0.12.
I think it's a Hwloc bug, but I don't know if it's supposed to work.
VectorizationBase 0.12 used CpuId.jl instead. Maybe I should use both, with CpuId serving as a backup when Hwloc doesn't work. =/
FWIW, it should look more like this:
julia> VectorizationBase.CACHE_COUNT
(18, 18, 1, 0)
julia> VectorizationBase.COUNTS
Dict{Symbol, Int64} with 19 entries:
:L3Cache => 1
:I2Cache => 0
:Package => 1
:Machine => 1
:I3Cache => 0
:PU => 36
:PCI_Device => 0
:OS_Device => 0
:Error => 0
:L2Cache => 18
:NUMANode => 0
:Bridge => 0
:L5Cache => 0
:Group => 0
:Misc => 0
:L1Cache => 18
:L4Cache => 0
:I1Cache => 0
:Core => 18
julia> VectorizationBase.TOPOLOGY
D0: L0 P0 Machine
D1: L0 P0 Package
D2: L0 P-1 L3Cache Cache{size=25952256,depth=3,linesize=64,associativity=11,type=Unified}
D3: L0 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L0 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L0 P0 Core
D6: L0 P0 PU
D6: L1 P18 PU
D3: L1 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L1 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L1 P1 Core
D6: L2 P1 PU
D6: L3 P19 PU
D3: L2 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L2 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L2 P2 Core
D6: L4 P2 PU
D6: L5 P20 PU
D3: L3 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L3 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L3 P3 Core
D6: L6 P3 PU
D6: L7 P21 PU
D3: L4 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L4 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L4 P4 Core
D6: L8 P4 PU
D6: L9 P22 PU
D3: L5 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L5 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L5 P8 Core
D6: L10 P5 PU
D6: L11 P23 PU
D3: L6 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L6 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L6 P9 Core
D6: L12 P6 PU
D6: L13 P24 PU
D3: L7 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L7 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L7 P10 Core
D6: L14 P7 PU
D6: L15 P25 PU
D3: L8 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L8 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L8 P11 Core
D6: L16 P8 PU
D6: L17 P26 PU
D3: L9 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L9 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L9 P16 Core
D6: L18 P9 PU
D6: L19 P27 PU
D3: L10 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L10 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L10 P17 Core
D6: L20 P10 PU
D6: L21 P28 PU
D3: L11 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L11 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L11 P18 Core
D6: L22 P11 PU
D6: L23 P29 PU
D3: L12 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L12 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L12 P19 Core
D6: L24 P12 PU
D6: L25 P30 PU
D3: L13 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L13 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L13 P20 Core
D6: L26 P13 PU
D6: L27 P31 PU
D3: L14 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L14 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L14 P24 Core
D6: L28 P14 PU
D6: L29 P32 PU
D3: L15 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L15 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L15 P25 Core
D6: L30 P15 PU
D6: L31 P33 PU
D3: L16 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L16 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L16 P26 Core
D6: L32 P16 PU
D6: L33 P34 PU
D3: L17 P-1 L2Cache Cache{size=1048576,depth=2,linesize=64,associativity=16,type=Unified}
D4: L17 P-1 L1Cache Cache{size=32768,depth=1,linesize=64,associativity=8,type=Data}
D5: L17 P27 Core
D6: L34 P17 PU
D6: L35 P35 PU
Seems that everything pertaining to the cache is missing. While the topology has the 8 cores, they're all nested directly inside the package rather than the caches. =/
Thanks for the response!
I made a Hwloc.jl bug report here, and have pinned VectorizationBase.jl to version 0.12 as a workaround.
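For anyone wanting to reproduce the workaround, pinning can be done through Julia's standard Pkg API. This is a minimal sketch, assuming the active environment is allowed to resolve a 0.12.x release of VectorizationBase; the thread notes that 0.12 does not support Julia 1.6, so it is only expected to apply to earlier Julia versions.

```julia
import Pkg

# Install a 0.12.x release of VectorizationBase into the active environment.
# (Assumption: the other packages in the environment are compatible with 0.12.)
Pkg.add(name = "VectorizationBase", version = "0.12")

# Pin it so that later `Pkg.update()` calls do not move it to a newer version.
Pkg.pin("VectorizationBase")
```

Running `Pkg.free("VectorizationBase")` later removes the pin once the underlying issue is resolved.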
Thanks for filing a report there, but the contributors at Hwloc.jl may forward you here: https://github.com/open-mpi/hwloc
Note that VectorizationBase 0.12 doesn't support Julia 1.6. Julia 1.6 probably won't be out until early February, so there's time to find a solution. Or, failing that, I could implement a workaround.
Thanks for the heads-up. I appreciate your help!
Forgot to update to confirm that LoopVectorization's been using 64 as a default fallback for a while now.
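The fallback idea discussed in this thread (try one detection backend, use another as a backup, and default to 64 when nothing works) can be sketched roughly as follows. This is an illustration only: `cache_linesize`, `hwloc_probe`, and `cpuid_probe` are hypothetical names, not functions from LoopVectorization, VectorizationBase, Hwloc.jl, or CpuId.jl.

```julia
# Hypothetical helper, not part of any of the packages mentioned in this thread.
# Each probe is a zero-argument function that returns an integer line size,
# or `nothing` when detection fails.
function cache_linesize(probes...; default::Int = 64)
    for probe in probes
        value = try
            probe()
        catch
            nothing  # treat a throwing probe like a failed detection
        end
        # Accept the first probe that yields a positive integer.
        value isa Integer && value > 0 && return Int(value)
    end
    return default   # no probe succeeded, so use the default line size
end

# Stand-ins for real detection backends (assumptions for illustration only):
hwloc_probe() = nothing  # mimics the L1CACHE.linesize = nothing case
cpuid_probe() = 64       # a backup that reports a value

linesize = cache_linesize(hwloc_probe, cpuid_probe)
```

Ordering the probes from most to least preferred keeps the backup out of the way on systems where the primary detection works.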
Not sure if this is a VectorizationBase.jl, LoopVectorization.jl, or Hwloc.jl bug.
L1CACHE.linesize=nothing on my system. This causes LoopVectorization.jl to fail to precompile.
My system is the WSL2, the Windows Subsystem for Linux 2 running Ubuntu-20.04. The /proc/cpuinfo appears normal (happy to post on request), CPU is Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz.
Some more info: