Open pfitzseb opened 2 years ago
Importantly, goldmont doesn't have AVX, AVX2, or (to Oscar Smith's horror) FMA.
What does
julia> unsafe_string(@ccall Base.libllvm_path().LLVMGetHostCPUName()::Cstring)
"haswell"
return for you?
julia> unsafe_string(@ccall Base.libllvm_path().LLVMGetHostCPUName()::Cstring)
"goldmont"
That may be a problem, but that comes from LLVM. Sys.CPU_NAME is basically set by https://github.com/JuliaLang/julia/blob/28f58e77f9e1e60b96c707daf1f53a071e564ae8/src/processor_x86.cpp#L701-L708. I don't know whether you go through get_host_cpu() or jl_get_cpu_name_llvm(), but apparently LLVM is also getting it wrong.
We simply don't have Alder Lake CPUs in processor_x86.cpp.
With my PR:
julia> versioninfo()
Julia Version 1.9.0-DEV.227
Commit 32b1305c78* (2022-03-21 20:33 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 16 × 12th Gen Intel(R) Core(TM) i5-12600K
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, alderlake)
Threads: 1 on 16 virtual cores
What does ccall(:jl_dump_host_cpu, Cvoid, ()) give you?
Importantly, goldmont doesn't have AVX, AVX2, or (to Oscar Smith's horror) FMA.
This really shouldn't matter....
julia> ccall(:jl_dump_host_cpu, Cvoid, ())
CPU: generic
Features: sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, waitpkg, shstk, gfni, vaes, vpclmulqdq, rdpid, movdiri, movdir64b, serialize, sahf, lzcnt, prfchw, xsaveopt, xsavec, xsaves, ptwrite
These are the name and features we pass to LLVM, and as you can see, AVX2 and FMA are both enabled. The goldmont name is just a bad guess by LLVM for the name/scheduling model to use, since we couldn't detect one.
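As a rough sketch of checking this programmatically rather than by eye, the snippet below parses a `Features:` line like the one printed above and tests for `avx2` and `fma`. The helper name `parse_features` and the shortened feature string are made up for illustration; only plain string handling from Base is used.

```julia
# Hypothetical helper: turn a "Features: a, b, c" line into a Set of names.
function parse_features(line::AbstractString)
    list = replace(line, r"^Features:\s*" => "")
    return Set(String.(strip.(split(list, ","))))
end

# Shortened copy of the line printed by jl_dump_host_cpu above.
features = parse_features("Features: sse3, pclmul, ssse3, fma, cx16, avx, f16c, bmi, avx2, bmi2")

# Exact-name lookup, so "avx" does not accidentally match "avx2".
has_avx2 = "avx2" in features
has_fma = "fma" in features
println("avx2: ", has_avx2, ", fma: ", has_fma)
```

Using a `Set` of exact names instead of a substring search avoids false positives between similarly named features such as `avx` and `avx2`.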
The only features that weren't there before and are there after the PR are avxvnni and pconfig.
pconfig detection shouldn't be changed by the PR. The features that LLVM claims the processor should have but that aren't detected here (and should have been) are
pconfig
pku
cldemote
I'm not sure why LLVM claims the processor has pku and cldemote.
I'm not sure where the pconfig comes from; maybe we are now passing the correct name to LLVM and it returns pconfig too. I missed pku, but it is there. cldemote isn't there, however.
julia> ccall(:jl_dump_host_cpu, Cvoid, ())
CPU: alderlake
Features: sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, pku, waitpkg, shstk, gfni, vaes, vpclmulqdq, rdpid, movdiri, movdir64b, serialize, pconfig, sahf, lzcnt, prfchw, xsaveopt, xsavec, xsaves, avxvnni, ptwrite
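To compare the two dumps without scanning them by hand, here is a minimal sketch that diffs the feature lists. It assumes both outputs in this thread come from the same machine, before and after the PR; the variable names and the `feature_names` helper are hypothetical.

```julia
# Feature lists copied from the two jl_dump_host_cpu outputs in this thread.
before_pr = "sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, waitpkg, shstk, gfni, vaes, vpclmulqdq, rdpid, movdiri, movdir64b, serialize, sahf, lzcnt, prfchw, xsaveopt, xsavec, xsaves, ptwrite"
after_pr = "sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, pku, waitpkg, shstk, gfni, vaes, vpclmulqdq, rdpid, movdiri, movdir64b, serialize, pconfig, sahf, lzcnt, prfchw, xsaveopt, xsavec, xsaves, avxvnni, ptwrite"

# Hypothetical helper: split a comma-separated list into trimmed names.
feature_names(s) = String.(strip.(split(s, ",")))

# Features present only after the PR, and only before it.
added = setdiff(feature_names(after_pr), feature_names(before_pr))
removed = setdiff(feature_names(before_pr), feature_names(after_pr))

println("added:   ", join(added, ", "))
println("removed: ", join(removed, ", "))
```

For the two lists above, the added features are pku, pconfig, and avxvnni, and nothing is removed.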
maybe we are now passing the correct name to llvm and it returns pconfig too
No, this printing is completely independent of LLVM. This is what we detect and pass to LLVM. LLVM feature and name detection are not involved here (they are in codegen, but not at this stage). Are you sure that pconfig and pku aren't there without your change?
Cldemote isn't there however
This is quite sad... I've actually been waiting for this, since I hope it can help with pushing large amounts of data between threads (at least I want to see if there's an effect on that)....