JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Target specific features #49653

Open kpamnany opened 1 year ago

kpamnany commented 1 year ago

Julia internally supports multi-versioning for its native code caches. Some users have expressed the need to write target-specific code.

This has repeatedly caused issues with multi-versioning, since there is no way for the user to declare which target features a method requires, and no way for multi-versioning to be resolved ahead of time during inference.

Similarly, `have_fma` and `have_float16` are currently intrinsics that query feature flags, but they rely on optimization to avoid running into issues.
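To make the `have_fma` point concrete, here is a minimal sketch of the kind of feature-gated branch being discussed. It assumes `Core.Intrinsics.have_fma` is available in the running Julia version (it is internal, so the sketch guards for its absence); `fused_or_plain` is a hypothetical name used only for illustration.

```julia
# Assumption: `Core.Intrinsics.have_fma` is an internal intrinsic and may not
# exist in every Julia version, so check for it before using it.
const HAS_FMA_QUERY = isdefined(Core.Intrinsics, :have_fma)

# Hypothetical helper: take the fused path only when the feature query says
# hardware FMA is available, otherwise fall back to a plain multiply-add.
function fused_or_plain(x::Float64, y::Float64, z::Float64)
    if HAS_FMA_QUERY && Core.Intrinsics.have_fma(Float64)
        return fma(x, y, z)   # fused multiply-add
    else
        return x * y + z      # unfused fallback
    end
end

println(fused_or_plain(2.0, 3.0, 1.0))
```

The branch only stays safe if the feature query is folded away before the code for the untaken path is lowered, which is the reliance on optimization described above.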

Original issue

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Encountered running PackageCompiler:

⣄ [02m:28s] PackageCompiler: compiling incremental system imageLLVM ERROR: Do not know how to split the result of this operator!
Full stack trace.

```julia
┌ Debug: running `/home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/julia --color=yes --startup-file=no '--cpu-target=generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)' --sysimage=/home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/lib/julia/sys.so --project=/home/kpamnany/ChunkedCSV.jl --output-o=/tmp/jl_wLGpMrS9oh.o /tmp/jl_Wnk7a1YX8J`
└ @ PackageCompiler ~/.julia/dev/PackageCompiler/src/PackageCompiler.jl:419
⢰ [02m:52s] PackageCompiler: compiling incremental system imageLLVM ERROR: Do not know how to split the result of this operator!

signal (6): Aborted
in expression starting at none:0
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZN4llvm18report_fatal_errorERKNS_5TwineEb at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm18report_fatal_errorEPKcb at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16DAGTypeLegalizer17SplitVectorResultEPNS_6SDNodeEj at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16DAGTypeLegalizer3runEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm12SelectionDAG13LegalizeTypesEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel17CodeGenAndEmitDAGEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel20SelectAllBasicBlocksERKNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel20runOnMachineFunctionERNS_15MachineFunctionE.part.899 at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN12_GLOBAL__N_115X86DAGToDAGISel20runOnMachineFunctionERN4llvm15MachineFunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
⣠ [02m:52s] PackageCompiler: compiling incremental system imageoperator() at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/aotcompile.cpp:580 [inlined]
jl_dump_native_impl at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/aotcompile.cpp:592
jl_write_compiler_output at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/precompile.c:94
ijl_atexit_hook at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/init.c:207
jl_repl_entrypoint at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/jlapi.c:720
main at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/cli/loader_exe.c:59
unknown function (ip: 0x7efe5b69ad8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 93901137 (Pool: 93850355; Big: 50782); GC: 63
```

First thought to be an LLVM bug: https://github.com/llvm/llvm-project/issues/62569

Turns out to be a multi-versioning problem. Setting cpu_target to a single platform, e.g. "znver3", eliminates the problem.
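For readers unfamiliar with the format, the `--cpu-target` string in the debug output above lists several targets separated by `;`, each followed by optional comma-separated flags (`clone_all`, `base(1)`, or `-feature` to disable a feature). The sketch below just splits that string apart to show its structure; `parse_cpu_targets` is a hypothetical helper written for illustration, not part of Julia or PackageCompiler.

```julia
# Hypothetical helper: split a multi-versioning target string into
# (name, flags) entries. Targets are separated by ';', flags by ','.
function parse_cpu_targets(spec::AbstractString)
    map(split(spec, ';')) do target
        parts = split(target, ',')
        (name = String(first(parts)), flags = String.(parts[2:end]))
    end
end

# The target string taken from the PackageCompiler debug line.
targets = parse_cpu_targets("generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)")

for t in targets
    println(rpad(t.name, 12), isempty(t.flags) ? "(no extra flags)" : join(t.flags, ", "))
end
```

A single-target string such as `znver3` yields one entry, so there is only one version of the code to generate.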

This does not happen on v1.9.0-rc3. @vchuravy thinks that @pchintalapudi's rework of multi-versioning is what fixed this problem. Is there a specific PR that would have fixed this that we could backport maybe?

vchuravy commented 1 year ago

So my assumption is that one of the optimization passes before MV generated an intrinsic that is correct for the host platform. Then MV comes in and forces the feature flags for generic, on which the intrinsic is illegal.

pchintalapudi commented 1 year ago

1.9.0-rc3 doesn't have my multiversioning changeset, that's only on master. Bisecting would probably be more useful here.

vchuravy commented 1 year ago

It could also be that the LLVM 14 upgrade for 1.9 simply no longer generates the offending instruction.

giordano commented 1 year ago

Duplicate of #47176?

kpamnany commented 1 year ago

Since this problem is fixed on 1.9.0 and will not be fixed for 1.8, maybe it should be closed?

vchuravy commented 1 year ago

I am not sure if the underlying issue is actually fixed and we just "magicked" our way around it.

jakobnissen commented 1 year ago

Same issue (probably) seen on 1.9.0 and 1.9.1: https://github.com/BioJulia/FASTX.jl/issues/101

kpamnany commented 1 year ago

Yes, we ran into this on 1.9.2 so the LLVM upgrade was not the solution.

vchuravy commented 1 year ago

I need an MWE to actually debug this issue. Otherwise I can only speculate.

@jakobnissen, the FASTX.jl issue is solved? So it was a different issue?

jakobnissen commented 1 year ago

@vchuravy yes, sorry, I should have updated here. The problem for FASTX was that I was using x86 intrinsics based on the user's CPU, but did not take into account that the user could run with `JULIA_CPU_TARGET` set or make a PackageCompiler app, such that the intrinsic was unavailable at runtime.
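As a rough illustration of the kind of guard this implies (not the actual FASTX.jl fix), a package can refuse to pick host-specific intrinsics whenever the process is not generating code for the host CPU. The sketch assumes that an unset `JULIA_CPU_TARGET` means `native`, and it reads the `--cpu-target` option through `Base.JLOptions()`, which is internal; `targets_host_cpu` and `select_kernel` are hypothetical names.

```julia
# Hypothetical helper: report whether this Julia process is generating code
# for the host CPU rather than for an explicitly chosen target.
function targets_host_cpu()
    # Environment variable mentioned above; assumed to mean "native" if unset.
    env_target = get(ENV, "JULIA_CPU_TARGET", "native")
    # --cpu-target / -C command-line option (internal API, may change).
    opt_ptr = Base.JLOptions().cpu_target
    opt_target = opt_ptr == C_NULL ? "native" : unsafe_string(opt_ptr)
    return env_target == "native" && opt_target == "native"
end

# :simd and :portable stand in for real implementations.
select_kernel() = targets_host_cpu() ? :simd : :portable

println(select_kernel())
```

This is deliberately conservative: any explicit target, even one that does support the intrinsic, falls back to the portable path, because the guard cannot express which features a method actually needs.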

vchuravy commented 9 months ago

> Yes, we ran into this on 1.9.2 so the LLVM upgrade was not the solution.

Kiran, was the RAI issue fixed by https://github.com/Drvi/ChunkedCSV.jl/pull/24?

kpamnany commented 9 months ago

Yes, avoiding clmul ducked the problem.

vchuravy commented 9 months ago

Okay, this means it's not a Julia bug, but rather a missing capability.