Open kpamnany opened 1 year ago
So my assumption is that one of the optimization passes before MV generated an intrinsic that is correct for the host platform, then MV comes in and forces the feature flags for `generic`, on which the intrinsic is illegal.
1.9.0-rc3 doesn't have my multiversioning changeset, that's only on master. Bisecting would probably be more useful here.
It could also be that the LLVM 14 upgrade for 1.9 simply doesn't generate the offending instruction anymore.
Duplicate of #47176?
Since this problem is fixed on 1.9.0 and will not be fixed for 1.8, maybe it should be closed?
I am not sure if the underlying issue is actually fixed and we just "magicked" our way around it.
Same issue (probably) seen on 1.9.0 and 1.9.1: https://github.com/BioJulia/FASTX.jl/issues/101
Yes, we ran into this on 1.9.2 so the LLVM upgrade was not the solution.
I need an MWE to actually debug this issue. Otherwise I can only speculate.
The FastX.jl issue is solved? @jakobnissen so it was a different issue?
@vchuravy yes, sorry, I should have updated here. The problem for FASTX was that I was using x86 intrinsics based on the user's CPU, but did not take into account that the user could run with `JULIA_CPU_TARGET` set or make a PackageCompiler app, such that the intrinsic was unavailable at runtime.
> Yes, we ran into this on 1.9.2 so the LLVM upgrade was not the solution.
Kiran was the RAI issue fixed by https://github.com/Drvi/ChunkedCSV.jl/pull/24?
Yes, avoiding `clmul` ducked the problem.
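For readers wondering what "avoiding `clmul`" can look like in practice: a carry-less multiply can be written in plain Julia with shifts and XORs, so no target-specific intrinsic ends up in the generated code. The sketch below is illustrative only. `clmul_portable` is a hypothetical name, and this is not the code from the ChunkedCSV.jl PR.

```julia
# Hypothetical portable fallback: carry-less (GF(2)) multiplication of two
# 64-bit values, returning the full 128-bit product. It uses only shifts and
# XORs, so it compiles for any CPU target, including `generic`.
function clmul_portable(a::UInt64, b::UInt64)
    acc = UInt128(0)
    for i in 0:63
        # If bit i of b is set, XOR in a copy of a shifted left by i.
        if (b >> i) & 0x1 == 0x1
            acc ⊻= UInt128(a) << i
        end
    end
    return acc
end

# (x^2 + 1) * (x + 1) = x^3 + x^2 + x + 1 over GF(2)
clmul_portable(UInt64(0b101), UInt64(0b11)) == 0b1111
```

This is much slower than the hardware instruction, which is the trade-off being accepted when the intrinsic is avoided.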
Okay this means it's not a Julia bug, but rather a missing capability.
Julia internally supports multi-versioning for its native code caches. Some users have expressed the need for writing target specific code.
This has repeatedly caused issues with multi-versioning since there is no way for the user to describe what target features a method requires and for multi-versioning to be pre-solved during inference.
Similarly, `have_fma` and `have_float16` are currently intrinsics to query feature flags, but rely on optimization to not run into issues.

Original issue

Encountered running `PackageCompiler`:

Full stack trace.
```julia
┌ Debug: running `/home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/julia --color=yes --startup-file=no '--cpu-target=generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)' --sysimage=/home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/lib/julia/sys.so --project=/home/kpamnany/ChunkedCSV.jl --output-o=/tmp/jl_wLGpMrS9oh.o /tmp/jl_Wnk7a1YX8J`
└ @ PackageCompiler ~/.julia/dev/PackageCompiler/src/PackageCompiler.jl:419
⢰ [02m:52s] PackageCompiler: compiling incremental system image
LLVM ERROR: Do not know how to split the result of this operator!
signal (6): Aborted
in expression starting at none:0
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
_ZN4llvm18report_fatal_errorERKNS_5TwineEb at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm18report_fatal_errorEPKcb at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16DAGTypeLegalizer17SplitVectorResultEPNS_6SDNodeEj at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16DAGTypeLegalizer3runEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm12SelectionDAG13LegalizeTypesEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel17CodeGenAndEmitDAGEv at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel20SelectAllBasicBlocksERKNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm16SelectionDAGISel20runOnMachineFunctionERNS_15MachineFunctionE.part.899 at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN12_GLOBAL__N_115X86DAGToDAGISel20runOnMachineFunctionERN4llvm15MachineFunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/kpamnany/.julia/juliaup/julia-1.8.5+0.x64.linux.gnu/bin/../lib/julia/libLLVM-13jl.so (unknown line)
⣠ [02m:52s] PackageCompiler: compiling incremental system image
operator() at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/aotcompile.cpp:580 [inlined]
jl_dump_native_impl at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/aotcompile.cpp:592
jl_write_compiler_output at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/precompile.c:94
ijl_atexit_hook at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/init.c:207
jl_repl_entrypoint at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/jlapi.c:720
main at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/cli/loader_exe.c:59
unknown function (ip: 0x7efe5b69ad8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 93901137 (Pool: 93850355; Big: 50782); GC: 63
```

First thought to be an LLVM bug: https://github.com/llvm/llvm-project/issues/62569
Turns out to be a multi-versioning problem. Setting `cpu_target` to a single platform, e.g. `"znver3"`, eliminates the problem.

This does not happen on `v1.9.0-rc3`. @vchuravy thinks that @pchintalapudi's rework of multi-versioning is what fixed this problem. Is there a specific PR that would have fixed this that we could backport maybe?
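To make the difference between the failing and the working configuration concrete: the `--cpu-target` value in the debug line of the stack trace asks for three code versions separated by `;`, while `"znver3"` asks for one. The sketch below splits such a string into its targets. `parse_cpu_target` is a hypothetical helper, not part of Julia or PackageCompiler, and treating every entry that starts with `+` or `-` as a feature toggle and everything else as a flag is a simplifying assumption.

```julia
# Hypothetical helper: split a cpu-target string into one entry per target.
# Assumption: entries starting with '+' or '-' are feature toggles and the
# remaining entries (e.g. `clone_all`, `base(1)`) are flags.
function parse_cpu_target(spec::AbstractString)
    map(split(spec, ';')) do target
        parts = split(target, ',')
        name = String(first(parts))
        rest = String.(parts[2:end])
        is_feature(p) = startswith(p, "+") || startswith(p, "-")
        (name = name,
         features = filter(is_feature, rest),
         flags = filter(!is_feature, rest))
    end
end

multi = parse_cpu_target("generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)")
single = parse_cpu_target("znver3")

for t in multi
    println(t.name, ": features=", t.features, ", flags=", t.flags)
end
println(length(multi), " targets vs. ", length(single))
```

With the multi-target string, the same code has to be legal for `generic` as well as for the newer targets, which is where the intrinsic became a problem. With a single target there is only one set of feature flags to satisfy.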