jonas-schulze opened this issue 3 months ago

As of #51470 and https://github.com/JuliaMath/BFloat16s.jl/pull/51, I was hoping that Julia might natively support BF16 on the CPU. I did some smoke testing on a CPU with `avx512_bf16` support (according to `lscpu`) but observed some strange failure modes:

- a segfault (`@code_llvm A+A` below)
- a hang (`@code_llvm A*A` below)

In https://github.com/JuliaMath/BFloat16s.jl/issues/68 I tried executing some code on v1.11.0-alpha2, while below I merely try to generate the LLVM IR, but on the current `nightly` (available from `juliaup`).
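For concreteness, a minimal sketch of the commands in question (not part of the original report; assuming BFloat16s.jl v0.5.0 on a recent nightly):

```julia
using BFloat16s

A = ones(BFloat16, 10, 10)

@code_llvm A + A   # reported to segfault on an assert build (see below)
@code_llvm A * A   # reported to hang at 100% CPU on one core
```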
For what it's worth, I couldn't reproduce the issue on Nvidia Grace (ARM Neoverse V2, which has the bf16 extension) on

```
julia> versioninfo()
Julia Version 1.12.0-DEV.325
Commit e9a24d4cee4 (2024-04-10 13:11 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (aarch64-linux-gnu)
  CPU: 72 × unknown
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, neoverse-v2)
Threads: 1 default, 0 interactive, 1 GC (on 72 virtual cores)
```
I know it's a different architecture, but just to say this isn't completely broken everywhere :slightly_smiling_face:
It might be worth trying this out with assertions on, and on the latest master. Though be aware that native BFloat16 support requires LLVM 17, IIRC.
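For anyone following along, an assert build can be configured from a checkout of JuliaLang/julia via `Make.user` (a sketch using the documented `FORCE_ASSERTIONS` and `LLVM_ASSERTIONS` build flags):

```sh
# Enable assertions in the Julia runtime and in LLVM, then build.
printf 'FORCE_ASSERTIONS := 1\nLLVM_ASSERTIONS := 1\n' > Make.user
make
```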
I just tried an assert build of the current master and got a segfault:
```
$ ~/git/julia/julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.334 (2024-04-12)
 _/ |\__'_|_|_|\__'_|  |  Commit 1ae41a2c0a (0 days old master)
|__/                   |

(v0.5.0) pkg> activate --temp
  Activating new project at `/tmp/jl_8a5off`

(jl_8a5off) pkg> add BFloat16s
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `/tmp/jl_8a5off/Project.toml`
  [ab4f0b2a] + BFloat16s v0.5.0
    Updating `/tmp/jl_8a5off/Manifest.toml`
  [ab4f0b2a] + BFloat16s v0.5.0
  [56f22d72] + Artifacts v1.11.0
  [2a0f44e3] + Base64 v1.11.0
  [b77e0a4c] + InteractiveUtils v1.11.0
  [8f399da3] + Libdl v1.11.0
  [37e2e46d] + LinearAlgebra v1.11.0
  [56ddb016] + Logging v1.11.0
  [d6f4376e] + Markdown v1.11.0
  [de0858da] + Printf v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization v1.11.0
  [f489334b] + StyledStrings v1.11.0
  [8dfed614] + Test v1.11.0
  [4ec0a83e] + Unicode v1.11.0
  [e66e0078] + CompilerSupportLibraries_jll v1.1.1+0
  [4536629a] + OpenBLAS_jll v0.3.26+2
  [8e850b90] + libblastrampoline_jll v5.8.0+1
Precompiling all packages...
  1 dependency successfully precompiled in 2 seconds. 9 already precompiled.

julia> using BFloat16s

julia> A = ones(BFloat16, 10, 10);

julia> @code_llvm A+A
Segmentation fault (core dumped)
```
> Though be aware that native BFloat16 support requires LLVM 17, IIRC.

This is only for ARM. For x86 it should be LLVM 15:
> For what it's worth, I couldn't reproduce the issue on Nvidia Grace (ARM Neoverse V2, which has the bf16 extension) on
Given the link in my previous comment, the configuration you tried, @giordano, had `BFloat16s.llvm_arithmetic == false`. That is, I suspect that it did not generate `fadd bfloat` but only emulated the computations; see https://github.com/JuliaMath/BFloat16s.jl/issues/68#issuecomment-2025890696. Would you mind verifying?
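For reference, a quick way to check both things at once (a sketch; `llvm_arithmetic` is the BFloat16s.jl flag discussed above):

```julia
using BFloat16s, InteractiveUtils

# `true` means BFloat16s.jl emits native LLVM `bfloat` arithmetic;
# `false` means it emulates BFloat16 via Float32 bit manipulation.
@show BFloat16s.llvm_arithmetic

# Inspect the generated IR: look for `bfloat` operands
# versus pure i16/i32 integer code.
code_llvm(+, NTuple{2,BFloat16}; debuginfo=:none)
```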
Using the assert build on AMD EPYC 9554 I still don't see `fadd bfloat`... :slightly_frowning_face:
```
julia> a = one(BFloat16)
BFloat16(1.0)

julia> @code_llvm a+a
; Function Signature: +(Core.BFloat16, Core.BFloat16)
;  @ /home/jschulze/.julia/packages/BFloat16s/u3WQc/src/bfloat16.jl:225 within `+`
define bfloat @"julia_+_4671"(bfloat %"x::BFloat16", bfloat %"y::BFloat16") #0 {
top:
  %0 = fpext bfloat %"x::BFloat16" to float
  %1 = fpext bfloat %"y::BFloat16" to float
  %2 = fadd float %0, %1
  %3 = fptrunc float %2 to bfloat
  ret bfloat %3
}

julia> BFloat16s.llvm_arithmetic
true
```
Yeah, I mentioned yesterday on Slack that on aarch64 I don't get `bfloat` types:
```
julia> code_llvm(+, NTuple{2,BFloat16}; debuginfo=:none)
; Function Signature: +(BFloat16s.BFloat16, BFloat16s.BFloat16)
define i16 @"julia_+_6962"(i16 zeroext %"x::BFloat16", i16 zeroext %"y::BFloat16") #0 {
top:
  %0 = zext i16 %"x::BFloat16" to i32
  %1 = shl nuw i32 %0, 16
  %bitcast_coercion = bitcast i32 %1 to float
  %2 = zext i16 %"y::BFloat16" to i32
  %3 = shl nuw i32 %2, 16
  %bitcast_coercion7 = bitcast i32 %3 to float
  %4 = fadd float %bitcast_coercion, %bitcast_coercion7
  %5 = fcmp ord float %4, 0.000000e+00
  br i1 %5, label %L13, label %L30

L13:                                              ; preds = %top
  %bitcast_coercion9 = bitcast float %4 to i32
  %6 = lshr i32 %bitcast_coercion9, 16
  %7 = and i32 %6, 1
  %narrow = add nuw nsw i32 %7, 32767
  %8 = zext i32 %narrow to i64
  %9 = zext i32 %bitcast_coercion9 to i64
  %10 = add nuw nsw i64 %8, %9
  %11 = lshr i64 %10, 16
  %12 = trunc i64 %11 to i16
  br label %L30

L30:                                              ; preds = %L13, %top
  %value_phi = phi i16 [ %12, %L13 ], [ 32704, %top ]
  ret i16 %value_phi
}
```
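For readers not fluent in LLVM IR, here is a pure-Julia sketch of what this emulated path computes (assuming the usual round-to-nearest-even narrowing from Float32; `bf16_add_emulated` is a name made up for illustration):

```julia
# Emulated BFloat16 addition on raw bit patterns, mirroring the IR above.
function bf16_add_emulated(x::UInt16, y::UInt16)
    # Widen each BFloat16 to Float32 by shifting its bits into the high half.
    xf = reinterpret(Float32, UInt32(x) << 16)
    yf = reinterpret(Float32, UInt32(y) << 16)
    zf = xf + yf                       # the actual `fadd float`
    isnan(zf) && return 0x7fc0         # canonical NaN (the 32704 in the phi node)
    bits = UInt64(reinterpret(UInt32, zf))
    # Round to nearest, ties to even: add 0x7fff plus the lowest kept bit.
    bits += 0x7fff + ((bits >> 16) & 0x1)
    return UInt16((bits >> 16) & 0xffff)
end
```

For example, `bf16_add_emulated(0x3f80, 0x3f80) == 0x4000`, i.e. 1.0 + 1.0 == 2.0 in BFloat16 bit patterns.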
I think this only gets enabled on LLVM 17. We have some annoying code there because of the couple of ABI breaks that have happened.
There is an infinite recursion within LLVM between `X86TTIImpl::getShuffleCost(...)`, `X86TTIImpl::getVectorInstrCost(...)`, and `BasicTTIImplBase<T>::getShuffleCost(...)`, which leads to a stack overflow. I suspect the 100% CPU utilization (one core) was simply due to Julia trying to prepare the backtrace, since `^C` resulted in an error occurring while another error was being shown.
I briefly checked the LLVM code above, but couldn't make much sense of it (unless the `get*Overhead` functions were inlined into `BasicTTIImplBase<T>::getShuffleCost` and therefore didn't show up in the backtrace below). It may also be the case that the files I linked to are not the correct ones; I didn't fully understand the Makefile.
How should we proceed with this bug? Can I hand this off to you / one of the core developers? I'm not an LLVM expert and don't have much time to dig into this at the moment, unfortunately.
The problem persists on the current nightly, Version 1.12.0-DEV.629 (2024-05-30).
The next step would be for someone to find a standalone LLVM IR reproducer; then we can file this with upstream.
But how, if it's `@code_llvm ...` that fails, not its execution?
You would use `JULIA_LLVM_ARGS="--print-before=LoopVectorize"` to get the IR before the vectorizer runs, and then verify that `opt -vectorize ir.ll` also hangs.
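A sketch of that workflow (the exact `opt` invocation and pass name may need adjusting for your LLVM version; `ir.ll` is a placeholder file name):

```sh
# Dump the IR of the offending function before LoopVectorize runs.
# --print-before writes to stderr, so redirect it to a file.
JULIA_LLVM_ARGS="--print-before=LoopVectorize" julia -e '
    using BFloat16s, InteractiveUtils
    A = ones(BFloat16, 10, 10)
    code_llvm(devnull, *, (typeof(A), typeof(A)))
' 2> ir.ll

# Then check whether the vectorizer alone reproduces the hang
# (new pass manager syntax; older opt versions spell this differently).
opt -passes=loop-vectorize -S ir.ll -o /dev/null
```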