ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Compiler build of tag.2033 fails under NetBSD-current #21788

Open ci4ic4 opened 1 week ago

ci4ic4 commented 1 week ago

Zig Version

0.14.0-dev.1954+2d888a8e6

Steps to Reproduce and Observed Behavior

NetBSD-current amd64 10.99.12 from 13 October 2024, LLVM 19.1.0. Builds up to and including dev.1954 worked as expected, including a successful 'zig build test-behavior' (as long as all zig cache files were removed before running the test).

Today's version 2033 reliably fails for me when zig2 tries to build the third stage, whether I run a clean bootstrap or start from the working dev.1954.

The failing command is:

/home/xci/src/zig/build/zig2 build --prefix /home/xci/src/zig/build/stage3 --zig-lib-dir /home/xci/src/zig/lib -Dversion-string=0.14.0-dev.2033+9ffee5abe -Dtarget=native -Dcpu=native -Denable-llvm -Dconfig_h=/home/xci/src/zig/build/config.h -Dno-langref -Doptimize=ReleaseFast -Dstrip

The gdb trace I can get is as follows: ...

Core was generated by `zig2'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000078683b in codegen_llvm_Builder_Global_Index_getReplacement__13852 ()
[Current thread is 1 (process 14833)]
(gdb) bt
#0  0x000000000078683b in codegen_llvm_Builder_Global_Index_getReplacement__13852 ()
#1  0x00000000007641a5 in codegen_llvm_Builder_Global_Index_unwrap__13828 ()
#2  0x0000000000b54c59 in codegen_llvm_Builder_Global_Index_name__13832 ()
#3  0x0000000000d25f9f in codegen_llvm_Builder_Global_Index_renameAssumeCapacity__13849 ()
#4  0x0000000000907714 in codegen_llvm_Builder_Global_Index_rename__13844 ()
#5  0x000000000092ddbc in codegen_llvm_Object_updateExportedGlobal__10749 ()
#6  0x00000000006fb9bf in codegen_llvm_Object_updateExports__10747 ()
#7  0x000000000091c390 in link_Elf_updateExports__3867 ()
#8  0x00000000006fabe9 in link_File_updateExports__3785 ()
#9  0x000000000058bd14 in Zcu_PerThread_processExportsInner__8769 ()
#10 0x0000000000497315 in Zcu_PerThread_processExports__8767 ()
#11 0x00000000004a587c in Compilation_update__4113 ()
#12 0x0000000000871e76 in Compilation_updateSubCompilation__4193 ()
#13 0x00000000008b897b in Compilation_buildOutputFromZig__4194 ()
#14 0x0000000000f9e713 in Compilation_buildRt__4159 ()
#15 0x0000000000bfe614 in Thread_WaitGroup_spawnManager__anon_103715_Manager_run__64115 ()
#16 0x0000000001372131 in Thread_callFn__anon_289893__87297 ()
#17 0x0000000000f9e648 in Thread_PosixThreadImpl_spawn__anon_234648_Instance_entryFn__80665 ()
#18 0x00007d83cefea145 in ?? () from /usr/lib/libpthread.so.1
#19 0x00007d83ce8fe200 in ?? () from /usr/lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x7d83c8ffb000

I was able to build dev.2034 under Ubuntu aarch64, but I guess that is expected.

Expected Behavior

Compiler build to complete.

ci4ic4 commented 1 week ago

The plot thickens... Build 2034 completed without a problem under NetBSD-current aarch64. It fails consistently on different test rigs running the same level of NetBSD-current on amd64, though. In both cases the LLVM in use is 19.1.0.

ci4ic4 commented 1 week ago

I confirmed the problem using the latest llvm 19.1.2 from pkgsrc on the latest equivalent version of NetBSD-current and reported it on the NetBSD-current mailing list as well.

ci4ic4 commented 1 week ago

Repeating the process without the optimisations and using make instead of ninja gives a similar but more verbose trace:

Thread 1 (process 18271):
#0  codegen_llvm_Builder_Global_Index_getReplacement__13852 (a0=2863311530, a1=0x720119f19020) at /home/xci/src/zig.new/build/zig2.c:648247
#1  0x00000000007661b5 in codegen_llvm_Builder_Global_Index_unwrap__13828 (a0=2863311530, a1=0x720119f19020) at /home/xci/src/zig.new/build/zig2.c:633445
#2  0x0000000000b5645e in codegen_llvm_Builder_Global_Index_name__13832 (a0=2863311530, a1=0x720119f19020) at /home/xci/src/zig.new/build/zig2.c:973383
#3  0x0000000000d27ca9 in codegen_llvm_Builder_Global_Index_renameAssumeCapacity__13849 (a0=2863311530, a1=2147484074, a2=0x720119f19020) at /home/xci/src/zig.new/build/zig2.c:1142934
#4  0x0000000000909804 in codegen_llvm_Builder_Global_Index_rename__13844 (a0=2863311530, a1=2147484074, a2=0x720119f19020) at /home/xci/src/zig.new/build/zig2.c:785703
#5  0x000000000092fea2 in codegen_llvm_Object_updateExportedGlobal__10749 (a0=0x720119f19010, a1=0x720119fe1b80, a2=2863311530, a3=...) at /home/xci/src/zig.new/build/zig2.c:800570
#6  0x00000000006fd909 in codegen_llvm_Object_updateExports__10747 (a0=0x720119f19010, a1=..., a2=..., a3=...) at /home/xci/src/zig.new/build/zig2.c:602456
#7  0x000000000091e476 in link_Elf_updateExports__3867 (a0=0x7201206a2410, a1=..., a2=..., a3=...) at /home/xci/src/zig.new/build/zig2.c:793223
#8  0x00000000006fcb33 in link_File_updateExports__3785 (a0=0x7201206a2540, a1=..., a2=..., a3=...) at /home/xci/src/zig.new/build/zig2.c:602139
#9  0x000000000058dfc2 in Zcu_PerThread_processExportsInner__8769 (a0=..., a1=0x72011aff63f0, a2=..., a3=...) at /home/xci/src/zig.new/build/zig2.c:482417
#10 0x0000000000499515 in Zcu_PerThread_processExports__8767 (a0=...) at /home/xci/src/zig.new/build/zig2.c:409055
#11 0x00000000004a7a7c in Compilation_update__4113 (a0=0x72012069aef0, a1=...) at /home/xci/src/zig.new/build/zig2.c:413088
#12 0x0000000000873f66 in Compilation_updateSubCompilation__4193 (a0=0x7201206d55a0, a1=0x72012069aef0, a2=14 '\016', a3=...) at /home/xci/src/zig.new/build/zig2.c:739926
#13 0x00000000008baa6b in Compilation_buildOutputFromZig__4194 (a0=0x7201206d55a0, a1=..., a2=1 '\001', a3=0x7201206d5ac0, a4=14 '\016', a5=...) at /home/xci/src/zig.new/build/zig2.c:758145
#14 0x0000000000fa0817 in Compilation_buildRt__4159 (a0=0x7201206d55a0, a1=..., a2=14 '\016', a3=1 '\001', a4=0x7201206d5ac0, a5=...) at /home/xci/src/zig.new/build/zig2.c:1389996
#15 0x0000000000c00247 in Thread_WaitGroup_spawnManager__anon_103569_Manager_run__64008 (a0=0x7201206d5728, a1=...) at /home/xci/src/zig.new/build/zig2.c:1041691
#16 0x0000000001374473 in Thread_callFn__anon_289814__87190 (a0=...) at /home/xci/src/zig.new/build/zig2.c:1722598
#17 0x0000000000fa074c in Thread_PosixThreadImpl_spawn__anon_234567_Instance_entryFn__80557 (a0=0x7201207308c0) at /home/xci/src/zig.new/build/zig2.c:1389978
#18 0x0000720121006145 in ?? () from /usr/lib/libpthread.so.1
#19 0x000072012091a200 in ?? () from /usr/lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x72011affb000

There are still 6 more threads in the trace, all parked.

There is obviously a difference in the thread implementation between aarch64 and amd64, and the latter is for some reason causing the second-stage zig2 compiler to crash.
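
One detail in the unoptimised trace may be worth checking alongside the threading theory: the integer arguments passed down frames #0 to #4. The sketch below only converts the two values copied from that trace to hexadecimal; the helper name `describe` is made up for illustration. 0xAA is the byte pattern Zig uses to fill undefined memory in Debug builds. Whether that is what is happening in zig2 here is an assumption, not something this report confirms.

```python
# Argument values copied from frames #0-#4 of the unoptimised trace above.
args = {"a0": 2863311530, "a1": 2147484074}


def describe(value: int) -> str:
    """Return a 32-bit value in hex, flagging a repeating 0xAA byte pattern."""
    raw = value.to_bytes(4, "big")
    note = " (every byte is 0xAA)" if all(b == 0xAA for b in raw) else ""
    return f"0x{value:08X}{note}"


for name, value in args.items():
    print(f"{name} = {value} -> {describe(value)}")
```

If a0 (the Global.Index being renamed) really is an uninitialised value, the crash in getReplacement could be an out-of-bounds lookup rather than a threading fault, but that would need confirming against the compiler source.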