ClangBuiltLinux / linux

kernel built in NixOS failed to boot #1983

Open yshui opened 9 months ago

yshui commented 9 months ago

I had to apply a couple of small patches to get the kernel to build under NixOS; the main problem is described in nixos/nixpkgs#242244.

patch 1:

diff --git a/Makefile b/Makefile
index 18482f5ab5948..c97a6a2031d6d 100644
--- a/Makefile
+++ b/Makefile
@@ -997,9 +997,6 @@ NOSTDINC_FLAGS += -nostdinc
 # perform bounds checking.
 KBUILD_CFLAGS += $(call cc-option, -fstrict-flex-arrays=3)

-# disable invalid "can't wrap" optimizations for signed / pointers
-KBUILD_CFLAGS  += -fno-strict-overflow
-
 # Make sure -fstack-check isn't enabled (like gentoo apparently did)
 KBUILD_CFLAGS  += -fno-stack-check

This is needed because the NixOS clang wrapper already passes -fwrapv, which has the same effect.

patch 2:

diff --git a/Makefile b/Makefile
index c97a6a2031d6d..26d80dc099eca 100644
--- a/Makefile
+++ b/Makefile
@@ -536,6 +536,9 @@ RUSTFLAGS_KERNEL =
 AFLAGS_KERNEL  =
 LDFLAGS_vmlinux =

+LDFLAGS_MODULE += --no-dynamic-linker
+LDFLAGS_vmlinux += --no-dynamic-linker
+
 # Use USERINCLUDE when you must reference the UAPI directories only.
 USERINCLUDE    := \
                -I$(srctree)/arch/$(SRCARCH)/include/uapi \
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index f33e45ed14376..4f7f23ae75cea 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -102,7 +102,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
 AFLAGS_header.o += -I$(objtree)/$(obj)
 $(obj)/header.o: $(obj)/zoffset.h

-LDFLAGS_setup.elf      := -m elf_i386 -z noexecstack -T
+LDFLAGS_setup.elf      := --no-dynamic-linker -m elf_i386 -z noexecstack -T
 $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
        $(call if_changed,ld)

This works around nixos/nixpkgs#242244.

With these patches the kernel builds successfully, but booting it in QEMU (with -cpu max and no KVM) gives:

[    0.025048][    T1] BUG: unable to handle page fault for address: ffff9e9b28736a3b
[    0.025578][    T1] #PF: supervisor write access in kernel mode
[    0.025698][    T1] #PF: error_code(0x0002) - not-present page
[    0.025698][    T1] PGD 39e01067 P4D 39e01067 PUD 0
[    0.025698][    T1] Oops: 0002 [#1] PREEMPT SMP NOPTI
[    0.025698][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.12-lqx1 #1-NixOS
[    0.025698][    T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[    0.025698][    T1] RIP: 0010:setup_real_mode+0x10c/0x1c0
[    0.025698][    T1] Code: 8b 58 0c 48 03 1d 34 c0 55 ff 0f 32 49 89 d6 49 c1 e6 20 49 09 c6 0f 1f 44 00 00 49 81 e6 ff fb ff ff 48 8d 43 10 48 8d 4b 18 <4c> 89 73 08 48 c7 03 60 00 e0 83 48 89 05 52 c0 18 00 8b 05 34 c0
[    0.025698][    T1] RSP: 0018:ffffaa4180013b88 EFLAGS: 00010202
[    0.025698][    T1] RAX: ffff9e9b28736a43 RBX: ffff9e9b28736a33 RCX: ffff9e9b28736a4b
[    0.025698][    T1] RDX: 0000000000000000 RSI: 0000000000000022 RDI: 0000000000002267
[    0.025698][    T1] RBP: ffffaa4180013ed0 R08: 0000000000000e17 R09: ffff9e9afdc32860
[    0.025698][    T1] R10: ffff9e9afeeddcc0 R11: ffffffff85791010 R12: ffffffff852105d8
[    0.025698][    T1] R13: 0000000000000000 R14: 0000000000200901 R15: 0000000000000000
[    0.025698][    T1] FS:  0000000000000000(0000) GS:ffff9e9afdc00000(0000) knlGS:0000000000000000
[    0.025698][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.025698][    T1] CR2: ffff9e9b28736a3b CR3: 0000000039418000 CR4: 0000000000750ef0
[    0.025698][    T1] PKRU: 55555554
[    0.025698][    T1] Call Trace:
[    0.025698][    T1]  <TASK>
[    0.025698][    T1]  ? __die_body+0x68/0xb0
[    0.025698][    T1]  ? page_fault_oops+0x353/0x3d0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? do_kern_addr_fault+0x9b/0xd0
[    0.025698][    T1]  ? exc_page_fault+0x82/0x150
[    0.025698][    T1]  ? asm_exc_page_fault+0x26/0x30
[    0.025698][    T1]  ? reserve_real_mode+0xa0/0xa0
[    0.025698][    T1]  ? setup_real_mode+0x10c/0x1c0
[    0.025698][    T1]  ? setup_real_mode+0x6c/0x1c0
[    0.025698][    T1]  ? set_real_mode_permissions+0xa0/0xa0
[    0.025698][    T1]  init_real_mode+0x18/0x30
[    0.025698][    T1]  do_init_real_mode+0x13/0x20
[    0.025698][    T1]  do_one_initcall+0x120/0x340
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? printk_get_next_message+0xfd/0x3c0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? __switch_to+0x151/0x5b0
[    0.025698][    T1]  ? hrtimer_start_range_ns+0x26d/0x2f0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? finish_task_switch+0xb9/0x300
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? __schedule+0x757/0xcc0
[    0.025698][    T1]  ? __kthread_create_on_node+0xb5/0x190
[    0.025698][    T1]  ? __kmem_cache_alloc_node+0x13e/0x1f0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? schedule+0x6b/0xc0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? schedule_timeout+0x32/0x180
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? wait_for_common+0x185/0x1c0
[    0.025698][    T1]  ? tasks_rcu_exit_srcu_stall+0x90/0x90
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? enqueue_task+0x7a/0x240
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? activate_task+0x13/0x140
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? ttwu_do_activate+0x49/0xf0
[    0.025698][    T1]  ? srso_alias_return_thunk+0x5/0x7f
[    0.025698][    T1]  ? try_to_wake_up+0x308/0x4a0
[    0.025698][    T1]  do_pre_smp_initcalls+0x31/0xa0
[    0.025698][    T1]  kernel_init_freeable+0xd3/0x160
[    0.025698][    T1]  ? rest_init+0xd0/0xd0
[    0.025698][    T1]  kernel_init+0x1a/0x1a0
[    0.025698][    T1]  ret_from_fork+0x37/0x50
[    0.025698][    T1]  ? rest_init+0xd0/0xd0
[    0.025698][    T1]  ret_from_fork_asm+0x11/0x20
[    0.025698][    T1]  </TASK>
[    0.025698][    T1] Modules linked in:
[    0.025698][    T1] CR2: ffff9e9b28736a3b
[    0.025698][    T1] ---[ end trace 0000000000000000 ]---
[    0.025698][    T1] RIP: 0010:setup_real_mode+0x10c/0x1c0
[    0.025698][    T1] Code: 8b 58 0c 48 03 1d 34 c0 55 ff 0f 32 49 89 d6 49 c1 e6 20 49 09 c6 0f 1f 44 00 00 49 81 e6 ff fb ff ff 48 8d 43 10 48 8d 4b 18 <4c> 89 73 08 48 c7 03 60 00 e0 83 48 89 05 52 c0 18 00 8b 05 34 c0
[    0.025698][    T1] RSP: 0018:ffffaa4180013b88 EFLAGS: 00010202
[    0.025698][    T1] RAX: ffff9e9b28736a43 RBX: ffff9e9b28736a33 RCX: ffff9e9b28736a4b
[    0.025698][    T1] RDX: 0000000000000000 RSI: 0000000000000022 RDI: 0000000000002267
[    0.025698][    T1] RBP: ffffaa4180013ed0 R08: 0000000000000e17 R09: ffff9e9afdc32860
[    0.025698][    T1] R10: ffff9e9afeeddcc0 R11: ffffffff85791010 R12: ffffffff852105d8
[    0.025698][    T1] R13: 0000000000000000 R14: 0000000000200901 R15: 0000000000000000
[    0.025698][    T1] FS:  0000000000000000(0000) GS:ffff9e9afdc00000(0000) knlGS:0000000000000000
[    0.025698][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.025698][    T1] CR2: ffff9e9b28736a3b CR3: 0000000039418000 CR4: 0000000000750ef0
[    0.025698][    T1] PKRU: 55555554
[    0.025698][    T1] note: swapper/0[1] exited with irqs disabled
[    0.025701][    T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.026337][    T1] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
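
For anyone reading the oops above, here is a minimal sketch (not taken from the kernel sources) that decodes the architectural low bits of the x86 page-fault error code and cross-checks the faulting address. The names `PF_BITS` and `decode_pf_error` are made up for illustration, and the `mov %r14, 0x8(%rbx)` reading of the bytes after `<4c>` in the Code: line is my own decoding, so treat it as an assumption.

```python
# Architectural x86 #PF error-code bits (only the low five bits are handled).
# Each entry: (mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set in page tables", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return human-readable facts encoded in a #PF error code."""
    facts = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text is not None:
            facts.append(text)
    return facts


# error_code(0x0002) from the log.
print(decode_pf_error(0x0002))

# Assumption: the faulting store is `mov %r14, 0x8(%rbx)`, so the faulting
# address should be RBX + 8. Both values are copied from the register dump.
rbx = 0xFFFF9E9B28736A33
cr2 = 0xFFFF9E9B28736A3B
print(hex(rbx + 8), rbx + 8 == cr2)
```

The decoded bits agree with the "supervisor write access" and "not-present page" lines in the log, and RBX + 8 matches CR2, so the fault looks like a store through a bad pointer in setup_real_mode rather than a corrupted stack.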

yshui commented 9 months ago

By the way, the clang version is 17.0.6.

yshui commented 9 months ago

If I use -cpu host -enable-kvm, QEMU takes 100% CPU and there is no output on the console.

yshui commented 9 months ago

Turns out I just threw in the towel a bit too quickly. I missed one spot:

diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
index f614009d3e4e2..4b42006d9ce02 100644
--- a/arch/x86/realmode/rm/Makefile
+++ b/arch/x86/realmode/rm/Makefile
@@ -50,7 +50,7 @@ $(obj)/pasyms.h: $(REALMODE_OBJS) FORCE
 targets += realmode.lds
 $(obj)/realmode.lds: $(obj)/pasyms.h

-LDFLAGS_realmode.elf := -m elf_i386 --emit-relocs -T
+LDFLAGS_realmode.elf := --no-dynamic-linker -m elf_i386 --emit-relocs -T
 CPPFLAGS_realmode.lds += -P -C -I$(objtree)/$(obj)

 targets += realmode.elf

With this the kernel boots! I guess this is enough of a workaround while we wait for the lld bug to be fixed.

yshui commented 9 months ago

I think I know what's happening here. Without --no-dynamic-linker, lld puts a .interp section into the file in cases where ld.bfd won't.

See llvm/llvm-project#78873
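
One way to confirm whether a given build output actually contains the stray section is to list its section names. The following is a minimal sketch, assuming a well-formed ELF file and ignoring extended section numbering; `section_names` and `has_interp` are hypothetical helper names, not part of any kernel or LLVM tooling.

```python
import struct


def section_names(data):
    """Return the list of section names of the ELF image held in `data`."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64 = data[4] == 2                      # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little endian
    if is_64:
        (e_shoff,) = struct.unpack_from(endian + "Q", data, 40)
        e_shentsize, e_shnum, e_shstrndx = struct.unpack_from(endian + "HHH", data, 58)
        loc_fmt, loc_pos = endian + "QQ", 24  # sh_offset, sh_size in Elf64_Shdr
    else:
        (e_shoff,) = struct.unpack_from(endian + "I", data, 32)
        e_shentsize, e_shnum, e_shstrndx = struct.unpack_from(endian + "HHH", data, 46)
        loc_fmt, loc_pos = endian + "II", 16  # sh_offset, sh_size in Elf32_Shdr

    def header(index):
        # Read sh_name plus the file location of one section header entry.
        base = e_shoff + index * e_shentsize
        (sh_name,) = struct.unpack_from(endian + "I", data, base)
        sh_offset, sh_size = struct.unpack_from(loc_fmt, data, base + loc_pos)
        return sh_name, sh_offset, sh_size

    if e_shnum == 0:
        return []
    # The section-name string table is itself a section, at index e_shstrndx.
    _, str_off, str_size = header(e_shstrndx)
    strtab = data[str_off:str_off + str_size]
    names = []
    for i in range(e_shnum):
        sh_name, _, _ = header(i)
        stop = strtab.index(b"\0", sh_name)   # names are NUL-terminated
        names.append(strtab[sh_name:stop].decode())
    return names


def has_interp(data):
    """True if the ELF image contains a section named .interp."""
    return ".interp" in section_names(data)
```

Calling `has_interp(open(path, "rb").read())` on realmode.elf or setup.elf, built with and without --no-dynamic-linker, should show the difference; `readelf -S` gives the same information if binutils is at hand.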

yshui commented 9 months ago

OK, the proposal to change lld's behavior was rejected, so the remaining option is to add --no-dynamic-linker to the kernel's LDFLAGS.

nickdesaulniers commented 9 months ago

-KBUILD_CFLAGS += -fno-strict-overflow

yikes! Don't do that!

i feel i might know what's happening here. without --no-dynamic-linker lld puts a .interp into the file even when the linker script doesn't include one; ld.bfd won't.

There's also the option to add .interp to the DISCARDS sections of the existing linker scripts. That's a pretty clean solution IMO.

yshui commented 9 months ago

yikes! Don't do that!

Nix puts -fwrapv in CFLAGS automatically, which makes clang report -fno-strict-overflow as unused. Anyway, this patch isn't relevant here :sweat_smile:

add .interp to the DISCARDS

good idea! how many linker scripts do we need to change?

hmm, looks like each arch has its own linker script...

yshui commented 9 months ago

@nickdesaulniers OK, how can I move this forward? Do I submit a patch to the kernel mailing list?

I am kind of worried that if I explain this is needed because NixOS is designed weirdly and needs to add some stupid linker flags when building anything, Linus is going to be very mad.

nathanchance commented 9 months ago

@yshui Yes, you could submit a patch upstream for this. There is precedent for adding flags to work around the behavior of distribution toolchains, such as -fno-PIE, which was added because distribution toolchains default to PIE.

yshui commented 9 months ago

Submitted https://lore.kernel.org/llvm/20240208012057.2754421-2-yshuiv7@gmail.com/T/#u

:crossed_fingers: