Closed bwendling closed 1 year ago
If we take arch/x86/kernel/probe_roms.c
, the call to find_oprom()
takes one argument. In the cases where it's called, the parameter is the same as one from the function that's calling it. As such, Clang seems to think it can omit saving the %rdi
register, because they're supposedly going to remain the same when the function exits. But we're zeroing it out regardless.
My current hypothesis:
The function that's clearing the %edi register is static. I think that Clang looks at it and says to itself, "Hmm..Ya know, I can create pretty much any calling convention I want to with this thing, because no one will even know. MUWAHAHAHAHA!" And so even though the %edi register is caller-saved, it doesn't get saved because Clang determines either that it won't matter afterwards, or the value will be the original %edi value upon exit. If it were properly stored, then it would have been restored before return and the zeroing stuff wouldn't have been emitted.
This may be an issue with specifying the proper constraints for ASM blocks that are used. It would also explain why removing the attribute from a seemingly insignificant function can cause the failure. Here's a patch that appears to work:
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 89df6c6617f5..0937cb110e99 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -415,7 +415,7 @@ int paravirt_disable_iospace(void);
#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
/* void functions are still allowed [re]ax for scratch */
-#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
+#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), "=D" (__edi)
#define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
#define EXTRA_CLOBBERS , "r8", "r9", "r10", "r11"
That diff works for me.
Tested-by: Nathan Chancellor <nathan@kernel.org>
@nathanchance Thank you. sigh I didn't get the correct root cause...
@nathanchance I cleaned up the patch. Could you run it through another test please? I'll submit it once you're okay with it.
commit 28e6f7f0bc0b8622398dabd37d4c44a231e47522 (HEAD -> linux/upstream)
Author: Bill Wendling <morbo@google.com>
Date: Fri Sep 2 12:11:59 2022 -0700
x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
The ZERO_CALL_USED_REGS feature may zero out caller-saved registers
before returning.
In spurious_kernel_fault(), the "pte_offset_kernel()" call results in
this assembly code:
.Ltmp151:
#APP
# ALT: oldnstr
.Ltmp152:
.Ltmp153:
.Ltmp154:
.section .discard.retpoline_safe,"",@progbits
.quad .Ltmp154
.text
callq *pv_ops+536(%rip)
.Ltmp155:
.section .parainstructions,"a",@progbits
.p2align 3, 0x0
.quad .Ltmp153
.byte 67
.byte .Ltmp155-.Ltmp153
.short 1
.text
.Ltmp156:
# ALT: padding
.zero (-(((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152))>0))*((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152)),144
.Ltmp159:
.section .altinstructions,"a",@progbits
.Ltmp160:
.long .Ltmp152-.Ltmp160
.Ltmp161:
.long .Ltmp158-.Ltmp161
.short 33040
.byte .Ltmp159-.Ltmp152
.byte .Ltmp157-.Ltmp158
.text
.section .altinstr_replacement,"ax",@progbits
# ALT: replacement 1
.Ltmp158:
movq %rdi, %rax
.Ltmp157:
.text
#NO_APP
.Ltmp162:
testb $-128, %dil
The "testb" here is using %dil, but the %rdi register was cleared before
returning from "callq *pv_ops+536(%rip)". Adding the proper constraints
results in the use of a different register:
movq %r11, %rdi
# Similar to above.
testb $-128, %r11b
Link: https://github.com/KSPP/linux/issues/192
Signed-off-by: Bill Wendling <morbo@google.com>
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index f04157456a49..f79316538ddc 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
"=c" (__ecx)
#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
-/* void functions are still allowed [re]ax for scratch */
+/*
+ * void functions are still allowed [re]ax for scratch.
+ *
+ * The ZERO_CALL_USED REGS feature may end up zeroing out callee-saved
+ * registers. Make sure we model this with the appropriate clobbers.
+ */
+#ifndef CONFIG_ZERO_CALL_USED_REGS
#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
+#else
+#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
+#endif
#define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
#define EXTRA_CLOBBERS , "r8", "r9", "r10", "r11"
Could you run it through another test please?
Sure thing, it still works :)
I would swap the branches of the ifndef
and change it to ifdef
like below, I think that is a little easier to read.
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 89df6c6617f5..bc2e1b67319d 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
"=c" (__ecx)
#define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
-/* void functions are still allowed [re]ax for scratch */
+/*
+ * void functions are still allowed [re]ax for scratch.
+ *
+ * The ZERO_CALL_USED REGS feature may end up zeroing out callee-saved
+ * registers. Make sure we model this with the appropriate clobbers.
+ */
+#ifdef CONFIG_ZERO_CALL_USED_REGS
+#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
+#else
#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
+#endif
#define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
#define EXTRA_CLOBBERS , "r8", "r9", "r10", "r11"
With that:
Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org>
Done. Thank you!
Sent in a patch for review. I expect some pushback, because in the case where the trivial alternative is chosen it will have a superfluous move of RDI into another register. I don't know of quite how to remove that, since it's a direct consequence of using asm blocks instead of allowing the compiler to do its thing. In the case of alternative blocks that's a valid use, but it does result in the superfluous instructions. (note it's only if the value is used after the block. If it's dead, then it doesn't emit anything.)
Was this just fixed with: https://reviews.llvm.org/D133946 https://github.com/llvm/llvm-project/issues/57692?
No, I can still reproduce this issue even with that patch.
During the Clang implementation of
-fzero-call-used-regs
, @nathanchance came across an issue where a Clang-built kernel failed to boot. It was seen as a issue, but is probably more pernicious than we first thought.Related to #84.
Steps to replicate (by @nathanchance).
We uses Fedora for ease of installation via the serial console. @nathanchance wrote a Python script to drive QEMU specifically for testing. The latest version can be found here:
With that, the setup command can be run to initially configure the machine.
This downloads the
.iso
file in to avm
folder in the same folder ascbl_vmm.py
. Once it's finished downloading, QEMU should fire up, printing the full command it uses.Next, you should see grub. Hit '
e
' to addconsole=ttyS0
to thelinuxefi
line afterquiet
then hitCtrl-X
to boot.After a bunch of text flies across the screen, you should be prompted to either use VNC or text mode for configuration. I usually use
text mode
because it is a little simpler. The console might look a bit garbled, hittingr
thenEnter
a few times should clear things up enough to see. In text mode, I customized the time zone and user creation sections but leave the installation source, software selection, and network configuration. Don't touch the root account. Instead create a user account that's the same as yours on the host and give it a password of some sort (i.e. don't just leave it blank).The installation destination has to be run through but leave everything on the default. After the installation process finishes, the machine should reboot.
The script adds a forwarding rule for the virtual machine's SSH port (
22
) to the host (8022
) so you should be able to log into the machine like:Once you are in, you'll need to do three things to make booting self compiled kernels through QEMU possible (as opposed to generating a kernel package and installing it within the machine).
Copy the contents of
/proc/cmdline
somewhere you can use later, aside from theBOOT_IMAGE
variable.Generate an
initramfs
and copy it out of the machine.Within the VM:
Outside of the VM:
Get a list of the currently running kernel modules (it is easier to generate outside of the VM):
Once those are done, exit QEMU and go into your kernel source.
First, we'll use Fedora's kernel configuration as a base:
Next, we're going to build all the modules that we need to boot the VM into the kernel image with the '
localyesconfig
' target (I have not found a clean way to generate a Fedorainitramfs
with modules) then run 'menuconfig
' to turn onCONFIG_ZERO_CALL_USED_REGS
then finally build thebzImage
. The 'localyesconfig
' step might prompt you for some values, I usually just hit<Enter>
to choose the defaults.Once the image is done, you should be able to fire up the VM with the pieces we generated earlier (I usually do this in a different terminal/tmux tab):
That hangs for me with
CONFIG_ZERO_CALL_USED_REGS=y
. If it is disabled, the machine fully boots.The script supports
-g
, which will pass-s -S
to QEMU for easier debugging with GDB, you'll just need to make sure to enableCONFIG_GDB_SCRIPTS
during themenuconfig
stage and add thescripts_gdb
target to your make command (or just useall
in lieu ofbzImage
).