llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.24k stars 12.07k forks source link

Reserved frame pointers are broken on x86 #117178

Open brandtbucher opened 1 day ago

brandtbucher commented 1 day ago

The "frame-pointer"="reserved" attribute is useful for compiling functions that don't want the overhead of saving, setting, and restoring frame pointers in their prologue and epilogue, but still want to guarantee that any existing chain of frame pointers is kept valid (by reserving the frame pointer register). Frame-pointer-based unwinders see these functions as if they are part of the previous frame, and can unwind through them.

On x86, however, this attribute is incorrectly ignored, and rbp is incorrectly used as a scratch register. After some digging, I found that the following change appears to fix the bug:

diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 50db211c99d8..9b8652b7e302 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -563,7 +563,7 @@ BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
     Reserved.set(SubReg);

   // Set the frame-pointer register and its aliases as reserved if needed.
-  if (TFI->hasFP(MF)) {
+  if (TFI->hasFP(MF) || MF.getTarget().Options.FramePointerIsReserved(MF)) {
     if (MF.getInfo<X86MachineFunctionInfo>()->getFPClobberedByInvoke())
       MF.getContext().reportError(
           SMLoc(),

For context, in CPython's new JIT, we plan to use the feature to support unwinding through JIT frames while not introducing additional runtime overhead. Our JIT works by concatenating pre-compiled templates that tail-call into each other; because of this, the overhead of saving and restoring frame pointers is significant. However, this overhead drops to 0% if the frame pointer register is set once upon entry into JIT code, and then reserved in the templates themselves (for more info, see this CPython issue, and this comment in particular). We'd appreciate it if this were considered for backporting to LLVM 19 as a bugfix.

I'm not familiar enough with the x86 backend to know if additional changes are required or not to do this properly, but I can vouch that the above fix is sufficient for our purposes (unwinding support with 0% slowdown). llc --frame-pointer=reserved, clang -Xclang -mframe-pointer=reserved, and LLVM IR's "frame-pointer"="reserved" attribute all seem to work locally after the fix.

llvmbot commented 1 day ago

@llvm/issue-subscribers-backend-x86

Author: Brandt Bucher (brandtbucher)

The `"frame-pointer"="reserved"` attribute is useful for compiling functions that don't want the overhead of saving, setting, and restoring frame pointers in their prologue and epilogue, but still want to guarantee that any existing chain of frame pointers is kept valid (by reserving the frame pointer register). Frame-pointer-based unwinders see these functions as if they are part of the previous frame, and can unwind through them. On x86, however, this attribute is incorrectly ignored, and `rbp` is incorrectly used as a scratch register. After some digging, I found that the following change appears to fix the bug: ```diff diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp index 50db211c99d8..9b8652b7e302 100644 --- a/llvm/lib/Target/X86/X86RegisterInfo.cpp +++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp @@ -563,7 +563,7 @@ BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const { Reserved.set(SubReg); // Set the frame-pointer register and its aliases as reserved if needed. - if (TFI->hasFP(MF)) { + if (TFI->hasFP(MF) || MF.getTarget().Options.FramePointerIsReserved(MF)) { if (MF.getInfo<X86MachineFunctionInfo>()->getFPClobberedByInvoke()) MF.getContext().reportError( SMLoc(), ``` For context, in CPython's new JIT, we plan to use the feature to support unwinding through JIT frames while not introducing additional runtime overhead. Our JIT works by concatenating pre-compiled templates that tail-call into each other; because of this, the overhead of saving and restoring frame pointers is significant. However, this overhead drops to 0% if the frame pointer register is set once upon entry into JIT code, and then reserved in the templates themselves (for more info, see [this CPython issue](https://github.com/python/cpython/issues/126910), and [this comment](https://github.com/python/cpython/issues/126910#issuecomment-2488846508) in particular). We'd appreciate it if this were considered for backporting to LLVM 19 as a bugfix. I'm not familiar enough with the x86 backend to know if additional changes are required or not to do this properly, but I can vouch that the above fix is sufficient for our purposes (unwinding support with 0% slowdown). `llc --frame-pointer=reserved`, `clang -Xclang -mframe-pointer=reserved`, and LLVM IR's `"frame-pointer"="reserved"` attribute all seem to work locally after the fix.