ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Frame address is generally incorrect on windows and uefi #18662

Open truemedian opened 8 months ago

truemedian commented 8 months ago

This problem came up while I was attempting to implement debug.StackIterator for UEFI, where I must use FP-based unwinding due to the lack of DWARF information.

Frame pointer address is stored incorrectly

Consider the following function preamble from a windows binary (UEFI exhibits the same behavior):

; x86_64-windows
push rbp            ; save frame pointer
sub rsp, 0x130      ; allocate stack space
lea rbp, [rsp+0x80] ; attempt to find frame pointer?

rbp vs rsp

The first observation here is that rbp is definitely the frame pointer, as is the case for non-Windows systems. Not shown here is that nearly all stack offsets are calculated from rbp instead of rsp.

rbp offset

The second observation here is that rbp does not point to the saved frame pointer, but rather to 0x80 bytes above rsp, i.e. 0x80 bytes into the frame from the top of the stack. This offset is never greater than 0x80 bytes, even if the stack space allocated is greater than 0x80 bytes.

If the stack space allocated is less than 0x80 bytes, rbp generally points to the frame pointer as expected (there is an edge case I believe is related to other callee saved registers).

The following is an example of a preamble that leaves the frame pointer in rbp as expected:

; x86_64-windows
push rbp            ; save frame pointer
sub rsp, 0x50       ; allocate stack space
lea rbp, [rsp+0x50] ; work backwards to calculate frame pointer
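
To make the difference between the two preambles concrete, here is a minimal sketch that models them as plain integer arithmetic. It assumes the lea in the first preamble is relative to rsp, as it is in the second. The entry address and the function name are made up for illustration.

```python
# Model the effect of the prologue on rsp/rbp as integer arithmetic.
# ENTRY_RSP is an arbitrary, hypothetical stack address at function entry
# (it points at the return address pushed by `call`).
ENTRY_RSP = 0x7FFF_F000

def rbp_after_windows_prologue(entry_rsp, alloc_size, lea_offset):
    """Return (rbp, saved_rbp_slot) after:
    push rbp; sub rsp, alloc_size; lea rbp, [rsp + lea_offset]."""
    rsp = entry_rsp - 8          # push rbp: the caller's rbp now lives at [rsp]
    saved_rbp_slot = rsp
    rsp -= alloc_size            # sub rsp, alloc_size
    rbp = rsp + lea_offset       # lea rbp, [rsp + lea_offset]
    return rbp, saved_rbp_slot

for alloc, off in [(0x130, 0x80), (0x50, 0x50)]:
    rbp, slot = rbp_after_windows_prologue(ENTRY_RSP, alloc, off)
    print(f"alloc={alloc:#x} lea_offset={off:#x}: "
          f"saved rbp slot is {slot - rbp:#x} bytes above rbp")
```

With a 0x130-byte allocation and an offset of 0x80, rbp ends up 0xb0 bytes below the slot holding the caller's rbp. With a 0x50-byte allocation and a matching offset, the distance is zero and [rbp] is the saved frame pointer.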

What should be happening

First, @frameAddress needs to be updated to return the address of the frame pointer on Windows and UEFI. It currently does not.

There are two ways to fix the rbp offset problem:

  1. Set rbp to rsp immediately after pushing the old frame pointer on the stack. This is what all other systems do; it works for any stack allocation size and doesn't require any weird calculations.

  2. Continue backtracking from the top of the stack, but allow offsets greater than 0x80 bytes, so that the frame pointer is always in rbp.

For example, the following preamble is the same program, but compiled for linux, and it works as expected:

; x86_64-linux
push rbp     ; save frame pointer to stack
mov rbp, rsp ; save address of frame pointer to rbp
sub rsp, 0x30

Conclusion

On Windows and UEFI, the stack cannot be reliably unwound through frame pointers, because a single function that stores its frame pointer incorrectly is enough to break the rest of the stack trace.
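
A rough sketch of why one bad frame is enough: the model below walks a saved-rbp chain over a toy stack, assuming the Linux-style layout where [rbp] holds the caller's rbp and [rbp+8] the return address. All addresses and values are hypothetical, and the toy treats unknown slots as unreadable, whereas a real walker would usually read garbage instead.

```python
def walk_frame_pointers(memory, rbp, max_frames=16):
    """Follow the saved-rbp chain and return the return addresses found.

    Assumes [rbp] = caller's rbp and [rbp + 8] = return address.
    Stops at a zero rbp or when a needed slot is missing from `memory`.
    """
    return_addresses = []
    while rbp != 0 and len(return_addresses) < max_frames:
        if rbp not in memory or rbp + 8 not in memory:
            break  # rbp does not point at a saved-rbp slot: chain is broken
        return_addresses.append(memory[rbp + 8])
        rbp = memory[rbp]
    return return_addresses

# Hypothetical stack with three nested frames; the outermost saved rbp is 0.
stack = {
    0x1000: 0x1100, 0x1008: 0xAAAA,   # innermost frame
    0x1100: 0x1200, 0x1108: 0xBBBB,   # middle frame
    0x1200: 0x0,    0x1208: 0xCCCC,   # outermost frame
}
good = walk_frame_pointers(stack, 0x1000)

# Same stack, but the middle function left rbp 0xB0 below its saved slot,
# so the innermost frame saved that skewed value as "the caller's rbp".
broken_stack = dict(stack)
broken_stack[0x1000] = 0x1100 - 0xB0
bad = walk_frame_pointers(broken_stack, 0x1000)

print([hex(a) for a in good])
print([hex(a) for a in bad])
```

The intact chain yields all three return addresses. The skewed one yields only the innermost, since every frame beyond the bad link is unreachable.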

drew-gpf commented 8 months ago

If you want to unwind the stack on x64 PE/fastcall, you generally should be using unwind information instead of looking at frame pointers. That said, this is "intended" LLVM behavior: see https://github.com/llvm/llvm-project/issues/72908

The problem is that frameaddress might not be the right tool for the job here, so to speak. This is because FRAMEADDR on the x86 backend has a special case added on Windows X64, which is needed because of differences in stack frame layout you can see here -- namely, unlike on Linux, rbp on Windows is not the top of the function frame.

See also Microsoft's documentation: vcamd_conv_ex_5

truemedian commented 8 months ago

This not being reliable for existing x64 calling conventions makes sense, but zig's calling convention isn't subject to the same rules. Something is definitely getting very close to saving the frame pointer, it just doesn't always work.

> namely, unlike on Linux, rbp on Windows is not the top of the function frame.

I have never observed this to be the case in a zig function, it is always the first register saved, and always at the beginning of the function preamble.

At the very least there should be some clarification as to whether @frameAddress should even be allowed on Windows and UEFI if it is never reliable.

drew-gpf commented 8 months ago

> but zig's calling convention isn't subject to the same rules

It must be, or else RtlVirtualUnwind would completely fail on Zig binaries. This means that you would need a Zig-specific debugger to debug your code, for example, because most offsets encoded by unwind codes are unsigned and relative to the selected frame pointer--either explicitly as an (arbitrary) nonvolatile register (e.g. rbp) or as rsp if absent. In practice, Zig binaries emit runtime information in the .pdata section and use the same unwind codes as everybody else. This includes frame pointer behavior. It's possible the compiler is capable of violating these rules, but I've never seen it happen.

If the Zig compiler wanted to support this, I think they could instead use rbp as a nonvol in every function as the "frame pointer" but not indicate it as the actual frame pointer. This means that some functions could have two "frame pointers". This would probably need LLVM support and sounds incredibly wasteful, especially when there already exists the means to unwind x64 stacks, that is, through the exception directory.

> I have never observed this to be the case in a zig function, it is always the first register saved, and always at the beginning of the function preamble.

This is usually not true in release builds. Consider this function prologue from a -Doptimize=ReleaseSafe binary generated by 0.12.0-dev.2312+2e7d28dd0:

[image: disassembly screenshot, not preserved]

Here, even though it pushes rbp, it is not immediately used. There is also clearly no frame pointer. Further down the function we see it is a generic nonvolatile register:

[images: disassembly screenshots, not preserved]

In the debug build, however, it seems to love using rbp as a frame pointer:

[image: disassembly screenshot, not preserved]

> At the very least there should be some clarification as to whether @frameAddress should even be allowed on Windows and UEFI if it is never reliable.

This is true to an extent, although I would argue that it should be specifically warned against for the purpose of stack unwinding; again, see the LLVM GitHub issue. The only solution to stack unwinding on x64 is to examine unwind codes from the exception directory.
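
As a rough sketch of what examining unwind codes involves, the snippet below decodes one .pdata entry (RUNTIME_FUNCTION) and the header and code slots of an UNWIND_INFO record, following the field layout in Microsoft's x64 exception handling documentation. The sample bytes are hand-built for a prologue like the 0x130-byte one earlier in this thread, not taken from a real binary, and the function names are made up for illustration.

```python
import struct

UNWIND_OPS = {0: "UWOP_PUSH_NONVOL", 1: "UWOP_ALLOC_LARGE",
              2: "UWOP_ALLOC_SMALL", 3: "UWOP_SET_FPREG",
              4: "UWOP_SAVE_NONVOL", 5: "UWOP_SAVE_NONVOL_FAR",
              8: "UWOP_SAVE_XMM128", 9: "UWOP_SAVE_XMM128_FAR",
              10: "UWOP_PUSH_MACHFRAME"}
# Number of 2-byte slots per operation (UWOP_ALLOC_LARGE is handled below).
EXTRA_SLOTS = {4: 2, 5: 3, 8: 2, 9: 3}

def parse_runtime_function(entry):
    """Decode one 12-byte .pdata entry: three little-endian 32-bit RVAs."""
    begin, end, unwind_info = struct.unpack("<III", entry)
    return {"begin": begin, "end": end, "unwind_info": unwind_info}

def parse_unwind_info(data):
    """Decode the 4-byte UNWIND_INFO header and the first slot of each code."""
    ver_flags, prolog_size, code_count, frame = struct.unpack_from("<BBBB", data, 0)
    codes, i = [], 0
    while i < code_count:
        code_offset, op_byte = struct.unpack_from("<BB", data, 4 + 2 * i)
        op, info = op_byte & 0x0F, op_byte >> 4
        codes.append((code_offset, UNWIND_OPS.get(op, f"op {op}"), info))
        if op == 1:                      # UWOP_ALLOC_LARGE: 2 or 3 slots
            i += 2 if info == 0 else 3
        else:
            i += EXTRA_SLOTS.get(op, 1)
    return {
        "version": ver_flags & 0x07,
        "flags": ver_flags >> 3,
        "prolog_size": prolog_size,
        "frame_register": frame & 0x0F,        # 5 is rbp, 0 means no frame register
        "frame_offset": (frame >> 4) * 16,     # 4-bit field scaled by 16
        "codes": codes,
    }

# Hand-built example: push rbp; sub rsp, 0x130; lea rbp, [rsp+0x80]
sample = bytes([0x01, 0x10, 0x04, 0x85,   # version 1, prolog 16 bytes, 4 slots, rbp + 0x80
                0x10, 0x03,               # at offset 16: UWOP_SET_FPREG
                0x08, 0x01, 0x26, 0x00,   # at offset 8: UWOP_ALLOC_LARGE, 0x26 * 8 bytes
                0x01, 0x50])              # at offset 1: UWOP_PUSH_NONVOL rbp
info = parse_unwind_info(sample)
entry = parse_runtime_function(struct.pack("<III", 0x1000, 0x1234, 0x5000))
print(info)
print(entry)
```

Note that the frame offset is a 4-bit field scaled by 16, so the format itself cannot describe a frame register more than 240 bytes above rsp. A real unwinder would additionally need to locate .pdata through the PE exception directory and replay the codes against the register state, which this sketch does not attempt.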

truemedian commented 8 months ago

I guess that makes more sense; I never looked at the result in a non-debug context. I was hoping I would be saved from having to figure out how the .pdata section is formatted and how to use it to unwind, but it would appear it can't be avoided.

@frameAddress returning the stack pointer is still questionable though.