llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.78k stars 11.9k forks source link

X86: stack realignment, dynamic allocas, and inline assembly cause conflict over ebx / esi #17204

Closed rnk closed 2 months ago

rnk commented 11 years ago
Bugzilla Link 16830
Version trunk
OS Windows NT
Blocks llvm/llvm-project#21794 llvm/llvm-project#24719 llvm/llvm-project#13712
CC @asl,@efriedma-quic,@duck-37

Extended Description

So far as I can tell, the only register you can't touch in MS inline asm is ebp, but LLVM's x86 backend requires a BasePtr register which is separate from the frame pointer in ebp. It happens to hard code the choice of esi in X86RegisterInfo.cpp: // Use a callee-saved register as the base pointer. These registers must // not conflict with any ABI requirements. For example, in 32-bit mode PIC // requires GOT in the EBX register before function calls via PLT GOT pointer. BasePtr = Is64Bit ? X86::RBX : X86::ESI;

This will blow up if the inline asm clobbers esi.

Test case straight from LLVM's own lib/Support/Host.cpp:

bool GetX86CpuIDAndInfo(unsigned value, unsigned rEAX, unsigned rEBX, unsigned rECX, unsigned rEDX) { __asm { mov eax,value cpuid mov esi,rEAX mov dword ptr [esi],eax mov esi,rEBX mov dword ptr [esi],ebx mov esi,rECX mov dword ptr [esi],ecx mov esi,rEDX mov dword ptr [esi],edx } return false; }

This generates x86 asm like:

$ clang -cc1 -fms-compatibility t.cpp -o - -cxx-abi microsoft -S ... "?GetX86CpuIDAndInfo@@YA_NIPAI000@Z": pushl %ebp movl %esp, %ebp pushl %ebx pushl %edi pushl %esi subl $28, %esp movl %esp, %esi ...

APP

    .intel_syntax
    mov eax,dword ptr [esi + 24]
    cpuid
    mov esi,dword ptr [esi + 20]
    mov dword ptr [esi],eax
    mov esi,dword ptr [esi + 16]
    mov dword ptr [esi],ebx
    mov esi,dword ptr [esi + 12]
    mov dword ptr [esi],ecx
    mov esi,dword ptr [esi + 8]
    mov dword ptr [esi],edx
    .att_syntax
    #NO_APP
efriedma-quic commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#51100

rnk commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#30389

rnk commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#27183

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-project#24719

rnk commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#22068

stephenhines commented 2 years ago

mentioned in issue llvm/llvm-project#21794

rnk commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#17907

rnk commented 3 years ago

Sure, it's easy to penalize access to overaligned locals, but I think they tend to be performance critical, so IMO it's better to penalize accesses to arguments. I assume the base pointer work was specifically done to address this issue.

efriedma-quic commented 3 years ago

If we're going to move the alignment gap so it isn't between the frame pointer and the stack pointer, we might as well just construct a single virtual register that pointers to the argument area, instead of using a separate virtual register for each argument.

I suspect adding an indirection to access overaligned locals is actually the easiest approach for most targets, at least ones that don't need overaligned spill slots. We already have LocalStackSlotAllocation, which does a similar transform.

rnk commented 3 years ago

To recap, there are three stack areas that the compiler may need to reference:

  1. incoming parameters and fixed stack objects (return addr)
  2. local variable stack objects (potentially highly aligned)
  3. outgoing arguments

Overaligned variables make the offset from 1 to 2 dynamic, and allocas make the offset from 2 to 3 dynamic. LLVM's strategy today is to dedicate a physical register to all areas when needed.

Fixing this issue fully requires adding indirection to accesses to one of these areas. IMO, area #​1, arguments, is the best option, especially given other LLVM design choices. To implement that, we should:

efriedma-quic commented 3 years ago

I looked at this briefly since a duplicate was filed, bug 51100.

Apparently, gcc doesn't use a base pointer at all. It uses different techniques on different targets; on x86, it emits an exotic prologue that actually stores the frame pointer below the realignment gap. On ARM, gcc apparently just never realigns the stack at all.

efriedma-quic commented 3 years ago

Bug llvm/llvm-bugzilla-archive#51100 has been marked as a duplicate of this bug.

rnk commented 8 years ago

Bug llvm/llvm-bugzilla-archive#30389 has been marked as a duplicate of this bug.

rnk commented 8 years ago

Bug llvm/llvm-bugzilla-archive#27183 has been marked as a duplicate of this bug.

rnk commented 9 years ago

Re-titling since this is generic across OSs and asm style.

rnk commented 9 years ago

I think we can safely use ebx on win32 - PIC is implemented w/o GOT there, so we can easily use ebx.

That's just a workaround, though. We will still conflict with inline asm using ebx on Windows (rare, due to MSVC's use of ebx) and esi on Linux (not uncommon due to string instructions).

I think we need to reformulate the problem, rather than trying to pick an arbitrary hardcoded register that works for all situations. Stack objects (allocas, spill slots) with low alignment can be accessed via ebp, and stack objects with large alignment requirements can be accessed via a virtual register, which can be spilled via ebp. Then we can let the register allocator solve the problem.

asl commented 9 years ago

I think we can safely use ebx on win32 - PIC is implemented w/o GOT there, so we can easily use ebx.

rnk commented 9 years ago

Bug llvm/llvm-bugzilla-archive#22068 has been marked as a duplicate of this bug.

rnk commented 10 years ago

MSVC uses ebx instead of esi, for roughly the same reasons, although the roles are swapped. ebx points to the incoming arguments, and ebp points to the local variables. ebx is callee saved in most CCs, so it can be used across the body of the function.

MSVC emits a warning when modifying ebp, and ebx if it had to use it due to stack realignment:

int main() { declspec(align(8)) int a; asm { push ebp mov ebx, esp and esp, -16 mov a, edi mov esp, ebx pop ebp } return a; }

t.cpp(5) : warning C4731: 'main' : frame pointer register 'ebx' modified by inline assembly code t.cpp(9) : warning C4731: 'main' : frame pointer register 'ebp' modified by inline assembly code

If you remove the need for stack realignment, only the ebp warning remains. You can modify ebp without warning, but only if you enable optimizations and the FP can be eliminated.

We could teach clang to emit a diagnostic like this.