32-bit x86: Alignment issues for LL byval params

kinke commented 8 years ago

User code may crash when decorating certain function parameters with an LLVM align attribute for MSVC targets. ~~It may lead to LLVM assertions in the inliner too, I don't remember exactly.~~

For 64-bit MSVC, the issue occurs for the special sret parameter. We currently work around it by not specifying the alignment for the pointer parameter, but still use the alignment when allocating the result. Not an issue anymore with recent LLVM. For 32-bit MSVC, byval pointer parameters are ~~also~~ affected (we don't use byval at all for the Win64 ABI). The resulting user code crashes (when enabling the align attribute) seem to indicate that LLVM isn't aligning the function parameters stack correctly.

Linux and OSX don't have these issues.

Pinging @majnemer. :)

kinke commented 8 years ago

Today's LLVM is able to happily compile all modules (incl. unittests) with all alignment attributes enabled, for both 32-bit and 64-bit MSVC. All release unittests even still pass on x64. But a few unittests crash on x86, namely:

std.complex (SEGFAULT)
std.datetime (SEGFAULT)
std.format (SEGFAULT)

majnemer commented 8 years ago

I don't think the MS ABI really works with aligned byval arguments. For:

struct __declspec(align(8)) S {};
void f(S);
void g() {
  f(S());
}

Clang generates:

target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc18.0.0"

%struct.S = type { [8 x i8] }

define void @"\01?g@@YAXXZ"() #0 {
entry:
  %agg.tmp = alloca %struct.S, align 8
  call void @"\01?f@@YAXUS@@@Z"(%struct.S* byval align 4 %agg.tmp)
  ret void
}

declare void @"\01?f@@YAXUS@@@Z"(%struct.S* byval align 4)

majnemer commented 8 years ago

MSVC won't compile my example. Instead, it will issue error C2719:

'unnamed-parameter': formal parameter with requested alignment of 8 won't be aligned

https://msdn.microsoft.com/en-us/library/373ak2y1.aspx

kinke commented 8 years ago

I was afraid it was that kind of limitation. We could work around that by copying the original argument at the caller's site and passing a pointer to that aligned copy (that's what we do when passing value types > 64bit by value on Win64), at least for extern(D), but then we'd have our own home-brewed Win32 ABI, incompatible with DMD...

majnemer commented 8 years ago

Would inalloca work? inalloca allows you to manually create the argument stack space using an alloca. The alloca's alignment would be appropriately aligned.

kinke commented 8 years ago

Thanks for the hint, that could work (as long as the argument stack base is appropriately aligned too). Is there a reason why LLVM doesn't directly support aligned byval params for MSVC targets (32-bit at least)? I thought LLVM tries to be quite ABI-agnostic and that it's the front-end's responsibility to control the ABI... e.g., we manually implement the System V ABI (partially) for non-Windows x86_64 targets. So if the 32-bit MSVC++ ABI doesn't support aligned byval params, I think the clang front-end shouldn't allow it, but it should still be possible for other languages/front-ends...

majnemer commented 8 years ago

@kinke What happens with GDC/DMD with the following:

  struct __declpsec(align(32)) S { char x[32]; };
  void f(S s, true);
  void g() {
    f(S(), true);
  }

I'd guess the following happens:

The incoming stack pointer gets aligned to 32-bytes.
28 bytes of padding would need to get inserted between the 'bool' and the 'S' in order to get the 'S' value aligned.

kinke commented 8 years ago

While I could figure that out (another time, just passing by before going to bed), let me first say that we're not trying to mimic the ABI of other compilers in all aspects. What you described is what I would expect from LLVM, regardless of the actual target (nit-picking: I'd expect 31 bytes of padding inbetween the single-byte bool and the struct). If the bool was passed in a register, I'd simply expect an aligned function arguments stack base/incoming stack pointer.

kinke commented 2 years ago

druntime module core.int128 introduced in v2.099 uses an align(16) struct Cent, where we're hitting this issue on 32-bit MSVC. When not reversing the args (#3873), according crashes (aligned SSE instructions…) surface for i686 Linux as well, at least with LLVM 13. So byval alignments might not work properly for 32-bit x86 in general.

p0nce commented 1 year ago

Is there a need to workaround this in intel-intrinsics? What should I avoid?

ldc-developers / ldc

32-bit x86: Alignment issues for LL byval params #1356