[Swift 5.5 dev] Double-wide atomics are currently broken on arm64 in debug builds

lorentey commented 3 years ago

The LLVM version in the latest builds of Swift 5.5 generates broken code for double-wide atomic loads on arm64 when compiling without optimizations. The first such operation that gets executed never completes, pegging the CPU at 100% indefinitely.

Information

Package version: 1.0.1
Platform version: macOS 12 beta
Swift version: Swift 5.5 (swiftlang-1300.0.29.102 clang-1300.0.28.1) from Xcode 13 beta 5

Checklist

[X] If possible, I've reproduced the issue using the main branch of this package.
[X] I've searched for existing reports of the same issue.

Steps to Reproduce

Run the tests on an arm64 machine with optimizations disabled.

$ swift --version
swift-driver version: 1.26.9 Apple Swift version 5.5 (swiftlang-1300.0.29.102 clang-1300.0.28.1)
Target: arm64-apple-macosx11.0
$ swift test

Expected behavior

The tests finish successfully.

Actual behavior

...
Test Suite 'BasicAtomicDoubleWordTests' started at 2021-09-10 19:53:24.309
Test Case '-[AtomicsTests.BasicAtomicDoubleWordTests test_compareExchange_acquiring]' started.
(never finishes)

The issue boils down to bad codegen:

$ cat hang.c
int main()
{
  _Atomic(__uint128_t) value = 0;
  return (int)value;
}
$ clang -target arm64-apple-macos11 hang.c -S
$ cat hang.S
...
    ldaxp   x9, x10, [x11]
                                        ; kill: def $x0 killed $x10
    mov x0, x9
    str x0, [sp, #8]                    ; 8-byte Folded Spill
    stlxp   w8, x9, x10, [x11]
...

Note the extra str to the stack between the ldaxp and the stlxp; IIUC, the intervening store clears the exclusive monitor, so stlxp fails unconditionally and the retry loop never exits.

lorentey commented 3 years ago

If necessary, I think we'll be able to work around this by hiding 128-bit atomics behind force-optimized opaque functions.

lorentey commented 2 years ago

This is an LLVM regression that's being addressed by https://github.com/llvm/llvm-project/commit/3a00e58c2fca0c20d3792c897ef1ea54b6a168a0.

For the Swift 5.5 toolchain, we can work around this issue by loading double-wide atomics with a compare-exchange in debug builds on arm64.
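
For illustration, here is a minimal C sketch of the "load via compare-exchange" idea. It is not the package's actual code. The names `wide_t`, `WIDE_128`, `load_via_cas` and `demo_roundtrip` are made up for this example. The sketch defaults to a 64-bit word so it links anywhere. Defining `WIDE_128` switches it to `__uint128_t`, which on some non-Apple toolchains may need extra flags or libatomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch: define WIDE_128 to exercise a 128-bit word. */
#ifdef WIDE_128
typedef __uint128_t wide_t;
#else
typedef uint64_t wide_t;
#endif

/* Hypothetical helper: read *p without issuing a plain atomic load. */
wide_t load_via_cas(_Atomic(wide_t) *p)
{
  assert(p != NULL);
  wide_t expected = 0;
  /* Success: *p was 0 and 0 is stored back, so memory is unchanged.
     Failure: nothing is stored, and the current value of *p is
     copied into `expected`. Either way `expected` ends up holding
     the value that was observed atomically. */
  atomic_compare_exchange_strong_explicit(p, &expected, expected,
                                          memory_order_acquire,
                                          memory_order_acquire);
  return expected;
}

/* Hypothetical usage: store x in a local atomic, read it back. */
wide_t demo_roundtrip(wide_t x)
{
  _Atomic(wide_t) value = x;
  return load_via_cas(&value);
}
```

The trade-off is that a compare-exchange needs write access to the location even when it only reads, so it is slower than a real load and cannot be used on read-only memory. That is presumably why the workaround is limited to debug builds on arm64.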

apple / swift-atomics