Open BruceForstall opened 6 years ago
Related: dotnet/runtime#8626
But does ARM has the necessary instruction(s) to implement this? AFAIR I once looked at this and couldn't find anything clear. And ARM codegenlegacy is also NYI.
I see that ARM32 GCC generates a call to a library function for std::atomic_compare_exchange_strong/weak
.
The right answer very well might be that this is something we won't change. In which case, we can remove these TODOs and NYI. The VM helper must do something, though, so it's a matter of "inline" that, or not.
I'm just auditing existing NYI and splitting them into separate issues.
Also CORINFO_INTRINSIC_InterlockedAdd32, CORINFO_INTRINSIC_InterlockedXAdd32, CORINFO_INTRINSIC_InterlockedXchg32 (GT_LOCKADD, GT_XADD, GT_XCHG). See impIntrinsic().
does ARM have the necessary instruction(s) to implement this?
Arm32 would use a load exclusive, store exclusive sequence similar to Arm64 base implementation.
Arm32 would use a load exclusive, store exclusive sequence similar to Arm64 base implementation.
Yes, but it seems that ARM32 has different instructions for this - LDREX
/STREX
instead of LDAXR
/STLXR
. Also, these instructions are only available starting with ARMv7 and only on some profiles: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/CJHBGBBJ.html. The lack of any acquire/release semantics mention in the documentation probably means that DMB is needed as well.
@mikedn dmb is definitely needed with ldrex/strex.
ARMv7 really introduced multiprocessor coherent ARM. So I would believe ldrex/strex were part of that. It would be reasonable to assume that if we are on a multiprocessor OS, then ldrex, strex and dmb must be present.
I would guess all realistic .NET Core targets have these instructions. We can certainly pick up capabilities from the OS like we do for arm64.
So if I understand the ARM documentation correctly the generated code should look something like:
; CompareExchange(r0, r1, r2)
dmb
LOOP:
ldrex r3, [r0]
cmp r3, r1
bne DONE
strex r4, r2, [r0]
tst r4, r4
bne LOOP
DONE:
dmb
; result in r3
Currently, ARM does not generate GT_CMPXCHG nodes; all other targets do.
From the importer:
Also, in codegenarmarch.cpp, genCodeForTreeNode:
category:cq theme:hardware-intrinsics skill-level:intermediate cost:small