dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: BitwiseAtomic<T> #105054

Open timcassell opened 1 month ago

timcassell commented 1 month ago

Background and motivation

The goal is to make atomic Exchange and CompareExchange usable on any type, including custom structs of varying size. This also enables 128-bit atomics, which the runtime currently doesn't support (such as x64's cmpxchg16b), and atomic exchanges of two references at once (which currently can't be done at all in 32-bit runtimes).

If the type's size is <= the size of the target architecture's largest atomic (Compare)Exchange instruction (16 bytes for 64-bit, 8 bytes for 32-bit), it will use native atomic instructions. If the type's size is larger than that, the runtime will fall back to a spinlock.
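The spinlock fallback described above can be approximated in user code today. The following is a minimal sketch, not the proposed runtime implementation: `SpinLockedAtomic<T>` and its `BitwiseEquals` helper are hypothetical names, it is a class rather than a struct, and it assumes `T` is unmanaged and has no padding bytes.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Same shape as the LargeStruct example in this proposal: 32 bytes, no padding.
public struct LargeStruct
{
    public long num1;
    public long num2;
    public long num3;
    public long num4;
}

// Hypothetical wrapper (assumption, not part of the proposal): every
// operation takes a SpinLock because T is too large for a native atomic.
public sealed class SpinLockedAtomic<T> where T : unmanaged
{
    private T _value;
    // Must not be readonly: SpinLock is a mutable struct.
    private SpinLock _lock = new SpinLock(enableThreadOwnerTracking: false);

    public SpinLockedAtomic(T value) => _value = value;

    public T Exchange(T value)
    {
        bool taken = false;
        try
        {
            _lock.Enter(ref taken);
            T original = _value;
            _value = value;
            return original;
        }
        finally
        {
            if (taken) _lock.Exit();
        }
    }

    public T CompareExchange(T value, T comparand)
    {
        bool taken = false;
        try
        {
            _lock.Enter(ref taken);
            T original = _value;
            // Swap only when the stored bytes match the comparand's bytes.
            if (BitwiseEquals(original, comparand))
                _value = value;
            return original;
        }
        finally
        {
            if (taken) _lock.Exit();
        }
    }

    // Compares the raw bytes of two values (assumes T has no padding).
    private static bool BitwiseEquals(T a, T b)
    {
        ReadOnlySpan<byte> x = MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref a, 1));
        ReadOnlySpan<byte> y = MemoryMarshal.AsBytes(MemoryMarshal.CreateReadOnlySpan(ref b, 1));
        return x.SequenceEqual(y);
    }
}
```

As with the proposed API, `CompareExchange` returns the value that was stored before the call, so a caller can tell whether the swap happened by comparing the result with the comparand.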

In the case of a padded type, the runtime must ensure that all padded bits are zeroed before performing the atomic operation.
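To illustrate why padding matters for bitwise comparison, here is a small sketch under stated assumptions: `PaddingDemo` is a hypothetical helper, `Numbers` mirrors the struct from the usage examples, and the result depends on the platform's layout of `Numbers` (on a typical 64-bit runtime it occupies 16 bytes, 4 of them padding).

```csharp
using System;
using System.Runtime.InteropServices;

public struct Numbers
{
    public long num1;
    public int num2;
}

public static class PaddingDemo
{
    // Returns true if two values with identical fields are also bitwise
    // identical, even though the first started with every byte set to 0xFF.
    public static bool DirtyPaddingStillMatches()
    {
        var slots = new Numbers[2];

        // Simulate stale memory: every byte of slot 0, padding included, becomes 0xFF.
        MemoryMarshal.AsBytes(slots.AsSpan(0, 1)).Fill(0xFF);

        // Field stores are expected to write only the declared fields.
        slots[0].num1 = 1;
        slots[0].num2 = 2;
        slots[1].num1 = 1;
        slots[1].num2 = 2;

        ReadOnlySpan<byte> a = MemoryMarshal.AsBytes(slots.AsSpan(0, 1));
        ReadOnlySpan<byte> b = MemoryMarshal.AsBytes(slots.AsSpan(1, 1));
        return a.SequenceEqual(b);
    }
}
```

Where `Numbers` has padding, the two slots are expected to differ bitwise despite having equal fields, which is the situation the zeroing requirement is meant to rule out.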

For optimal performance, the runtime should also align the atomic's field according to its size (if possible). For example, a BitwiseAtomic<ObjectPair> should be aligned on a 16-byte boundary, as opposed to the normal 8-byte boundary in 64-bit processes.

This proposal differs from #17975, which uses IEquatable<T> for equality comparisons, whereas this proposal uses bitwise equality (so for floats, -0.0 != 0.0). Bitwise comparison can compile down to a single instruction, while that proposal requires CompareExchange loops. And because it doesn't require the IEquatable<T> interface, it works with any type.
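The bitwise semantics can be sketched with existing APIs by storing a float as its raw 32-bit pattern. `BitwiseFloatDemo` and `CompareExchangeBits` are hypothetical names used only for illustration; the calls to `Interlocked.CompareExchange` and `BitConverter` are the existing ones.

```csharp
using System;
using System.Threading;

public static class BitwiseFloatDemo
{
    // Compare-and-swap on a float kept as its raw bits in an int slot.
    // The comparison is on bit patterns, so -0.0f does not match 0.0f.
    public static float CompareExchangeBits(ref int location, float value, float comparand)
    {
        int original = Interlocked.CompareExchange(
            ref location,
            BitConverter.SingleToInt32Bits(value),
            BitConverter.SingleToInt32Bits(comparand));
        return BitConverter.Int32BitsToSingle(original);
    }
}
```

Starting from a slot holding 0.0f, a call with comparand -0.0f leaves the slot unchanged even though `-0.0f == 0.0f` is true in C#, while a call with comparand 0.0f performs the swap.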

This could also supersede #31911 since we only need 1 new type instead of several.

API Proposal

namespace System.Threading;

public struct BitwiseAtomic<T>
{
    private T _value;
    // SpinLock is only added to this struct by the runtime if sizeof(T) is too large for native atomics.
    // private SpinLock _lock;

    // True if sizeof(T) is <= the target architecture's largest atomic operation.
    [Intrinsic]
    public static bool IsNative { get; }

    // Any padded bits are zeroed.
    [Intrinsic]
    public BitwiseAtomic(T value);
    [Intrinsic]
    public T Exchange(T value);
    [Intrinsic]
    public T CompareExchange(T value, T comparand);

    // We expose separate non-atomic read/write functions so that users can use
    // volatile (with new Volatile.Read/WriteBarrier() APIs) or normal reads/writes
    // if they don't need the cost of the exchange.
    public T ReadNonAtomic();
    // Any padded bits are zeroed.
    [Intrinsic]
    public void WriteNonAtomic(T value);
}

API Usage

private struct ObjectPair<T1, T2> where T1 : class where T2 : class
{
    public T1 obj1;
    public T2 obj2;
}

private BitwiseAtomic<ObjectPair<MyType, AnotherType>> _atom;

...

private struct Bytes
{
    public byte b1;
    public byte b2;
    public byte b3;
}

// Padded byte 4 will be zeroed on 32-bit, padded bytes 4-8 will be zeroed on 64-bit.
private BitwiseAtomic<Bytes> _atom;

...

private struct Numbers
{
    public long num1;
    public int num2;
}

// No padding on 32-bit, padded bytes 13-16 will be zeroed on 64-bit.
private BitwiseAtomic<Numbers> _atom;

...

private struct LargeStruct
{
    public long num1;
    public long num2;
    public long num3;
    public long num4;
}

// The BitwiseAtomic<LargeStruct> struct adds a SpinLock.
private BitwiseAtomic<LargeStruct> _atom;

...

// With _atom declared as BitwiseAtomic<Numbers>:
// Emits cmpxchg16b instruction on x64, uses SpinLock on x86.
if (_atom.CompareExchange(new Numbers() { num1 = 1, num2 = 2 }, default) == default)
{
    ...
}
// With _atom declared as BitwiseAtomic<LargeStruct>:
// Uses SpinLock.
if (_atom.CompareExchange(new LargeStruct() { num1 = 1, num2 = 2 }, default) == default)
{
    ...
}

Alternative Designs

No response

Risks

It's a struct, and exposes non-atomic methods, so it could be torn if used improperly. It also has the same risks as the SpinLock struct.

KalleOlaviNiemitalo commented 1 month ago

> Any padded bits are zeroed.

What does that do to types like the following -- perhaps zero them entirely?

[StructLayout(LayoutKind.Explicit, Size = 8)]
struct Mystery
{
    // Fields are accessed using unsafe code and are not declared to the CLR.
}

IIRC, the C++/CLI compiler can generate types like that.

colejohnson66 commented 1 month ago

#17975?

timcassell commented 1 month ago

> > Any padded bits are zeroed.
>
> What does that do to types like the following -- perhaps zero them entirely?
>
> [StructLayout(LayoutKind.Explicit, Size = 8)]
> struct Mystery
> {
>     // Fields are accessed using unsafe code and are not declared to the CLR.
> }
>
> IIRC, the C++/CLI compiler can generate types like that.

Yeah, I don't think that will work with this API. This API is meant to be "safe", such that any padded bytes that could be random won't affect the equality of the declared fields. You would need to declare a field (or fields) of that size to make it work, or union the type in another struct.

MichalPetryka commented 1 month ago

> It's a struct

This type could possibly need to be a ref type with special GC support since 16B cmpxchg requires 16B alignment on all platforms.

timcassell commented 1 month ago

> > It's a struct
>
> This type could possibly need to be a ref type with special GC support since 16B cmpxchg requires 16B alignment on all platforms.

Is that not enforceable for this struct type? It would be unfortunate to add extra object overhead if it can be avoided.

Or do you mean this type would be specially treated the same as object references, such that it must always be aligned? If so, I can definitely get behind that.

timcassell commented 1 month ago

> #17975?

I explained in the proposal how this is different than that. Actually, that one could probably be built on top of this.

timcassell commented 1 month ago

Another benefit of this type: on architectures that don't support small atomics (like byte and short), it could widen itself to the smallest atomic instruction size, unlike Interlocked.CompareExchange(byte, byte, byte), which has to fall back to software emulation.