dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.27k stars 4.73k forks source link

Make RuntimeHelpers.IsBitwiseEquatable public #46017

Closed EgorBo closed 2 years ago

EgorBo commented 3 years ago

Background and Motivation

This proposal makes RuntimeHelpers.IsBitwiseEquatable API public.

It allows users to optimize various comparisons with super fast memcmp/memset-like APIs when input T is bitwise comparable (e.g. byte, int, some struct without gc fields and holes in its layout)

Proposed API

namespace System.Runtime.CompilerServices
{
    public static partial class RuntimeHelpers
    {
        [Intrinsic]
-       internal static bool IsBitwiseEquatable<T>()
+       public static bool IsBitwiseEquatable<T>()
        {
            // The body of this function will be replaced by the EE with unsafe code!!!
            // See getILIntrinsicImplementationForRuntimeHelpers for how this happens.
            throw new InvalidOperationException();
        }

Usage Examples

A good example is to use it in System.Linq.dll for SequnceEquals. I noticed that people still widely use it to compare arrays of primitives, e.g.:

byte[] array1 = ...;
byte[] array2 = ...;

bool equals = array1.SequenceEqual(array2); // super slow
bool equals = array1.AsSpan().SequenceEqual(array2); // ~50x faster for large arrays (e.g. int[100])

// BTW, do we have a roslyn analyzer for it?

(benchmark ^: https://gist.github.com/EgorBo/4e151a32b10d997937fd212ab842450c) I understand that we promote spans for performance but I believe we can optimize the general LINQ API almost for free in this case.

With RuntimeHelpers.IsBitwiseEquatable public we can make the following change to the LINQ's SequnceEquals to make it super fast for such bitwise comparable TSource types:

    public static bool SequenceEqual<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource>? comparer)
    {
-       if (comparer == null)
-       {
-           comparer = EqualityComparer<TSource>.Default;
-       }

        if (first == null)
        {
            ThrowHelperThrowArgumentNullException();
        }

        if (second == null)
        {
            ThrowHelperThrowArgumentNullException();
        }

+       if (RuntimeHelpers.IsReferenceOrContainsReferences<TSource>() && 
+           RuntimeHelpers.IsBitwiseEquatable<TSource>() && 
+           // ^ this call is optimized into "false" for any unknown or not suitable T (and the whole block is removed)
+           comparer == null && first is TSource[] firstArr && second is TSource[] secondArr)
+       {
+           if (firstArr.Length == secondArr.Length)
+               return false;
+
+           if (firstArr.Length == 0)
+               return true;
+
+           ref byte firstArrStart = ref Unsafe.As<TSource, byte>(ref firstArr[0]);
+           ref byte secondArrStart = ref Unsafe.As<TSource, byte>(ref firstArr[0]);
+           int length = firstArr.Length * Unsafe.SizeOf<TSource>(); // TODO: may overflow

+           return MemoryMarshal.CreateSpan(ref firstArrStart, length)
+                     .SequenceEqual(MemoryMarshal.CreateSpan(ref secondArrStart, length));
+       }
+
+       if (comparer == null)
+       {
+          comparer = EqualityComparer<TSource>.Default;
+       }
GrabYourPitchforks commented 3 years ago

I don't think we need to make IsBitwiseEquatable public to get this optimization. If the input parameters are contiguous buffers (T[] or List<T>) and the comparer is null, shouldn't we simply always defer to MemoryExtensions.SequenceEqual? Let that method figure out the most optimal way of comparing two contiguous buffers for equality; the caller shouldn't need to know anything about its implementation.

Edit: bitwise equatable-agnostic logic would look something like this, which is a slight modification on your proposal:

if (comparer is null)
{
    ROS<T> span1;
    if (first is T[] arr) span1 = arr;
    else if (first is List<T> list) span1 = CollectionsMarshal.Whatever(list);
    else goto SlowPath;

    ROS<T> span2;
    if (second is T[] arr) span2 = arr;
    else if (second is List<T> list) span2 = CollectionsMarshal.Whatever(list);
    else goto SlowPath;

    return span1.SequenceEqual(span2);
}

SlowPath:

/* existing logic */
EgorBo commented 3 years ago

@GrabYourPitchforks LINQ's SequenceEqual doesn't define any constraints on TSource in order to be able to use in MemoryExtensions.SequenceEqual (needs where T : IEquatable<T>)

GrabYourPitchforks commented 3 years ago

Then we should look into removing the constraint on MemoryExtensions.SequenceEqual for parity with the LINQ routine, or we should invest in ways to make the compiler and runtime able to call generic-constrained methods from non-generic-constrained methods. (This isn't the first time these constraints have bitten us.) I'd rather not expose an API that requires the caller to drop down to unsafe code when possibly safer alternatives are available.

GrabYourPitchforks commented 3 years ago

Aside - if we do end up making IsBitwiseEquatable public (either for this or for another reason), we should consider ways in which library authors can attribute their own types as bitwise equatable. Currently the method only special-cases a handful of types within corelib.

EgorBo commented 3 years ago

which library authors can attribute their own types as bitwise equatable

Ah I had it in the description but decided to remove. I proposed to extend the current implementation of IsBitwiseEquatable to also check !mt->IsNotTightlyPacked bit and check that Equals is not overriden like it's already done for ValueType.GetHashCode so we can mark any struct without gc fields and holes between them as bitwise equatable in case if Equals is not overriden.

Timovzl commented 3 years ago

Then we should look into removing the constraint on MemoryExtensions.SequenceEqual for parity with the LINQ routine [...]

As of .NET 6, MemoryExtensions.SequenceEqual has unconstrained overloads. 😄 That might help move this issue along.

As for piggybacking on MemoryExtensions.SequenceEqual, that is great. Besides arrays and lists, we might want to include ImmutableArray<T> as one of the types special-cased to go through MemoryExtensions.SequenceEqual. It is a great type for wannabe-constants (i.e. immutable static readonly collections), since (being a struct) it avoids extra indirections and virtual calls. Special-casing it here seems in line with its use cases and its performance-oriented nature.

Currently [IsBitwiseEquatable] only special-cases a handful of types within corelib.

A more dynamic implementation of IsBitwiseEquatable<T> would definitely help. Currently, even a custom struct wrapping just an int, with no equality or hash code overrides whatsoever, is not considered bitwise-equatable, unfortunately. (Tested in .NET 5.)

Either of the following would be great:

Timovzl commented 3 years ago

which library authors can attribute their own types as bitwise equatable

I just realized that having such an attribute might actually be far superior to other solutions.

Without the attribute, we would have to rely on Equals not being overridden. That seems unsuitable, because anyone implementing a struct and caring about performance is likely to implement IEquatable<T>. Such a person is highly likely to then override Equals(object) to match, for correctness and clarity of intent. As a result, the advantages of the solution are lost.

Rather, an attribute (only permitted on unmanaged types) would grant access to the performance improvements without obscure omissions of Equals overrides. The attribute could determine if the type is tightly packed. MemoryMarshal.AsBytes() might be useful to achieve the bitwise comparison.

stephentoub commented 2 years ago

With RuntimeHelpers.IsBitwiseEquatable public we can make the following change to the LINQ's SequnceEquals to make it super fast for such bitwise comparable TSource types:

Enumerable.SequenceEqual does always just delegate to MemoryExtensions.SequenceEqual now if we extract an array: https://github.com/dotnet/runtime/blob/c3cc9fd5d64df359c54b6d0518db75641e04bbc3/src/libraries/System.Linq/src/System/Linq/SequenceEqual.cs#L27-L30 using the unconstrained MemoryExtensions.SequenceEqual overload.

Is there anything left to do for this issue or can it be closed?

EgorBo commented 2 years ago

With RuntimeHelpers.IsBitwiseEquatable public we can make the following change to the LINQ's SequnceEquals to make it super fast for such bitwise comparable TSource types:

Enumerable.SequenceEqual does always just delegate to MemoryExtensions.SequenceEqual now if we extract an array:

https://github.com/dotnet/runtime/blob/c3cc9fd5d64df359c54b6d0518db75641e04bbc3/src/libraries/System.Linq/src/System/Linq/SequenceEqual.cs#L27-L30

using the unconstrained MemoryExtensions.SequenceEqual overload. Is there anything left to do for this issue or can it be closed?

Wow, didn't know that, at the point of creating this issue only byte was handled. Nice!