dotnet / runtime


[API Proposal]: Arm64: FEAT_SVE: loads pt 2 #97831

Open tannergooding opened 9 months ago

tannergooding commented 9 months ago

Continuing from https://github.com/dotnet/runtime/issues/94006

namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: loads
{
    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>) LoadVectorx2(Vector<T> mask, T* address); // LD2W or LD2D or LD2B or LD2H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>) LoadVectorx3(Vector<T> mask, T* address); // LD3W or LD3D or LD3B or LD3H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>, Vector<T>) LoadVectorx4(Vector<T> mask, T* address); // LD4W or LD4D or LD4B or LD4H

    /// T: byte, sbyte
    public static unsafe void Prefetch8Bit(Vector<T> mask, void* address, SvePrefetchType op); // PRFB

    /// T: short, ushort
    public static unsafe void Prefetch16Bit(Vector<T> mask, void* address, SvePrefetchType op); // PRFH

    /// T: int, uint
    public static unsafe void Prefetch32Bit(Vector<T> mask, void* address, SvePrefetchType op); // PRFW

    /// T: long, ulong
    public static unsafe void Prefetch64Bit(Vector<T> mask, void* address, SvePrefetchType op); // PRFD
}
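
For readers less familiar with the LD2 family, the following is a minimal scalar sketch of what a predicated two-vector structured load does, assuming the usual LD2 semantics: consecutive pairs in memory are de-interleaved into two vectors, and inactive lanes are zeroed without touching memory. `Ld2Model` and `Load2` are hypothetical names used only for illustration; they are not part of the proposed API, and arrays stand in for `Vector<T>` because the real vector length is decided by the hardware.

```csharp
using System;

// Hypothetical scalar model of an LD2-style load. Not a .NET API.
public static class Ld2Model
{
    // mask.Length plays the role of the hardware vector length in lanes.
    public static (T[] First, T[] Second) Load2<T>(bool[] mask, ReadOnlySpan<T> memory)
        where T : struct
    {
        int lanes = mask.Length;
        var first = new T[lanes];   // new arrays start zeroed, matching zeroing predication
        var second = new T[lanes];

        for (int lane = 0; lane < lanes; lane++)
        {
            if (!mask[lane])
            {
                continue; // inactive lane: stays zero, memory is not read
            }

            first[lane] = memory[2 * lane];      // even-indexed elements
            second[lane] = memory[2 * lane + 1]; // odd-indexed elements
        }

        return (first, second);
    }
}
```

With four active lanes over the values 1 through 8, `Ld2Model.Load2<int>(mask, values)` returns (1, 3, 5, 7) and (2, 4, 6, 8). The x3 and x4 forms would follow the same pattern with a stride of 3 or 4.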
ghost commented 9 months ago

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics See info in area-owners.md if you want to be subscribed.

Issue Details
Continuing from https://github.com/dotnet/runtime/issues/94006

```csharp
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: loads
{
    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>) LoadVectorx2(Vector<T> mask, const T *base); // LD2W or LD2D or LD2B or LD2H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>) LoadVectorx3(Vector<T> mask, const T *base); // LD3W or LD3D or LD3B or LD3H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>, Vector<T>) LoadVectorx4(Vector<T> mask, const T *base); // LD4W or LD4D or LD4B or LD4H

    public static unsafe void PrefetchBytes(Vector<T> mask, const void *base, enum SvePrefetchType op); // PRFB

    public static unsafe void PrefetchInt16(Vector<T> mask, const void *base, enum SvePrefetchType op); // PRFH

    public static unsafe void PrefetchInt32(Vector<T> mask, const void *base, enum SvePrefetchType op); // PRFW

    public static unsafe void PrefetchInt64(Vector<T> mask, const void *base, enum SvePrefetchType op); // PRFD
}
```
Author: tannergooding
Assignees: -
Labels: `area-System.Runtime.Intrinsics`, `api-ready-for-review`
Milestone: -
bartonjs commented 9 months ago

Video

namespace System.Runtime.Intrinsics.Arm;

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: loads
{
    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>) Load2xVector(Vector<T> mask, const T *address); // LD2W or LD2D or LD2B or LD2H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>) Load3xVector(Vector<T> mask, const T *address); // LD3W or LD3D or LD3B or LD3H

    /// T: float, double, sbyte, short, int, long, byte, ushort, uint, ulong
    public static unsafe (Vector<T>, Vector<T>, Vector<T>, Vector<T>) Load4xVector(Vector<T> mask, const T *address); // LD4W or LD4D or LD4B or LD4H

    /// T: byte, sbyte
    public static unsafe void Prefetch8Bit(Vector<T> mask, void *address, [ConstantExpected] SvePrefetchType prefetchType); // PRFB

    /// T: short, ushort
    public static unsafe void Prefetch16Bit(Vector<T> mask, void *address, [ConstantExpected] SvePrefetchType prefetchType); // PRFH

    /// T: int, uint
    public static unsafe void Prefetch32Bit(Vector<T> mask, void *address, [ConstantExpected] SvePrefetchType prefetchType); // PRFW

    /// T: long, ulong
    public static unsafe void Prefetch64Bit(Vector<T> mask, void *address, [ConstantExpected] SvePrefetchType prefetchType); // PRFD
}
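
To make the mask parameter on the approved load shapes more concrete, here is a hedged scalar sketch of a typical use: splitting packed RGB bytes into per-channel vectors, with the mask switching off the lanes past the end of the buffer. `StructuredLoadModel` and `LoadInterleaved` are hypothetical helpers written for this illustration, not .NET APIs, and the model assumes that inactive lanes read no memory and produce zero.

```csharp
using System;

// Hypothetical scalar model covering the x2, x3, and x4 load shapes. Not a .NET API.
public static class StructuredLoadModel
{
    // count is the number of result vectors (2, 3, or 4);
    // mask.Length stands in for the hardware vector length in lanes.
    public static T[][] LoadInterleaved<T>(bool[] mask, ReadOnlySpan<T> memory, int count)
        where T : struct
    {
        if (count < 2 || count > 4)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        int lanes = mask.Length;
        var result = new T[count][];
        for (int v = 0; v < count; v++)
        {
            result[v] = new T[lanes]; // zero-initialized
        }

        for (int lane = 0; lane < lanes; lane++)
        {
            if (!mask[lane])
            {
                continue; // inactive lane: no memory access, result stays zero
            }

            for (int v = 0; v < count; v++)
            {
                // Field v of structure number `lane` goes to vector v.
                result[v][lane] = memory[lane * count + v];
            }
        }

        return result;
    }
}
```

Given three pixels `{ 10, 20, 30, 11, 21, 31, 12, 22, 32 }`, a four-lane mask of `{ true, true, true, false }`, and `count` of 3, the red, green, and blue results are (10, 11, 12, 0), (20, 21, 22, 0), and (30, 31, 32, 0). The fourth lane is never read, so the nine-byte buffer is not overrun.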
KennethHoff commented 9 months ago

I realize this is already approved, but while watching the video I thought of the alternative below. Is there a reason it was not considered? It follows the {num1}x{num2} naming convention, although maybe {num1} is not meant to be 1 here.

namespace System.Runtime.Intrinsics.Arm;

public abstract class Sve : AdvSimd
{
-    public static unsafe (Vector<T>, Vector<T>) Load2xVector(Vector<T> mask, const T *address);
+    public static unsafe (Vector<T>, Vector<T>) LoadVector1x2(Vector<T> mask, const T *address);

-    public static unsafe (Vector<T>, Vector<T>, Vector<T>) Load3xVector(Vector<T> mask, const T *address);
+    public static unsafe (Vector<T>, Vector<T>, Vector<T>) LoadVector1x3(Vector<T> mask, const T *address);

-    public static unsafe (Vector<T>, Vector<T>, Vector<T>, Vector<T>) Load4xVector(Vector<T> mask, const T *address);
+    public static unsafe (Vector<T>, Vector<T>, Vector<T>, Vector<T>) LoadVector1x4(Vector<T> mask, const T *address);
}
tannergooding commented 9 months ago

Vector1 is too confusing due to the existence of Vector2/3/4 (number of elements) and Vector64/128/256/512 (number of bits).