dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: Arm64: FEAT_F32MM #94024

Open a74nh opened 10 months ago

a74nh commented 10 months ago
namespace System.Runtime.Intrinsics.Arm;

/// VectorT Summary
public abstract partial class SveF32mm : AdvSimd /// Feature: FEAT_F32MM
{
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3); // FMMLA // MOVPRFX

    /// total method signatures: 1
}

/// Full API
public abstract partial class SveF32mm : AdvSimd /// Feature: FEAT_F32MM
{
    /// MatrixMultiplyAccumulate : Matrix multiply-accumulate

    /// svfloat32_t svmmla[_f32](svfloat32_t op1, svfloat32_t op2, svfloat32_t op3) : "FMMLA Ztied1.S, Zop2.S, Zop3.S" or "MOVPRFX Zresult, Zop1; FMMLA Zresult.S, Zop2.S, Zop3.S"
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3);

    /// total method signatures: 1
    /// total method names:      1
}

/// Total ACLE covered across API:      1
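
To make the intended behaviour of `MatrixMultiplyAccumulate` concrete, here is a minimal scalar sketch. It is not part of the proposal: `F32mmReference` is a hypothetical helper, and it assumes (from my reading of Arm's FMMLA description) that each 128-bit segment holds a row-major 2x2 float matrix and that the result is `op1 + op2 * transpose(op3)`, i.e. a two-way dot product per destination element. Rounding and floating-point control details are ignored.

```csharp
using System;

// Hypothetical scalar reference model, not part of the proposed API.
// Assumption: per 128-bit segment, result = op1 + op2 * transpose(op3),
// with each segment holding a row-major 2x2 float matrix.
public static class F32mmReference
{
    public static float[] MatrixMultiplyAccumulate(float[] op1, float[] op2, float[] op3)
    {
        if (op1.Length != op2.Length || op1.Length != op3.Length || op1.Length % 4 != 0)
            throw new ArgumentException("Inputs must have equal lengths that are a multiple of 4.");

        var result = new float[op1.Length];
        for (int seg = 0; seg < op1.Length; seg += 4)
        {
            for (int i = 0; i < 2; i++)
            {
                for (int j = 0; j < 2; j++)
                {
                    // Two-way dot product of row i of op2 with row j of op3.
                    float dot = op2[seg + 2 * i] * op3[seg + 2 * j]
                              + op2[seg + 2 * i + 1] * op3[seg + 2 * j + 1];
                    result[seg + 2 * i + j] = op1[seg + 2 * i + j] + dot;
                }
            }
        }
        return result;
    }
}
```

Under these assumptions, `op1 = {1, 1, 1, 1}`, `op2 = {1, 2, 3, 4}` and `op3 = {5, 6, 7, 8}` give `{18, 24, 40, 54}`.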
ghost commented 10 months ago

Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.

Issue Details
```csharp
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_F32MM
{
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3);

    /// total method signatures: 1
}

/// Full API
public abstract class Sve : AdvSimd /// Feature: FEAT_F32MM
{
    /// MatrixMultiplyAccumulate : Matrix multiply-accumulate

    /// svfloat32_t svmmla[_f32](svfloat32_t op1, svfloat32_t op2, svfloat32_t op3) : "FMMLA Ztied1.S, Zop2.S, Zop3.S" or "MOVPRFX Zresult, Zop1; FMMLA Zresult.S, Zop2.S, Zop3.S"
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3);

    /// total method signatures: 1
    /// total method names: 1
}

/// Total ACLE covered across API: 1
```
Author: a74nh
Assignees: -
Labels: `area-System.Numerics`
Milestone: -
a74nh commented 10 months ago

This contributes to https://github.com/dotnet/runtime/issues/93095

It covers all of the instructions in FEAT_F32MM. This is an optional Armv8.2 feature, but it is not yet available in any hardware.

This list was auto-generated from the C ACLE for SVE, and is in three parts:

1. The methods list reduced down to Vector<T> versions. All possible variants of T are given above the method.
2. The complete list of all methods. The corresponding ACLE methods and SVE instructions are given above the method.
3. All rejected ACLE methods. These are methods we have agreed do not need to be included in C#.

Where possible, existing C# naming conventions have been matched.

Many of the C functions include predicate argument(s) of type svbool_t as the first argument. These are omitted from the C# methods. It is expected that the JIT will create predicates where required, or combine them with uses of conditionalSelect(). For more discussion, see the comment in https://github.com/dotnet/runtime/issues/88140.
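
As a rough illustration of the select-based pattern described above, the sketch below performs an operation on all lanes and then keeps the result only where a mask is set. It uses the portable `System.Numerics.Vector` API as a stand-in, not the `Sve` class, and does not show how the JIT would actually lower this to a predicated SVE instruction. `PredicateSketch` and `AddWherePositive` are hypothetical names chosen for this example.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch using the portable Vector<T> API as a stand-in for SVE.
public static class PredicateSketch
{
    // Adds b to a only in lanes where a > 0; other lanes keep the value of a.
    public static Vector<float> AddWherePositive(Vector<float> a, Vector<float> b)
    {
        // Per-lane mask: all bits set where the comparison holds, zero otherwise.
        Vector<int> mask = Vector.GreaterThan(a, Vector<float>.Zero);

        // Compute the unpredicated result, then merge it under the mask.
        return Vector.ConditionalSelect(mask, a + b, a);
    }
}
```

The idea is that a C function taking an `svbool_t` predicate corresponds to an unpredicated C# call wrapped in a select, which the JIT is then free to fold into a single predicated instruction where one exists.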

a74nh commented 10 months ago

Updated to reflect review comments from other API proposals.

ghost commented 7 months ago

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics See info in area-owners.md if you want to be subscribed.

Issue Details
```csharp
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class SveF32mm : AdvSimd /// Feature: FEAT_F32MM
{
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3); // FMMLA // MOVPRFX

    /// total method signatures: 1
}

/// Full API
public abstract class SveF32mm : AdvSimd /// Feature: FEAT_F32MM
{
    /// MatrixMultiplyAccumulate : Matrix multiply-accumulate

    /// svfloat32_t svmmla[_f32](svfloat32_t op1, svfloat32_t op2, svfloat32_t op3) : "FMMLA Ztied1.S, Zop2.S, Zop3.S" or "MOVPRFX Zresult, Zop1; FMMLA Zresult.S, Zop2.S, Zop3.S"
    public static unsafe Vector<float> MatrixMultiplyAccumulate(Vector<float> op1, Vector<float> op2, Vector<float> op3);

    /// total method signatures: 1
    /// total method names: 1
}

/// Total ACLE covered across API: 1
```
Author: a74nh
Assignees: -
Labels: `area-System.Runtime.Intrinsics`, `untriaged`, `api-ready-for-review`
Milestone: -
a74nh commented 3 weeks ago

Updated to match implemented SVE1 methods.

a74nh commented 3 weeks ago

This feature is not yet available on any existing Arm hardware. I don't recommend implementing this for .NET 10.