dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: Arm64: FEAT_SVE: fp #94005

Open a74nh opened 9 months ago

a74nh commented 9 months ago
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: fp
{

  /// T: float, double
  public static unsafe Vector<T> AddRotateComplex(Vector<T> op1, Vector<T> op2, ulong imm_rotation); // FCADD // predicated, MOVPRFX

  /// T: float, double
  public static unsafe T AddSequentialAcross(T initial, Vector<T> op); // FADDA // predicated

  /// T: [double, float], [double, int], [double, long], [double, uint], [double, ulong]
  public static unsafe Vector<T> ConvertToDouble(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [int, float], [int, double]
  public static unsafe Vector<T> ConvertToInt32(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [long, float], [long, double]
  public static unsafe Vector<T> ConvertToInt64(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [float, double], [float, int], [float, long], [float, uint], [float, ulong]
  public static unsafe Vector<T> ConvertToSingle(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [uint, float], [uint, double]
  public static unsafe Vector<T> ConvertToUInt32(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [ulong, float], [ulong, double]
  public static unsafe Vector<T> ConvertToUInt64(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> FloatingPointExponentialAccelerator(Vector<T2> value); // FEXPA

  /// T: float, double
  public static unsafe Vector<T> MultiplyAddRotateComplex(Vector<T> op1, Vector<T> op2, Vector<T> op3, ulong imm_rotation); // FCMLA // predicated, MOVPRFX

  public static unsafe Vector<float> MultiplyAddRotateComplex(Vector<float> op1, Vector<float> op2, Vector<float> op3, ulong imm_index, ulong imm_rotation); // FCMLA // MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalEstimate(Vector<T> value); // FRECPE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalExponent(Vector<T> value); // FRECPX // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtEstimate(Vector<T> value); // FRSQRTE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtStep(Vector<T> left, Vector<T> right); // FRSQRTS

  /// T: float, double
  public static unsafe Vector<T> ReciprocalStep(Vector<T> left, Vector<T> right); // FRECPS

  /// T: float, double
  public static unsafe Vector<T> RoundAwayFromZero(Vector<T> value); // FRINTA // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNearest(Vector<T> value); // FRINTN // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNegativeInfinity(Vector<T> value); // FRINTM // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToPositiveInfinity(Vector<T> value); // FRINTP // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToZero(Vector<T> value); // FRINTZ // predicated, MOVPRFX

  /// T: [float, int], [double, long]
  public static unsafe Vector<T> Scale(Vector<T> left, Vector<T2> right); // FSCALE // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> Sqrt(Vector<T> value); // FSQRT // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> TrigonometricMultiplyAddCoefficient(Vector<T> op1, Vector<T> op2, ulong imm3); // FTMAD // MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricSelectCoefficient(Vector<T> left, Vector<T2> right); // FTSSEL

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricStartingValue(Vector<T> left, Vector<T2> right); // FTSMUL

  /// total method signatures: 26

  /// Optional Entries:

  public static unsafe Vector<float> Scale(Vector<float> left, int right); // FSCALE // predicated, MOVPRFX

  public static unsafe Vector<double> Scale(Vector<double> left, long right); // FSCALE // predicated, MOVPRFX

  /// total optional method signatures: 2

}
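The estimate and step methods in the summary above are normally used as pairs: an estimate gives a rough starting value and each step refines it by one Newton-Raphson iteration. The scalar sketch below models that per-element arithmetic, assuming the usual Arm definitions of FRECPS (2 - a·b) and FRSQRTS ((3 - a·b) / 2). The class and helper names are hypothetical and not part of the proposed API, and the hardware's fused rounding is not reproduced.

```csharp
using System;

// Scalar reference model; hypothetical helper, not part of the proposed Sve class.
public static class ReciprocalStepModel
{
    // Per-element FRECPS arithmetic (assumed definition): 2.0 - left * right.
    public static float ReciprocalStep(float left, float right) => 2.0f - left * right;

    // Per-element FRSQRTS arithmetic (assumed definition): (3.0 - left * right) / 2.0.
    public static float ReciprocalSqrtStep(float left, float right) => (3.0f - left * right) / 2.0f;

    // Refines a rough estimate of 1/d: x <- x * (2 - d * x).
    public static float RefineReciprocal(float d, float estimate, int iterations)
    {
        float x = estimate;
        for (int i = 0; i < iterations; i++)
        {
            x *= ReciprocalStep(d, x);
        }
        return x;
    }

    // Refines a rough estimate of 1/sqrt(d): x <- x * (3 - (d * x) * x) / 2.
    public static float RefineReciprocalSqrt(float d, float estimate, int iterations)
    {
        float x = estimate;
        for (int i = 0; i < iterations; i++)
        {
            x *= ReciprocalSqrtStep(d * x, x);
        }
        return x;
    }
}
```

Each iteration roughly squares the relative error, so a handful of steps is enough to take a coarse estimate to single precision.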

Details

TrigonometricMultiplyAddCoefficient Floating-point trigonometric multiply-add coefficient

Calculates the series terms for either sin(x) or cos(x), where the argument x has been adjusted to be in the range -π/4 < x ≤ π/4.

To calculate the series terms of sin(x) and cos(x), the initial source operands should be zero in the first source vector and x² in the second source vector. The operation is then executed eight times to calculate the sum of eight series terms, which gives a result of sufficient precision.

The method multiplies each element of the first source vector by the absolute value of the corresponding element of the second source vector, performs a fused addition of each product with a value obtained from a table of hard-wired coefficients, and places the results destructively in the first source vector.

The coefficients are different for sin(x) and cos(x), and are selected by a combination of the sign bit in the second source element and an immediate index in the range 0 to 7.

See https://docsmirror.github.io/A64/2023-06/ftmad_z_zzi.html for the full coefficient tables.
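The eight-step accumulation described above amounts to a Horner evaluation of a polynomial in x². The scalar sketch below illustrates it under several assumptions: the coefficients are plain Taylor-series values rather than the hard-wired table from the linked page, a set sign bit in the second operand is taken to select the cosine coefficients, and the immediate index is stepped from 7 down to 0. All names are hypothetical.

```csharp
using System;

// Scalar model of the multiply-add step; hypothetical helper, not the proposed API.
public static class TrigMultiplyAddModel
{
    // ASSUMPTION: Taylor coefficients stand in for the hardware table.
    // sin(x)/x = 1 - x^2/3! + x^4/5! - ...   (coefficients in powers of x^2)
    static readonly double[] SinCoefficients = BuildTable(1);
    // cos(x)   = 1 - x^2/2! + x^4/4! - ...
    static readonly double[] CosCoefficients = BuildTable(0);

    static double Factorial(int n)
    {
        double result = 1.0;
        for (int k = 2; k <= n; k++) result *= k;
        return result;
    }

    static double[] BuildTable(int offset)
    {
        var table = new double[8];
        for (int i = 0; i < table.Length; i++)
        {
            double sign = (i % 2 == 0) ? 1.0 : -1.0;
            table[i] = sign / Factorial(2 * i + offset);
        }
        return table;
    }

    // One step: op1 * |op2| + coefficient, with the coefficient chosen by the
    // sign bit of op2 (ASSUMPTION: set means cosine) and the index imm3 in 0..7.
    public static double MultiplyAddCoefficient(double op1, double op2, int imm3)
    {
        if (imm3 < 0 || imm3 > 7) throw new ArgumentOutOfRangeException(nameof(imm3));
        bool signBitSet = BitConverter.DoubleToInt64Bits(op2) < 0; // also true for -0.0
        double coefficient = signBitSet ? CosCoefficients[imm3] : SinCoefficients[imm3];
        return Math.FusedMultiplyAdd(op1, Math.Abs(op2), coefficient);
    }

    // Eight steps starting from zero, highest-order coefficient first (Horner order).
    public static double EvaluateSeries(double signedSquare)
    {
        double accumulator = 0.0;
        for (int imm3 = 7; imm3 >= 0; imm3--)
        {
            accumulator = MultiplyAddCoefficient(accumulator, signedSquare, imm3);
        }
        return accumulator;
    }
}
```

With a positive x² the loop yields an approximation of sin(x)/x, which still needs a final multiplication by x. With the sign bit set it yields cos(x) directly.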

TrigonometricSelectCoefficient Floating-point trigonometric select coefficient

Selects the coefficient for the final multiplication in the polynomial series approximation. The instruction places the value 1.0 or a copy of the first source vector element in the destination element, depending on bit 0 of the quadrant number q held in the corresponding element of the second source vector. The sign bit of the destination element is copied from bit 1 of the corresponding value of q.

To compute sin(x) or cos(x) the instruction is executed with elements of the first source vector set to x, adjusted to be in the range -π/4 < x ≤ π/4.

The elements of the second source vector hold the corresponding quadrant number q as an integer, not a floating-point value. The value q satisfies the relationship (2q-1) × π/4 < x ≤ (2q+1) × π/4.
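The quadrant relationship above can be solved for q directly: dividing through by π/2 gives q - 1/2 < 2x/π ≤ q + 1/2, so q = ⌈2x/π - 1/2⌉, and the adjusted argument is x - q·π/2. The sketch below computes both for one scalar value. The helper name is hypothetical, and the naive subtraction loses accuracy for large |x|, so it shows only the relationship and is not a production range reduction.

```csharp
using System;

// Hypothetical helper illustrating the quadrant relationship; not part of the proposed API.
public static class QuadrantReduction
{
    // Returns the integer q with (2q-1)*pi/4 < x <= (2q+1)*pi/4 and the
    // adjusted argument x - q*pi/2, which lies in (-pi/4, pi/4].
    public static (long Q, double Reduced) Reduce(double x)
    {
        double halfPi = Math.PI / 2;
        long q = (long)Math.Ceiling(x / halfPi - 0.5);
        double reduced = x - q * halfPi;
        return (q, reduced);
    }
}
```

For example, x = 2.0 falls in quadrant q = 1 with an adjusted argument of 2.0 - π/2. Only the two low bits of q matter to the trigonometric instructions described here.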

TrigonometricStartingValue Floating-point trigonometric starting value

Calculates the initial value for TrigonometricMultiplyAddCoefficient. The method squares each element in the first source vector, sets the sign bit of the square to a copy of bit 0 of the corresponding element in the second source vector, and places the results in the destination vector.

To compute sin(x) or cos(x) the instruction is executed with elements of the first source vector set to x, adjusted to be in the range -π/4 < x ≤ π/4.

The elements of the second source vector hold the corresponding quadrant number q as an integer, not a floating-point value. The value q satisfies the relationship (2q-1) × π/4 < x ≤ (2q+1) × π/4.
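The starting-value operation described above is small enough to model per element: square the input, then overwrite the sign bit with bit 0 of q. The sketch below does this for the [float, uint] pairing. The class and method names are hypothetical, and NaN handling is deliberately left out.

```csharp
using System;

// Scalar model of the starting-value step; hypothetical helper, not the proposed API.
public static class TrigStartingValueModel
{
    // Squares x, then sets the sign bit of the square to bit 0 of q.
    // NaN inputs are not treated specially in this sketch.
    public static float StartingValue(float x, uint q)
    {
        float square = x * x;
        int bits = BitConverter.SingleToInt32Bits(square);
        bits &= 0x7FFFFFFF;                      // clear the sign bit
        if ((q & 1u) != 0)
        {
            bits |= unchecked((int)0x80000000);  // copy bit 0 of q into the sign bit
        }
        return BitConverter.Int32BitsToSingle(bits);
    }
}
```

The sign bit here carries no arithmetic meaning. It acts as a one-bit tag that travels with x² so the later multiply-add steps can pick the matching coefficient table.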

ghost commented 9 months ago

Tagging subscribers to this area: @dotnet/area-system-numerics. See info in area-owners.md if you want to be subscribed.

Issue Details
```csharp
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: fp
{
  /// T: float, double
  public static unsafe T AddOrderedReduce(T initial, Vector<T> op); // FADDA

  /// T: [float, int], [double, long]
  public static unsafe Vector<T> AdjustExponent(Vector<T> left, Vector<T2> right); // FSCALE (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> ComplexAddRotate(Vector<T> op1, Vector<T> op2, ulong imm_rotation); // FCADD (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> ComplexMultiplyAddRotate(Vector<T> op1, Vector<T> op2, Vector<T> op3, ulong imm_rotation); // FCMLA (MOVPRFX)

  public static unsafe Vector<float> ComplexMultiplyAddRotate(Vector<float> op1, Vector<float> op2, Vector<float> op3, ulong imm_index, ulong imm_rotation);

  /// T: [float, double], [double, float], [int, float], [int, double], [long, float], [long, double], [uint, float], [uint, double], [ulong, float], [ulong, double], [float, int], [float, long], [float, uint], [float, ulong], [double, int], [double, long], [double, uint], [double, ulong]
  public static unsafe Vector<T> FloatingPointConvert(Vector<T2> value); // FCVT or FCVTZS or FCVTZU or SCVTF or UCVTF (MOVPRFX)

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> FloatingPointExponentialAccelerator(Vector<T2> value); // FEXPA

  /// T: float, double
  public static unsafe Vector<T> ReciprocalEstimate(Vector<T> value); // FRECPE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalExponent(Vector<T> value); // FRECPX (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSquareRootEstimate(Vector<T> value); // FRSQRTE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSquareRootStep(Vector<T> left, Vector<T> right); // FRSQRTS

  /// T: float, double
  public static unsafe Vector<T> ReciprocalStep(Vector<T> left, Vector<T> right); // FRECPS

  /// T: float, double
  public static unsafe Vector<T> RoundToNearestTiesAwayFromZero(Vector<T> value); // FRINTA (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundToNearestTiesToEven(Vector<T> value); // FRINTN (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundTowardsWithMergeTowardMinusInfinity(Vector<T> value); // FRINTM (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundTowardsWithMergeTowardPlusInfinity(Vector<T> value); // FRINTP (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundTowardsZero(Vector<T> value); // FRINTZ (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundUsingCurrentRoundingModeExact(Vector<T> value); // FRINTX (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> RoundUsingCurrentRoundingModeInexact(Vector<T> value); // FRINTI (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> SquareRoot(Vector<T> value); // FSQRT (MOVPRFX)

  /// T: float, double
  public static unsafe Vector<T> TrigonometricMultiplyAddCoefficient(Vector<T> op1, Vector<T> op2, ulong imm3); // FTMAD (MOVPRFX)

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricSelectCoefficient(Vector<T> left, Vector<T2> right); // FTSSEL

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricStartingValue(Vector<T> left, Vector<T2> right); // FTSMUL

  /// total method signatures: 23
}
```
Author: a74nh
Assignees: -
Labels: `area-System.Numerics`
Milestone: -
a74nh commented 9 months ago
/// Full API
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: fp
{
    /// AddRotateComplex : Complex add with rotate

    /// svfloat32_t svcadd[_f32]_m(svbool_t pg, svfloat32_t op1, svfloat32_t op2, uint64_t imm_rotation) : "FCADD Ztied1.S, Pg/M, Ztied1.S, Zop2.S, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCADD Zresult.S, Pg/M, Zresult.S, Zop2.S, #imm_rotation"
    /// svfloat32_t svcadd[_f32]_x(svbool_t pg, svfloat32_t op1, svfloat32_t op2, uint64_t imm_rotation) : "FCADD Ztied1.S, Pg/M, Ztied1.S, Zop2.S, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCADD Zresult.S, Pg/M, Zresult.S, Zop2.S, #imm_rotation"
    /// svfloat32_t svcadd[_f32]_z(svbool_t pg, svfloat32_t op1, svfloat32_t op2, uint64_t imm_rotation) : "MOVPRFX Zresult.S, Pg/Z, Zop1.S; FCADD Zresult.S, Pg/M, Zresult.S, Zop2.S, #imm_rotation"
  public static unsafe Vector<float> AddRotateComplex(Vector<float> op1, Vector<float> op2, ulong imm_rotation);

    /// svfloat64_t svcadd[_f64]_m(svbool_t pg, svfloat64_t op1, svfloat64_t op2, uint64_t imm_rotation) : "FCADD Ztied1.D, Pg/M, Ztied1.D, Zop2.D, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCADD Zresult.D, Pg/M, Zresult.D, Zop2.D, #imm_rotation"
    /// svfloat64_t svcadd[_f64]_x(svbool_t pg, svfloat64_t op1, svfloat64_t op2, uint64_t imm_rotation) : "FCADD Ztied1.D, Pg/M, Ztied1.D, Zop2.D, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCADD Zresult.D, Pg/M, Zresult.D, Zop2.D, #imm_rotation"
    /// svfloat64_t svcadd[_f64]_z(svbool_t pg, svfloat64_t op1, svfloat64_t op2, uint64_t imm_rotation) : "MOVPRFX Zresult.D, Pg/Z, Zop1.D; FCADD Zresult.D, Pg/M, Zresult.D, Zop2.D, #imm_rotation"
  public static unsafe Vector<double> AddRotateComplex(Vector<double> op1, Vector<double> op2, ulong imm_rotation);

    /// AddSequentialAcross : Add reduction (strictly-ordered)

    /// float32_t svadda[_f32](svbool_t pg, float32_t initial, svfloat32_t op) : "FADDA Stied, Pg, Stied, Zop.S"
  public static unsafe float AddSequentialAcross(float initial, Vector<float> op);

    /// float64_t svadda[_f64](svbool_t pg, float64_t initial, svfloat64_t op) : "FADDA Dtied, Pg, Dtied, Zop.D"
  public static unsafe double AddSequentialAcross(double initial, Vector<double> op);

    /// ConvertToDouble : Floating-point convert

    /// svfloat64_t svcvt_f64[_f32]_m(svfloat64_t inactive, svbool_t pg, svfloat32_t op) : "FCVT Ztied.D, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FCVT Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_f32]_x(svbool_t pg, svfloat32_t op) : "FCVT Ztied.D, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FCVT Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVT Zresult.D, Pg/M, Zop.S"
  public static unsafe Vector<double> ConvertToDouble(Vector<float> value);

    /// svfloat64_t svcvt_f64[_s32]_m(svfloat64_t inactive, svbool_t pg, svint32_t op) : "SCVTF Ztied.D, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; SCVTF Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_s32]_x(svbool_t pg, svint32_t op) : "SCVTF Ztied.D, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; SCVTF Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_s32]_z(svbool_t pg, svint32_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; SCVTF Zresult.D, Pg/M, Zop.S"
  public static unsafe Vector<double> ConvertToDouble(Vector<int> value);

    /// svfloat64_t svcvt_f64[_s64]_m(svfloat64_t inactive, svbool_t pg, svint64_t op) : "SCVTF Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; SCVTF Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svcvt_f64[_s64]_x(svbool_t pg, svint64_t op) : "SCVTF Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; SCVTF Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svcvt_f64[_s64]_z(svbool_t pg, svint64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; SCVTF Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> ConvertToDouble(Vector<long> value);

    /// svfloat64_t svcvt_f64[_u32]_m(svfloat64_t inactive, svbool_t pg, svuint32_t op) : "UCVTF Ztied.D, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; UCVTF Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_u32]_x(svbool_t pg, svuint32_t op) : "UCVTF Ztied.D, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; UCVTF Zresult.D, Pg/M, Zop.S"
    /// svfloat64_t svcvt_f64[_u32]_z(svbool_t pg, svuint32_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; UCVTF Zresult.D, Pg/M, Zop.S"
  public static unsafe Vector<double> ConvertToDouble(Vector<uint> value);

    /// svfloat64_t svcvt_f64[_u64]_m(svfloat64_t inactive, svbool_t pg, svuint64_t op) : "UCVTF Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; UCVTF Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svcvt_f64[_u64]_x(svbool_t pg, svuint64_t op) : "UCVTF Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; UCVTF Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svcvt_f64[_u64]_z(svbool_t pg, svuint64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; UCVTF Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> ConvertToDouble(Vector<ulong> value);

    /// ConvertToInt32 : Floating-point convert

    /// svint32_t svcvt_s32[_f32]_m(svint32_t inactive, svbool_t pg, svfloat32_t op) : "FCVTZS Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FCVTZS Zresult.S, Pg/M, Zop.S"
    /// svint32_t svcvt_s32[_f32]_x(svbool_t pg, svfloat32_t op) : "FCVTZS Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FCVTZS Zresult.S, Pg/M, Zop.S"
    /// svint32_t svcvt_s32[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FCVTZS Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<int> ConvertToInt32(Vector<float> value);

    /// svint32_t svcvt_s32[_f64]_m(svint32_t inactive, svbool_t pg, svfloat64_t op) : "FCVTZS Ztied.S, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FCVTZS Zresult.S, Pg/M, Zop.D"
    /// svint32_t svcvt_s32[_f64]_x(svbool_t pg, svfloat64_t op) : "FCVTZS Ztied.S, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FCVTZS Zresult.S, Pg/M, Zop.D"
    /// svint32_t svcvt_s32[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZS Zresult.S, Pg/M, Zop.D"
  public static unsafe Vector<int> ConvertToInt32(Vector<double> value);

    /// ConvertToInt64 : Floating-point convert

    /// svint64_t svcvt_s64[_f32]_m(svint64_t inactive, svbool_t pg, svfloat32_t op) : "FCVTZS Ztied.D, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FCVTZS Zresult.D, Pg/M, Zop.S"
    /// svint64_t svcvt_s64[_f32]_x(svbool_t pg, svfloat32_t op) : "FCVTZS Ztied.D, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FCVTZS Zresult.D, Pg/M, Zop.S"
    /// svint64_t svcvt_s64[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZS Zresult.D, Pg/M, Zop.S"
  public static unsafe Vector<long> ConvertToInt64(Vector<float> value);

    /// svint64_t svcvt_s64[_f64]_m(svint64_t inactive, svbool_t pg, svfloat64_t op) : "FCVTZS Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FCVTZS Zresult.D, Pg/M, Zop.D"
    /// svint64_t svcvt_s64[_f64]_x(svbool_t pg, svfloat64_t op) : "FCVTZS Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FCVTZS Zresult.D, Pg/M, Zop.D"
    /// svint64_t svcvt_s64[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZS Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<long> ConvertToInt64(Vector<double> value);

    /// ConvertToSingle : Floating-point convert

    /// svfloat32_t svcvt_f32[_f64]_m(svfloat32_t inactive, svbool_t pg, svfloat64_t op) : "FCVT Ztied.S, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FCVT Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_f64]_x(svbool_t pg, svfloat64_t op) : "FCVT Ztied.S, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FCVT Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVT Zresult.S, Pg/M, Zop.D"
  public static unsafe Vector<float> ConvertToSingle(Vector<double> value);

    /// svfloat32_t svcvt_f32[_s32]_m(svfloat32_t inactive, svbool_t pg, svint32_t op) : "SCVTF Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; SCVTF Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svcvt_f32[_s32]_x(svbool_t pg, svint32_t op) : "SCVTF Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; SCVTF Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svcvt_f32[_s32]_z(svbool_t pg, svint32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; SCVTF Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> ConvertToSingle(Vector<int> value);

    /// svfloat32_t svcvt_f32[_s64]_m(svfloat32_t inactive, svbool_t pg, svint64_t op) : "SCVTF Ztied.S, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; SCVTF Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_s64]_x(svbool_t pg, svint64_t op) : "SCVTF Ztied.S, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; SCVTF Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_s64]_z(svbool_t pg, svint64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; SCVTF Zresult.S, Pg/M, Zop.D"
  public static unsafe Vector<float> ConvertToSingle(Vector<long> value);

    /// svfloat32_t svcvt_f32[_u32]_m(svfloat32_t inactive, svbool_t pg, svuint32_t op) : "UCVTF Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; UCVTF Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svcvt_f32[_u32]_x(svbool_t pg, svuint32_t op) : "UCVTF Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; UCVTF Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svcvt_f32[_u32]_z(svbool_t pg, svuint32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; UCVTF Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> ConvertToSingle(Vector<uint> value);

    /// svfloat32_t svcvt_f32[_u64]_m(svfloat32_t inactive, svbool_t pg, svuint64_t op) : "UCVTF Ztied.S, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; UCVTF Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_u64]_x(svbool_t pg, svuint64_t op) : "UCVTF Ztied.S, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; UCVTF Zresult.S, Pg/M, Zop.D"
    /// svfloat32_t svcvt_f32[_u64]_z(svbool_t pg, svuint64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; UCVTF Zresult.S, Pg/M, Zop.D"
  public static unsafe Vector<float> ConvertToSingle(Vector<ulong> value);

    /// ConvertToUInt32 : Floating-point convert

    /// svuint32_t svcvt_u32[_f32]_m(svuint32_t inactive, svbool_t pg, svfloat32_t op) : "FCVTZU Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FCVTZU Zresult.S, Pg/M, Zop.S"
    /// svuint32_t svcvt_u32[_f32]_x(svbool_t pg, svfloat32_t op) : "FCVTZU Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FCVTZU Zresult.S, Pg/M, Zop.S"
    /// svuint32_t svcvt_u32[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FCVTZU Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<uint> ConvertToUInt32(Vector<float> value);

    /// svuint32_t svcvt_u32[_f64]_m(svuint32_t inactive, svbool_t pg, svfloat64_t op) : "FCVTZU Ztied.S, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FCVTZU Zresult.S, Pg/M, Zop.D"
    /// svuint32_t svcvt_u32[_f64]_x(svbool_t pg, svfloat64_t op) : "FCVTZU Ztied.S, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FCVTZU Zresult.S, Pg/M, Zop.D"
    /// svuint32_t svcvt_u32[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZU Zresult.S, Pg/M, Zop.D"
  public static unsafe Vector<uint> ConvertToUInt32(Vector<double> value);

    /// ConvertToUInt64 : Floating-point convert

    /// svuint64_t svcvt_u64[_f32]_m(svuint64_t inactive, svbool_t pg, svfloat32_t op) : "FCVTZU Ztied.D, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FCVTZU Zresult.D, Pg/M, Zop.S"
    /// svuint64_t svcvt_u64[_f32]_x(svbool_t pg, svfloat32_t op) : "FCVTZU Ztied.D, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FCVTZU Zresult.D, Pg/M, Zop.S"
    /// svuint64_t svcvt_u64[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZU Zresult.D, Pg/M, Zop.S"
  public static unsafe Vector<ulong> ConvertToUInt64(Vector<float> value);

    /// svuint64_t svcvt_u64[_f64]_m(svuint64_t inactive, svbool_t pg, svfloat64_t op) : "FCVTZU Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FCVTZU Zresult.D, Pg/M, Zop.D"
    /// svuint64_t svcvt_u64[_f64]_x(svbool_t pg, svfloat64_t op) : "FCVTZU Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FCVTZU Zresult.D, Pg/M, Zop.D"
    /// svuint64_t svcvt_u64[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FCVTZU Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<ulong> ConvertToUInt64(Vector<double> value);

    /// FloatingPointExponentialAccelerator : Floating-point exponential accelerator

    /// svfloat32_t svexpa[_f32](svuint32_t op) : "FEXPA Zresult.S, Zop.S"
  public static unsafe Vector<float> FloatingPointExponentialAccelerator(Vector<uint> value);

    /// svfloat64_t svexpa[_f64](svuint64_t op) : "FEXPA Zresult.D, Zop.D"
  public static unsafe Vector<double> FloatingPointExponentialAccelerator(Vector<ulong> value);

    /// MultiplyAddRotateComplex : Complex multiply-add with rotate

    /// svfloat32_t svcmla[_f32]_m(svbool_t pg, svfloat32_t op1, svfloat32_t op2, svfloat32_t op3, uint64_t imm_rotation) : "FCMLA Ztied1.S, Pg/M, Zop2.S, Zop3.S, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCMLA Zresult.S, Pg/M, Zop2.S, Zop3.S, #imm_rotation"
    /// svfloat32_t svcmla[_f32]_x(svbool_t pg, svfloat32_t op1, svfloat32_t op2, svfloat32_t op3, uint64_t imm_rotation) : "FCMLA Ztied1.S, Pg/M, Zop2.S, Zop3.S, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCMLA Zresult.S, Pg/M, Zop2.S, Zop3.S, #imm_rotation"
    /// svfloat32_t svcmla[_f32]_z(svbool_t pg, svfloat32_t op1, svfloat32_t op2, svfloat32_t op3, uint64_t imm_rotation) : "MOVPRFX Zresult.S, Pg/Z, Zop1.S; FCMLA Zresult.S, Pg/M, Zop2.S, Zop3.S, #imm_rotation"
  public static unsafe Vector<float> MultiplyAddRotateComplex(Vector<float> op1, Vector<float> op2, Vector<float> op3, ulong imm_rotation);

    /// svfloat64_t svcmla[_f64]_m(svbool_t pg, svfloat64_t op1, svfloat64_t op2, svfloat64_t op3, uint64_t imm_rotation) : "FCMLA Ztied1.D, Pg/M, Zop2.D, Zop3.D, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCMLA Zresult.D, Pg/M, Zop2.D, Zop3.D, #imm_rotation"
    /// svfloat64_t svcmla[_f64]_x(svbool_t pg, svfloat64_t op1, svfloat64_t op2, svfloat64_t op3, uint64_t imm_rotation) : "FCMLA Ztied1.D, Pg/M, Zop2.D, Zop3.D, #imm_rotation" or "MOVPRFX Zresult, Zop1; FCMLA Zresult.D, Pg/M, Zop2.D, Zop3.D, #imm_rotation"
    /// svfloat64_t svcmla[_f64]_z(svbool_t pg, svfloat64_t op1, svfloat64_t op2, svfloat64_t op3, uint64_t imm_rotation) : "MOVPRFX Zresult.D, Pg/Z, Zop1.D; FCMLA Zresult.D, Pg/M, Zop2.D, Zop3.D, #imm_rotation"
  public static unsafe Vector<double> MultiplyAddRotateComplex(Vector<double> op1, Vector<double> op2, Vector<double> op3, ulong imm_rotation);

    /// svfloat32_t svcmla_lane[_f32](svfloat32_t op1, svfloat32_t op2, svfloat32_t op3, uint64_t imm_index, uint64_t imm_rotation) : "FCMLA Ztied1.S, Zop2.S, Zop3.S[imm_index], #imm_rotation" or "MOVPRFX Zresult, Zop1; FCMLA Zresult.S, Zop2.S, Zop3.S[imm_index], #imm_rotation"
  public static unsafe Vector<float> MultiplyAddRotateComplex(Vector<float> op1, Vector<float> op2, Vector<float> op3, ulong imm_index, ulong imm_rotation);

    /// ReciprocalEstimate : Reciprocal estimate

    /// svfloat32_t svrecpe[_f32](svfloat32_t op) : "FRECPE Zresult.S, Zop.S"
  public static unsafe Vector<float> ReciprocalEstimate(Vector<float> value);

    /// svfloat64_t svrecpe[_f64](svfloat64_t op) : "FRECPE Zresult.D, Zop.D"
  public static unsafe Vector<double> ReciprocalEstimate(Vector<double> value);

    /// ReciprocalExponent : Reciprocal exponent

    /// svfloat32_t svrecpx[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRECPX Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRECPX Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrecpx[_f32]_x(svbool_t pg, svfloat32_t op) : "FRECPX Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRECPX Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrecpx[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRECPX Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> ReciprocalExponent(Vector<float> value);

    /// svfloat64_t svrecpx[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRECPX Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRECPX Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrecpx[_f64]_x(svbool_t pg, svfloat64_t op) : "FRECPX Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRECPX Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrecpx[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRECPX Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> ReciprocalExponent(Vector<double> value);

    /// ReciprocalSqrtEstimate : Reciprocal square root estimate

    /// svfloat32_t svrsqrte[_f32](svfloat32_t op) : "FRSQRTE Zresult.S, Zop.S"
  public static unsafe Vector<float> ReciprocalSqrtEstimate(Vector<float> value);

    /// svfloat64_t svrsqrte[_f64](svfloat64_t op) : "FRSQRTE Zresult.D, Zop.D"
  public static unsafe Vector<double> ReciprocalSqrtEstimate(Vector<double> value);

    /// ReciprocalSqrtStep : Reciprocal square root step

    /// svfloat32_t svrsqrts[_f32](svfloat32_t op1, svfloat32_t op2) : "FRSQRTS Zresult.S, Zop1.S, Zop2.S"
  public static unsafe Vector<float> ReciprocalSqrtStep(Vector<float> left, Vector<float> right);

    /// svfloat64_t svrsqrts[_f64](svfloat64_t op1, svfloat64_t op2) : "FRSQRTS Zresult.D, Zop1.D, Zop2.D"
  public static unsafe Vector<double> ReciprocalSqrtStep(Vector<double> left, Vector<double> right);

    /// ReciprocalStep : Reciprocal step

    /// svfloat32_t svrecps[_f32](svfloat32_t op1, svfloat32_t op2) : "FRECPS Zresult.S, Zop1.S, Zop2.S"
  public static unsafe Vector<float> ReciprocalStep(Vector<float> left, Vector<float> right);

    /// svfloat64_t svrecps[_f64](svfloat64_t op1, svfloat64_t op2) : "FRECPS Zresult.D, Zop1.D, Zop2.D"
  public static unsafe Vector<double> ReciprocalStep(Vector<double> left, Vector<double> right);

    /// RoundAwayFromZero : Round to nearest, ties away from zero

    /// svfloat32_t svrinta[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRINTA Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRINTA Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrinta[_f32]_x(svbool_t pg, svfloat32_t op) : "FRINTA Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRINTA Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrinta[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRINTA Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> RoundAwayFromZero(Vector<float> value);

    /// svfloat64_t svrinta[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRINTA Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRINTA Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrinta[_f64]_x(svbool_t pg, svfloat64_t op) : "FRINTA Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRINTA Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrinta[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRINTA Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> RoundAwayFromZero(Vector<double> value);

    /// RoundToNearest : Round to nearest, ties to even

    /// svfloat32_t svrintn[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRINTN Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRINTN Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintn[_f32]_x(svbool_t pg, svfloat32_t op) : "FRINTN Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRINTN Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintn[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRINTN Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> RoundToNearest(Vector<float> value);

    /// svfloat64_t svrintn[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRINTN Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRINTN Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintn[_f64]_x(svbool_t pg, svfloat64_t op) : "FRINTN Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRINTN Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintn[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRINTN Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> RoundToNearest(Vector<double> value);

    /// RoundToNegativeInfinity : Round towards -∞

    /// svfloat32_t svrintm[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRINTM Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRINTM Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintm[_f32]_x(svbool_t pg, svfloat32_t op) : "FRINTM Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRINTM Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintm[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRINTM Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> RoundToNegativeInfinity(Vector<float> value);

    /// svfloat64_t svrintm[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRINTM Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRINTM Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintm[_f64]_x(svbool_t pg, svfloat64_t op) : "FRINTM Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRINTM Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintm[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRINTM Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> RoundToNegativeInfinity(Vector<double> value);

    /// RoundToPositiveInfinity : Round towards +∞

    /// svfloat32_t svrintp[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRINTP Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRINTP Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintp[_f32]_x(svbool_t pg, svfloat32_t op) : "FRINTP Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRINTP Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintp[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRINTP Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> RoundToPositiveInfinity(Vector<float> value);

    /// svfloat64_t svrintp[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRINTP Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRINTP Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintp[_f64]_x(svbool_t pg, svfloat64_t op) : "FRINTP Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRINTP Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintp[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRINTP Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> RoundToPositiveInfinity(Vector<double> value);

    /// RoundToZero : Round towards zero

    /// svfloat32_t svrintz[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FRINTZ Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FRINTZ Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintz[_f32]_x(svbool_t pg, svfloat32_t op) : "FRINTZ Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FRINTZ Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svrintz[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FRINTZ Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> RoundToZero(Vector<float> value);

    /// svfloat64_t svrintz[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FRINTZ Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FRINTZ Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintz[_f64]_x(svbool_t pg, svfloat64_t op) : "FRINTZ Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FRINTZ Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svrintz[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FRINTZ Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> RoundToZero(Vector<double> value);

    /// Scale : Adjust exponent

    /// svfloat32_t svscale[_f32]_m(svbool_t pg, svfloat32_t op1, svint32_t op2) : "FSCALE Ztied1.S, Pg/M, Ztied1.S, Zop2.S" or "MOVPRFX Zresult, Zop1; FSCALE Zresult.S, Pg/M, Zresult.S, Zop2.S"
    /// svfloat32_t svscale[_f32]_x(svbool_t pg, svfloat32_t op1, svint32_t op2) : "FSCALE Ztied1.S, Pg/M, Ztied1.S, Zop2.S" or "MOVPRFX Zresult, Zop1; FSCALE Zresult.S, Pg/M, Zresult.S, Zop2.S"
    /// svfloat32_t svscale[_f32]_z(svbool_t pg, svfloat32_t op1, svint32_t op2) : "MOVPRFX Zresult.S, Pg/Z, Zop1.S; FSCALE Zresult.S, Pg/M, Zresult.S, Zop2.S"
  public static unsafe Vector<float> Scale(Vector<float> left, Vector<int> right);

    /// svfloat64_t svscale[_f64]_m(svbool_t pg, svfloat64_t op1, svint64_t op2) : "FSCALE Ztied1.D, Pg/M, Ztied1.D, Zop2.D" or "MOVPRFX Zresult, Zop1; FSCALE Zresult.D, Pg/M, Zresult.D, Zop2.D"
    /// svfloat64_t svscale[_f64]_x(svbool_t pg, svfloat64_t op1, svint64_t op2) : "FSCALE Ztied1.D, Pg/M, Ztied1.D, Zop2.D" or "MOVPRFX Zresult, Zop1; FSCALE Zresult.D, Pg/M, Zresult.D, Zop2.D"
    /// svfloat64_t svscale[_f64]_z(svbool_t pg, svfloat64_t op1, svint64_t op2) : "MOVPRFX Zresult.D, Pg/Z, Zop1.D; FSCALE Zresult.D, Pg/M, Zresult.D, Zop2.D"
  public static unsafe Vector<double> Scale(Vector<double> left, Vector<long> right);

    /// Sqrt : Square root

    /// svfloat32_t svsqrt[_f32]_m(svfloat32_t inactive, svbool_t pg, svfloat32_t op) : "FSQRT Ztied.S, Pg/M, Zop.S" or "MOVPRFX Zresult, Zinactive; FSQRT Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svsqrt[_f32]_x(svbool_t pg, svfloat32_t op) : "FSQRT Ztied.S, Pg/M, Ztied.S" or "MOVPRFX Zresult, Zop; FSQRT Zresult.S, Pg/M, Zop.S"
    /// svfloat32_t svsqrt[_f32]_z(svbool_t pg, svfloat32_t op) : "MOVPRFX Zresult.S, Pg/Z, Zop.S; FSQRT Zresult.S, Pg/M, Zop.S"
  public static unsafe Vector<float> Sqrt(Vector<float> value);

    /// svfloat64_t svsqrt[_f64]_m(svfloat64_t inactive, svbool_t pg, svfloat64_t op) : "FSQRT Ztied.D, Pg/M, Zop.D" or "MOVPRFX Zresult, Zinactive; FSQRT Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svsqrt[_f64]_x(svbool_t pg, svfloat64_t op) : "FSQRT Ztied.D, Pg/M, Ztied.D" or "MOVPRFX Zresult, Zop; FSQRT Zresult.D, Pg/M, Zop.D"
    /// svfloat64_t svsqrt[_f64]_z(svbool_t pg, svfloat64_t op) : "MOVPRFX Zresult.D, Pg/Z, Zop.D; FSQRT Zresult.D, Pg/M, Zop.D"
  public static unsafe Vector<double> Sqrt(Vector<double> value);

    /// TrigonometricMultiplyAddCoefficient : Trigonometric multiply-add coefficient

    /// svfloat32_t svtmad[_f32](svfloat32_t op1, svfloat32_t op2, uint64_t imm3) : "FTMAD Ztied1.S, Ztied1.S, Zop2.S, #imm3" or "MOVPRFX Zresult, Zop1; FTMAD Zresult.S, Zresult.S, Zop2.S, #imm3"
  public static unsafe Vector<float> TrigonometricMultiplyAddCoefficient(Vector<float> op1, Vector<float> op2, ulong imm3);

    /// svfloat64_t svtmad[_f64](svfloat64_t op1, svfloat64_t op2, uint64_t imm3) : "FTMAD Ztied1.D, Ztied1.D, Zop2.D, #imm3" or "MOVPRFX Zresult, Zop1; FTMAD Zresult.D, Zresult.D, Zop2.D, #imm3"
  public static unsafe Vector<double> TrigonometricMultiplyAddCoefficient(Vector<double> op1, Vector<double> op2, ulong imm3);

    /// TrigonometricSelectCoefficient : Trigonometric select coefficient

    /// svfloat32_t svtssel[_f32](svfloat32_t op1, svuint32_t op2) : "FTSSEL Zresult.S, Zop1.S, Zop2.S"
  public static unsafe Vector<float> TrigonometricSelectCoefficient(Vector<float> left, Vector<uint> right);

    /// svfloat64_t svtssel[_f64](svfloat64_t op1, svuint64_t op2) : "FTSSEL Zresult.D, Zop1.D, Zop2.D"
  public static unsafe Vector<double> TrigonometricSelectCoefficient(Vector<double> left, Vector<ulong> right);

    /// TrigonometricStartingValue : Trigonometric starting value

    /// svfloat32_t svtsmul[_f32](svfloat32_t op1, svuint32_t op2) : "FTSMUL Zresult.S, Zop1.S, Zop2.S"
  public static unsafe Vector<float> TrigonometricStartingValue(Vector<float> left, Vector<uint> right);

    /// svfloat64_t svtsmul[_f64](svfloat64_t op1, svuint64_t op2) : "FTSMUL Zresult.D, Zop1.D, Zop2.D"
  public static unsafe Vector<double> TrigonometricStartingValue(Vector<double> left, Vector<ulong> right);

  /// total method signatures: 57
  /// total method names:      27
}

a74nh commented 9 months ago

  /// Optional Entries:
  ///   public static unsafe Vector<float> Scale(Vector<float> left, int right); // svscale[_n_f32]_m or svscale[_n_f32]_x or svscale[_n_f32]_z
  ///   public static unsafe Vector<double> Scale(Vector<double> left, long right); // svscale[_n_f64]_m or svscale[_n_f64]_x or svscale[_n_f64]_z
  ///   Total Maybe: 2

  /// Rejected:
  ///   public static unsafe Vector<float> RoundUsingCurrentRoundingModeExact(Vector<float> value); // svrintx[_f32]_m or svrintx[_f32]_x or svrintx[_f32]_z
  ///   public static unsafe Vector<double> RoundUsingCurrentRoundingModeExact(Vector<double> value); // svrintx[_f64]_m or svrintx[_f64]_x or svrintx[_f64]_z
  ///   public static unsafe Vector<float> RoundUsingCurrentRoundingModeInexact(Vector<float> value); // svrinti[_f32]_m or svrinti[_f32]_x or svrinti[_f32]_z
  ///   public static unsafe Vector<double> RoundUsingCurrentRoundingModeInexact(Vector<double> value); // svrinti[_f64]_m or svrinti[_f64]_x or svrinti[_f64]_z
  ///   Total Rejected: 4

  /// Total ACLE covered across API:      151

a74nh commented 9 months ago

This contributes to https://github.com/dotnet/runtime/issues/93095

It covers instructions in FEAT_SVE related to floating point operations.

This list was auto generated from the C ACLE for SVE, and is in three parts:

1. The methods list reduced down to Vector versions. All possible variants of T are given above the method.
2. The complete list of all methods. The corresponding ACLE methods and SVE instructions are given above the method.
3. All rejected ACLE methods. These are methods we have agreed do not need including in C#.

Where possible, existing C# naming conventions have been matched.

Many of the C functions include predicate argument(s), of type svbool_t as the first argument. These are missing from the C# method. It is expected that the Jit will create predicates where required, or combine with uses of conditionalSelect(). For more discussion see https://github.com/dotnet/runtime/issues/88140 comment.
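
As a rough scalar illustration of what hiding the predicate means, the sketch below models the `_m` (merging) and `_z` (zeroing) forms listed in the ACLE comments, using square root as the operation. The class and method names are hypothetical, plain arrays stand in for `Vector<T>`, and a `bool[]` stands in for `svbool_t`; none of this is the proposed API.

```csharp
using System;

// Hypothetical scalar model for illustration; not part of System.Runtime.Intrinsics.
static class PredicationModel
{
    // _m form: active lanes get sqrt(value), inactive lanes keep the element from `inactive`.
    public static float[] SqrtMerging(bool[] mask, float[] inactive, float[] value)
    {
        var result = new float[value.Length];
        for (int i = 0; i < value.Length; i++)
            result[i] = mask[i] ? MathF.Sqrt(value[i]) : inactive[i];
        return result;
    }

    // _z form: active lanes get sqrt(value), inactive lanes are set to zero.
    public static float[] SqrtZeroing(bool[] mask, float[] value)
    {
        var result = new float[value.Length];
        for (int i = 0; i < value.Length; i++)
            result[i] = mask[i] ? MathF.Sqrt(value[i]) : 0f;
        return result;
    }
}
```

Under this model, a C# method without a predicate argument corresponds to calling either form with an all-true mask, and selecting between its result and another vector afterwards reproduces the merging form.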

tannergooding commented 9 months ago

*Reduce should be *Across to match the naming in AdvSimd.

FADDA is just doing e0 + e1 + .. eN rather than doing the (e0 + e1) + (e2 + e3) + ... approach that AddAcross does, correct? I wonder if Sequential or another name might be clearer given that Ordered has a common alternative meaning to IEEE 754 floating-point types.

For AdjustExponent, this is basically a hardware version of ScaleB (which we expose as float.ScaleB). For Avx512F we just called it Scale (since the B, which is 2 for binary floats, is implied). This then matches the hardware instruction name as well.
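
To make the ScaleB correspondence concrete, here is a minimal per-element sketch of what `Scale(Vector<float>, Vector<int>)` would compute, written with the existing `MathF.ScaleB`. The `ScaleModel` class is hypothetical and arrays stand in for the vector types; special cases such as overflow are simply whatever `MathF.ScaleB` does and are not checked against FSCALE.

```csharp
using System;

// Hypothetical scalar reference; arrays stand in for Vector<float> and Vector<int>.
static class ScaleModel
{
    public static float[] Scale(float[] left, int[] right)
    {
        var result = new float[left.Length];
        for (int i = 0; i < left.Length; i++)
            result[i] = MathF.ScaleB(left[i], right[i]); // left[i] * 2^right[i]
        return result;
    }
}
```

For example, scaling 3 by an exponent of 2 gives 12, and scaling 1 by an exponent of -1 gives 0.5.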

We probably want to discuss in API review whether ComplexAddRotate or AddRotateComplex is better terminology here. I believe we typically prefer the type at the end.

For FloatingPointConvert the AdvSimd APIs use names like ConvertToInt32RoundAwayFromZero or ConvertToDouble, both so we can disambiguate the return type and so the semantic is clear.

For SquareRoot, we prefer Sqrt to match the name of the primary Math API. Same for ReciprocalSqrtEstimate

For RoundToNearestTiesToEven, we just called it RoundToNearest in AdvSimd as that is the default rounding mode for IEEE 754 floating-points. Similarly we just used names like RoundAwayFromZero, RoundToNegativeInfinity, RoundToPositiveInfinity, and RoundToZero.
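
For readers less familiar with the five rounding names, the sketch below gives scalar equivalents built from existing `System.Math` calls. The `RoundingModel` class is hypothetical and only illustrates the intended semantics per element; it says nothing about how the SVE methods are implemented.

```csharp
using System;

// Hypothetical scalar equivalents of the rounding method names discussed above.
static class RoundingModel
{
    // FRINTN: nearest, ties to even (the IEEE 754 default).
    public static double RoundToNearest(double x) => Math.Round(x, MidpointRounding.ToEven);

    // FRINTA: nearest, ties away from zero.
    public static double RoundAwayFromZero(double x) => Math.Round(x, MidpointRounding.AwayFromZero);

    // FRINTM: towards negative infinity.
    public static double RoundToNegativeInfinity(double x) => Math.Floor(x);

    // FRINTP: towards positive infinity.
    public static double RoundToPositiveInfinity(double x) => Math.Ceiling(x);

    // FRINTZ: towards zero.
    public static double RoundToZero(double x) => Math.Truncate(x);
}
```

With an input of -2.5 these return -2, -3, -3, -2 and -2 respectively, which shows where the modes diverge on a tie.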

We did not expose FRINTX for AdvSimd because .NET doesn't support floating-point exceptions being enabled. We similarly didn't expose FRINTI because we don't support changing the global floating-point rounding mode.

Trigonometric* are new concepts and would likely benefit from a small explanation of what they do and how they benefit APIs like Cos/Sin, etc.

ghost commented 9 months ago

This issue has been marked needs-author-action and may be missing some important information.

a74nh commented 9 months ago

> FADDA is just doing e0 + e1 + .. eN rather than doing the (e0 + e1) + (e2 + e3) + ... approach that AddAcross does, correct? I wonder if Sequential or another name might be clearer given that Ordered has a common alternative meaning to IEEE 754 floating-point types.

SADDV/UADDV doesn't specify how the elements are added together.

FADDA is the only version of add across for fp. Maybe for the API the ordered distinction isn't required and it could just be a variant of AddAcross(). Otherwise, I'm happy with sequential.

> For AdjustExponent, this is basically a hardware version of ScaleB (which we expose as float.ScaleB). For Avx512F we just called it Scale (since the B, which is 2 for binary floats, is implied). This then matches the hardware instruction name as well.

Done.

> We probably want to discuss in API review whether ComplexAddRotate or AddRotateComplex is better terminology here. I believe we typically prefer the type at the end.

Switched for now.

> For FloatingPointConvert the AdvSimd APIs use names like ConvertToInt32RoundAwayFromZero or ConvertToDouble, both so we can disambiguate the return type and so the semantic is clear.

Done. Note that the API now shows:

  /// T: [long, float], [long, double]
  public static unsafe Vector<T> ConvertToInt64(Vector<T2> value); // FCVTZS (MOVPRFX)

Ideally it would just be:

  /// T: float, double
  public static unsafe Vector<long> ConvertToInt64(Vector<T> value); // FCVTZS (MOVPRFX)

But that's scripting limitations I'd rather not fix for now.

Note, I've used ConvertToSignedInt32() etc for the signed variants.

> For SquareRoot, we prefer Sqrt to match the name of the primary Math API. Same for ReciprocalSqrtEstimate

AdvSimd has ReciprocalSquareRootEstimate. Changed for SVE.

> For RoundToNearestTiesToEven, we just called it RoundToNearest in AdvSimd as that is the default rounding mode for IEEE 754 floating-points. Similarly we just used names like RoundAwayFromZero, RoundToNegativeInfinity, RoundToPositiveInfinity, and RoundToZero.

Done.

> We did not expose FRINTX for AdvSimd because .NET doesn't support floating-point exceptions being enabled. We similarly didn't expose FRINTI because we don't support changing the global floating-point rounding mode.

Done.

> Trigonometric* are new concepts and would likely benefit from a small explanation of what they do and how they benefit APIs like Cos/Sin, etc.

Done.

a74nh commented 9 months ago

> Trigonometric* are new concepts and would likely benefit from a small explanation of what they do and how they benefit APIs like Cos/Sin, etc.

I've included a description, mostly taken from the architecture manual. But, I don't really know much about this group at all, so can't say how/why you would use them. Curiously, the ACLE document avoids a description and just gives a link to the architecture manual.

tannergooding commented 9 months ago

> FADDA is the only version of add across for fp

Ah, I was probably misremembering then. We implement the cross platform Sum, for floating-point, using pairwise logic, because it reliably exists across platforms. Naturally the order doesn't matter for integers.
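
The difference between the two orderings is observable for floating-point inputs. The following sketch contrasts a strictly sequential accumulation (the order described for FADDA earlier in the thread) with a pairwise reduction. Both helpers are hypothetical scalar models over spans, not the actual `AddSequentialAcross` or `Sum` implementations.

```csharp
using System;

// Hypothetical scalar models of the two reduction orders.
static class SumOrderModel
{
    // Sequential: ((initial + e0) + e1) + ... + eN.
    public static float AddSequential(float initial, ReadOnlySpan<float> values)
    {
        float acc = initial;
        foreach (float v in values)
            acc += v;
        return acc;
    }

    // Pairwise: recursively split in halves, e.g. (e0 + e1) + (e2 + e3).
    public static float AddPairwise(ReadOnlySpan<float> values)
    {
        if (values.Length == 0) return 0f;
        if (values.Length == 1) return values[0];
        int half = values.Length / 2;
        return AddPairwise(values[..half]) + AddPairwise(values[half..]);
    }
}
```

With the inputs `{ 1e8f, 1f, -1e8f, 1f }` the sequential order yields 1 while the pairwise order yields 0, because each `1f` is absorbed when added to a value of magnitude 1e8 in single precision. That is why the ordering guarantee matters for floating-point but not for integers.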

> Note, I've used ConvertToSignedInt32() etc for the signed variants.

Int32 (signed) vs UInt32 (unsigned) would be the "preferred" terminology here. We have the following in .NET:

| Bitwidth | Signed Type | Unsigned Type |
| --- | --- | --- |
| 8 | SByte | Byte |
| 16 | Int16 | UInt16 |
| 32 | Int32 | UInt32 |
| 64 | Int64 | UInt64 |
| Ptr | NInt | NUInt |
| 128 | Int128 | UInt128 |

We then for floating-point have (which would necessitate ConvertToFloat becoming ConvertToSingle):

| Bitwidth | Type |
| --- | --- |
| 16 | Half |
| 32 | Single |
| 64 | Double |

a74nh commented 9 months ago

> Int32 (signed) vs UInt32 (unsigned) would be the "preferred" terminology here. We have the following in .NET:

Updated to use this.

a74nh commented 8 months ago

Some of these methods could cause a floating point exception. Should a conditionalSelect() with a relevant mask cause the exception to not happen?

tannergooding commented 8 months ago

.NET doesn't support IEEE 754 floating-point exception handling and it is disabled on startup. Enabling it is undefined behavior.

It is highly unlikely we are to support the feature in the future either, but we would discuss how that works at that time if it were to happen.

I imagine that we would treat it a lot like we do LoadAligned on x86, which is that for perf/efficiency reasons and the most common default, we would allow a slight difference in behavior between T0 (throw) and T1 (non throwing).

a74nh commented 8 months ago

> .NET doesn't support IEEE 754 floating-point exception handling and it is disabled on startup. Enabling it is undefined behavior.
>
> It is highly unlikely we are to support the feature in the future either, but we would discuss how that works at that time if it were to happen.
>
> I imagine that we would treat it a lot like we do LoadAligned on x86, which is that for perf/efficiency reasons and the most common default, we would allow a slight difference in behavior between T0 (throw) and T1 (non throwing).

Ok, so it's still ok to hide the mask for FP then. That simplifies things.

bartonjs commented 6 months ago

Video

namespace System.Runtime.Intrinsics.Arm;

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: fp
{
  /// T: float, double
  public static unsafe Vector<T> AddRotateComplex(Vector<T> left, Vector<T> right, [ConstantExpected] byte rotation); // FCADD // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> AddSequentialAcross(Vector<T> initial, Vector<T> value); // FADDA // predicated

  /// T: [double, float], [double, int], [double, long], [double, uint], [double, ulong]
  public static unsafe Vector<T> ConvertToDouble(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [int, float], [int, double]
  public static unsafe Vector<T> ConvertToInt32(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [long, float], [long, double]
  public static unsafe Vector<T> ConvertToInt64(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [float, double], [float, int], [float, long], [float, uint], [float, ulong]
  public static unsafe Vector<T> ConvertToSingle(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [uint, float], [uint, double]
  public static unsafe Vector<T> ConvertToUInt32(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [ulong, float], [ulong, double]
  public static unsafe Vector<T> ConvertToUInt64(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> FloatingPointExponentialAccelerator(Vector<T2> value); // FEXPA

  /// T: float, double
  public static unsafe Vector<T> MultiplyAddRotateComplex(Vector<T> addend, Vector<T> left, Vector<T> right, [ConstantExpected] byte rotation); // FCMLA // predicated, MOVPRFX

  public static unsafe Vector<float> MultiplyAddRotateComplexBySelectedScalar(Vector<float> addend, Vector<float> left, Vector<float> right, [ConstantExpected] byte rightIndex, [ConstantExpected] byte rotation); // FCMLA // MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalEstimate(Vector<T> value); // FRECPE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalExponent(Vector<T> value); // FRECPX // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtEstimate(Vector<T> value); // FRSQRTE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtStep(Vector<T> left, Vector<T> right); // FRSQRTS

  /// T: float, double
  public static unsafe Vector<T> ReciprocalStep(Vector<T> left, Vector<T> right); // FRECPS

  /// T: float, double
  public static unsafe Vector<T> RoundAwayFromZero(Vector<T> value); // FRINTA // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNearest(Vector<T> value); // FRINTN // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNegativeInfinity(Vector<T> value); // FRINTM // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToPositiveInfinity(Vector<T> value); // FRINTP // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToZero(Vector<T> value); // FRINTZ // predicated, MOVPRFX

  /// T: [float, int], [double, long]
  public static unsafe Vector<T> Scale(Vector<T> left, Vector<T2> right); // FSCALE // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> Sqrt(Vector<T> value); // FSQRT // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> TrigonometricMultiplyAddCoefficient(Vector<T> left, Vector<T> right, [ConstantExpected] byte control); // FTMAD // MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricSelectCoefficient(Vector<T> value, Vector<T2> selector); // FTSSEL

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricStartingValue(Vector<T> value, Vector<T2> sign); // FTSMUL
}

ghost commented 5 months ago

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics See info in area-owners.md if you want to be subscribed.

Issue Details
```csharp
namespace System.Runtime.Intrinsics.Arm

/// VectorT Summary
public abstract class Sve : AdvSimd /// Feature: FEAT_SVE  Category: fp
{

  /// T: float, double
  public static unsafe Vector<T> AddRotateComplex(Vector<T> op1, Vector<T> op2, ulong imm_rotation); // FCADD // predicated, MOVPRFX

  /// T: float, double
  public static unsafe T AddSequentialAcross(T initial, Vector<T> op); // FADDA // predicated

  /// T: [double, float], [double, int], [double, long], [double, uint], [double, ulong]
  public static unsafe Vector<T> ConvertToDouble(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [int, float], [int, double]
  public static unsafe Vector<T> ConvertToInt32(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [long, float], [long, double]
  public static unsafe Vector<T> ConvertToInt64(Vector<T2> value); // FCVTZS // predicated, MOVPRFX

  /// T: [float, double], [float, int], [float, long], [float, uint], [float, ulong]
  public static unsafe Vector<T> ConvertToSingle(Vector<T2> value); // FCVT or SCVTF or UCVTF // predicated, MOVPRFX

  /// T: [uint, float], [uint, double]
  public static unsafe Vector<T> ConvertToUInt32(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [ulong, float], [ulong, double]
  public static unsafe Vector<T> ConvertToUInt64(Vector<T2> value); // FCVTZU // predicated, MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> FloatingPointExponentialAccelerator(Vector<T2> value); // FEXPA

  /// T: float, double
  public static unsafe Vector<T> MultiplyAddRotateComplex(Vector<T> op1, Vector<T> op2, Vector<T> op3, ulong imm_rotation); // FCMLA // predicated, MOVPRFX

  public static unsafe Vector<float> MultiplyAddRotateComplex(Vector<float> op1, Vector<float> op2, Vector<float> op3, ulong imm_index, ulong imm_rotation); // FCMLA // MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalEstimate(Vector<T> value); // FRECPE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalExponent(Vector<T> value); // FRECPX // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtEstimate(Vector<T> value); // FRSQRTE

  /// T: float, double
  public static unsafe Vector<T> ReciprocalSqrtStep(Vector<T> left, Vector<T> right); // FRSQRTS

  /// T: float, double
  public static unsafe Vector<T> ReciprocalStep(Vector<T> left, Vector<T> right); // FRECPS

  /// T: float, double
  public static unsafe Vector<T> RoundAwayFromZero(Vector<T> value); // FRINTA // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNearest(Vector<T> value); // FRINTN // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToNegativeInfinity(Vector<T> value); // FRINTM // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToPositiveInfinity(Vector<T> value); // FRINTP // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> RoundToZero(Vector<T> value); // FRINTZ // predicated, MOVPRFX

  /// T: [float, int], [double, long]
  public static unsafe Vector<T> Scale(Vector<T> left, Vector<T2> right); // FSCALE // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> Sqrt(Vector<T> value); // FSQRT // predicated, MOVPRFX

  /// T: float, double
  public static unsafe Vector<T> TrigonometricMultiplyAddCoefficient(Vector<T> op1, Vector<T> op2, ulong imm3); // FTMAD // MOVPRFX

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricSelectCoefficient(Vector<T> left, Vector<T2> right); // FTSSEL

  /// T: [float, uint], [double, ulong]
  public static unsafe Vector<T> TrigonometricStartingValue(Vector<T> left, Vector<T2> right); // FTSMUL

  /// total method signatures: 26

  /// Optional Entries:

  public static unsafe Vector<float> Scale(Vector<float> left, int right); // FSCALE // predicated, MOVPRFX

  public static unsafe Vector<double> Scale(Vector<double> left, long right); // FSCALE // predicated, MOVPRFX

  /// total optional method signatures: 2

}
```

### Details

**TrigonometricMultiplyAddCoefficient**

Floating-point trigonometric multiply-add coefficient

Calculates the series terms for either sin(x) or cos(x), where the argument x has been adjusted to be in the range -π/4 < x ≤ π/4.

To calculate the series terms of sin(x) and cos(x) the initial source operands should be zero in the first source vector and x² in the second source vector. The operation is then executed eight times to calculate the sum of eight series terms, which gives a result of sufficient precision.

The method multiplies each element of the first source vector by the absolute value of the corresponding element of the second source vector and performs a fused addition of each product with a value obtained from a table of hard-wired coefficients, and places the results destructively in the first source vector.

The coefficients are different for sin(x) and cos(x), and are selected by a combination of the sign bit in the second source element and an immediate index in the range 0 to 7. See https://docsmirror.github.io/A64/2023-06/ftmad_z_zzi.html for the full coefficient tables.

**TrigonometricSelectCoefficient**

Floating-point trigonometric select coefficient

Selects the coefficient for the final multiplication in the polynomial series approximation. The instruction places the value 1.0 or a copy of the first source vector element in the destination element, depending on bit 0 of the quadrant number q held in the corresponding element of the second source vector. The sign bit of the destination element is copied from bit 1 of the corresponding value of q.

To compute sin(x) or cos(x) the instruction is executed with elements of the first source vector set to x, adjusted to be in the range -π/4 < x ≤ π/4.

The elements of the second source vector hold the corresponding value of the quadrant q number as an integer not a floating-point value. The value q satisfies the relationship (2q-1) × π/4 < x ≤ (2q+1) × π/4.

**TrigonometricStartingValue**

Floating-point trigonometric starting value

Calculates the initial value for `TrigonometricMultiplyAddCoefficient`. The method squares each element in the first source vector and then sets the sign bit to a copy of bit 0 of the corresponding element in the second source register, and places the results in the destination vector.

To compute sin(x) or cos(x) the instruction is executed with elements of the first source vector set to x, adjusted to be in the range -π/4 < x ≤ π/4.

The elements of the second source vector hold the corresponding value of the quadrant q number as an integer not a floating-point value. The value q satisfies the relationship (2q-1) × π/4 < x ≤ (2q+1) × π/4.
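
As a small illustration of the `TrigonometricStartingValue` description, the sketch below models a single lane: it squares x and then forces the sign bit to bit 0 of the quadrant number q. The `TrigModel` class is hypothetical, it follows only the wording above, and it deliberately ignores NaN and other special-case handling that the real FTSMUL instruction may perform.

```csharp
using System;

// Hypothetical single-lane model of the FTSMUL description; special cases are not handled.
static class TrigModel
{
    public static float StartingValue(float x, uint q)
    {
        float squared = x * x;

        // Clear the sign bit of the square, then copy bit 0 of q into it.
        int bits = BitConverter.SingleToInt32Bits(squared) & 0x7FFFFFFF;
        if ((q & 1) != 0)
            bits |= unchecked((int)0x80000000);

        return BitConverter.Int32BitsToSingle(bits);
    }
}
```

For x = 0.5 this gives -0.25 when q is odd and 0.25 when q is even, so the sign of the starting value carries the low bit of the quadrant into the later `TrigonometricMultiplyAddCoefficient` steps.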
Author: a74nh
Assignees: -
Labels: `api-approved`, `area-System.Runtime.Intrinsics`
Milestone: 9.0.0

amanasifkhalid commented 2 weeks ago

For AddRotateComplex, the two valid rotation amounts are 90 and 270, the latter of which is too large to fit in a byte. Can we change rotation's type to ushort?

tannergooding commented 2 weeks ago

> Can we change rotation's type to ushort?

This is fine for the .NET 9 preview since these APIs are all [Experimental]. We should probably take this API in particular back to API review and determine if it should instead be some enum or kept as an opaque control byte.

This is an interesting case because normally the byte control is used to represent what the assembly uses which is itself often a direct correlation to the underlying bits encoded. The actual instruction basically just encodes 0 (meaning 90) or 1 (meaning 270) but the actual assembly uses #90 and #270, translating them to the underlying bits instead, hence the disconnect here.

tannergooding commented 2 weeks ago

The alternative would be to keep it as byte, to restrict the range to be 0 or 1 and to treat them as the actual encoding does, which is overall simpler for the JIT to handle as well, notably. -- This would also, itself, be the most compatible with exposing an enum variant that uses friendly names for 90 vs 270, so it may be the better option

amanasifkhalid commented 2 weeks ago

> The alternative would be to keep it as byte, to restrict the range to be 0 or 1 and to treat them as the actual encoding does, which is overall simpler for the JIT to handle as well, notably.

I agree this would simplify implementation quite a bit; we currently move back and forth between the bit-level and angle representation in the emitter, so it would be nice to make this consistent. Also, passing 0 or 1 means the JIT's logic for emitting a jump table when rotation isn't constant would work as-is.

I think exposing some enum mapped to 0/1 is the best approach, too. Would we have to wait for another API review to do that if these are experimental?

tannergooding commented 2 weeks ago

For exposing an enum, it'd be better to go through API review. Even though it's experimental, we still want it in the "mostly right" shape.

Going from byte->ushort under the premise that we need to encode 90 or 270 and the latter won't fit in byte is straightforward.

Keeping it as byte and opting to encode the raw control bits (as most other intrinsics already do) is also straightforward and fits cleanly with the model of exposing a friendly enum in the future.

Exposing an enum, by contrast, is an overall riskier change and comes with higher-level naming considerations. It also raises the question of whether we need to provide additional enums for similar APIs, and how to rationalize the enum between APIs like AddRotateComplex, which only take 90/270, and the other *Complex APIs, which take 0/90/180/270.


Given all the above and how most hwintrinsics already exist, my preference here would be to keep it as byte and document that it is expected to be 0 or 1 directly matching the underlying instruction encoding. This is the most consistent, the easiest for the JIT to handle, is easily handled by users via named constants if they desire, and best fits a growth story into enums if that ends up being the direction we opt to go.
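
A sketch of the named-constants idea, under the assumption stated above that 0 encodes a 90 degree rotation and 1 encodes 270: the `ComplexRotation` class and its members are hypothetical user-side helpers, and the reference method models one complex element with `System.Numerics.Complex` rather than calling any SVE API.

```csharp
using System;
using System.Numerics;

// Hypothetical user-side helper; the constants mirror the 0/1 encoding discussed above.
static class ComplexRotation
{
    public const byte Rotate90 = 0;
    public const byte Rotate270 = 1;

    // Scalar model of one complex element: add `right` rotated by 90 or 270 degrees.
    public static Complex AddRotateReference(Complex left, Complex right, byte rotation) => rotation switch
    {
        Rotate90 => left + right * Complex.ImaginaryOne,  // rotating by 90 is multiplying by i
        Rotate270 => left - right * Complex.ImaginaryOne, // rotating by 270 is multiplying by -i
        _ => throw new ArgumentOutOfRangeException(nameof(rotation), "Expected 0 (90) or 1 (270)."),
    };
}
```

A caller would then pass `ComplexRotation.Rotate270` instead of a bare `1`, which keeps the argument a constant `byte` while remaining readable. If an enum is introduced later, constants like these would map onto it directly.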

amanasifkhalid commented 2 weeks ago

Thank you for explaining!

> Given all the above and how most hwintrinsics already exist, my preference here would be to keep it as byte and document that it is expected to be 0 or 1 directly matching the underlying instruction encoding. This is the most consistent, the easiest for the JIT to handle, is easily handled by users via named constants if they desire, and best fits a growth story into enums if that ends up being the direction we opt to go.

Agreed. I'll go with this approach for now.

kunalspathak commented 2 weeks ago

Do we need a tracking issue for this or will it be on the agenda of API review committee?