Open kunalspathak opened 3 months ago
Recommendation for how to implement. Examples of this can be found in 100134
Copy/paste contents from files in https://github.com/a74nh/runtime/tree/api_github/sve_api/out_cs_api/ . There should be no need to edit these changes. Keep alphabetical ordering.
The same files have been given additional annotation and can be found in https://github.com/a74nh/runtime/tree/api_github/sve_api/out_helper_api . These are for development use only and are not for committing.
Copy/paste from https://github.com/a74nh/runtime/blob/api_github/sve_api/out_hwintrinsiclistarm64sve.h . Entries with multiple instructions for a single type will need fixing via a special code path. The flags and category columns will probably need fixing manually. Flags that are not automatically detected:
- HW_Flag_LowMaskedOperation: The predicate in arg1 is restricted to 0-7.
- HW_Flag_HasRMWSemantics: src1 and dest use the same register.
- HW_Flag_EmbeddedMaskedOperation: APIs that have only a "predicated" version. These APIs are converted into ConditionalSelect(AllTrue, CALL_API(operands...), Zero) to get the effect of "predicate" registers. E.g. Abs, Divide.
- HW_Flag_OptionalEmbeddedMaskedOperation: APIs that have both "predicated" and "unpredicated" versions. These APIs can be used standalone, in which case the "unpredicated" version of the instruction is generated. They can also be wrapped in ConditionalSelect in user code, in which case the "predicated" version of the instruction is emitted. E.g. Add, Multiply, etc.
- HW_Flag_ExplicitMaskedOperation: These APIs take "mask" explicitly as the first argument. E.g. ConditionalSelect.
- HW_Flag_Scalable: All APIs have this flag to identify that they operate on scalable vector length.
For any special case where there is no flag, you have options:
- HW_Flag_SpecialCodeGen and add a new case to CodeGen::genHWIntrinsic().
- HW_Category_Special and add a new case to Compiler::impSpecialIntrinsic().
- HW_Flag_SpecialCodeGen and HW_Category_Special.
Copy/paste from https://raw.githubusercontent.com/a74nh/runtime/api_github/sve_api/out_GenerateHWIntrinsicTests_Arm.cs . Rename the template (first column) to a more generic template. We want as few new templates as possible. Existing AdvSimd templates can be copied and then edited to include extra Sve parts. The ValidateIterResult and NextValueOpN entries will need editing to fit the template. Use existing entries as a guide.
On Linux, tests can be built using:
rm -fr ./artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/
./src/tests/build.sh checked -test:JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro.csproj
Tests can then be run:
./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.sh
Generated C# files are in artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/
A lot of tests will be run. To make life easier, run the .dll directly and pass it the name of the test (a substring will do). E.g.:
$CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
$CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve
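The single-test invocation above can be wrapped in a small helper so the long path is typed only once. This is a minimal sketch, not part of the repo: `run_sve_test`, `TEST_DLL` and `DRY_RUN` are hypothetical names, and it assumes `CORE_ROOT` is already set and that you run it from the repo root.

```shell
#!/bin/sh
# Hypothetical helper (not in the repo): run one test by name substring.
# TEST_DLL mirrors the Linux Checked path used in the commands above.
TEST_DLL=./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll

run_sve_test() {
    # $1: substring of the test name, e.g. Sve_Add_uint
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command that would be run.
        echo "$CORE_ROOT/corerun $TEST_DLL $1"
    else
        "$CORE_ROOT/corerun" "$TEST_DLL" "$1"
    fi
}
```

With this sourced, `run_sve_test Sve_Add_uint` runs one test, and `DRY_RUN=1 run_sve_test Sve` shows the command for the whole Sve group without executing it.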
On Windows, tests can be built using:
del /F /S /Q repo\artifacts\tests\coreclr\obj\windows.arm64.Release\Managed\JIT\HardwareIntrinsics\Arm\Sve\
pushd repo\src\tests\
build.cmd Release -test JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_r.csproj /p:TargetArchitecture=arm64
build.cmd Release -test JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro.csproj /p:TargetArchitecture=arm64
Tests can then be run:
pushd repo\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_r
HardwareIntrinsics_Arm_r.cmd
Generated C# files are in artifacts\tests\coreclr\obj\windows.arm64.Release\Managed\JIT\HardwareIntrinsics\Arm\Sve\Sve_ro\Sve_ro\gen\
A lot of tests will be run. To make life easier, run the .dll directly and pass it the name of the test (a substring will do). E.g.:
%CORE_ROOT%\corerun .\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro\HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
%CORE_ROOT%\corerun .\artifacts\tests\coreclr\windows.arm64.Release\JIT\HardwareIntrinsics\HardwareIntrinsics_Arm_ro\HardwareIntrinsics_Arm_ro.dll Sve
All the testing works as usual using the AltJit* environment variables. The only thing to remember is to set the additional environment variable DOTNET_MaxVectorTBitWidth=128 to avoid hitting the assert assert(size == info.compCompHnd->getClassSize(typeHnd));
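As a rough illustration, an AltJit session might set its environment like this. Only `DOTNET_MaxVectorTBitWidth=128` comes from the note above. The `DOTNET_AltJit` and `DOTNET_AltJitName` values are assumptions about a typical cross-JIT setup, and the library name in particular depends on your host OS and architecture.

```shell
#!/bin/sh
# Assumption: apply the alternate JIT to every method.
export DOTNET_AltJit='*'
# Assumption: name of the arm64 cross-JIT on an x64 Linux host; adjust for your platform.
export DOTNET_AltJitName=libclrjit_universal_arm64_x64.so
# From the note above: avoids the getClassSize assert.
export DOTNET_MaxVectorTBitWidth=128
```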
All the tests should be run using all the various stress modes. https://github.com/a74nh/runtime/blob/api_github/sve_api/stress_tester.py is used to run your test in the various modes. Pass it the full command line for running your test. Eg:
stress_tester.py $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
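To make the idea concrete, running one command under several stress modes could look like the sketch below. It is not a reimplementation of stress_tester.py: `run_with_stress` is a hypothetical name, and the mode list is an assumed small subset of the JIT stress knobs (`DOTNET_TieredCompilation`, `DOTNET_JitStress`, `DOTNET_JitStressRegs`). The script itself may use a different or longer list.

```shell
#!/bin/sh
# Hypothetical sketch: run the given command once per stress setting.
run_with_stress() {
    # Assumed subset of modes; each entry is VAR=value.
    for mode in \
        "DOTNET_TieredCompilation=0" \
        "DOTNET_JitStress=1" \
        "DOTNET_JitStress=2" \
        "DOTNET_JitStressRegs=1" \
        "DOTNET_JitStressRegs=8"
    do
        echo "== $mode =="
        # env sets the variable for this one invocation only.
        env "$mode" "$@" || return 1
    done
}
```

Usage mirrors the script: pass the full test command line, for example `run_with_stress $CORE_ROOT/corerun <path to HardwareIntrinsics_Arm_ro.dll> Sve_Add_uint`. The loop stops at the first failing mode so the failing setting is the last header printed.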
- Once the .cs files have been created, you can edit them manually and rebuild. Copy the changes back to the template once the test works. This can save time fiddling with template params.
- A mask is a vector<T>, this means you can treat it like a normal vector and iterate through it, set values etc. A mask should only contain the values 0 or 1. Within the jit it will be converted to/from a vector of boolean values so that they can be placed in the SVE predicate registers (p0 to p15).
For choosing APIs.
Sve Maths
methods).
HW_Flag_EmbeddedMaskedOperation
in out_hwintrinsiclistarm64sve.h
As the testing grows, it will become increasingly difficult to test just a single API. This is OK during CI, but painful during development and bug fixing.
I recommend someone writes a patch so that a testname can be passed in as an argument so that only that test will run. Eg:
HardwareIntrinsics_Arm_ro.sh Sve.Add.uint
I agree. I have asked @TIHan to come up with a design for this. @TIHan - any update on this?
I have not looked at this yet, but can this week.
Just noting such support should already exist if you invoke the underlying dll directly; this may just be something missing from the .sh file.
The exact argument that matches a filter may be a bit different due to it now using the underlying xunit filtering mechanic, but it should largely just work.
You can then see some of the logic that gets setup via https://github.com/dotnet/runtime/blob/main/src/tests/Common/XUnitWrapperGenerator/XUnitWrapperGenerator.cs and the corresponding logic of how the test filtering works here: https://github.com/dotnet/runtime/blob/main/src/tests/Common/XUnitWrapperLibrary/TestFilter.cs
The actual filter is constructed like:
System.Collections.Generic.Dictionary<string, string> testExclusionTable = XUnitWrapperLibrary.TestFilter.LoadTestExclusionTable();
XUnitWrapperLibrary.TestFilter filter = new (args, testExclusionTable);
A given TestExecutor then uses it like:
void TestExecutor1(System.IO.StreamWriter tempLogSw, System.IO.StreamWriter statsCsvSw)
{
    if (filter is null || filter.ShouldRunTest(@"JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble", "_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()"))
    {
        System.TimeSpan testStart = stopwatch.Elapsed;
        try
        {
            summary.ReportStartingTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", System.Console.Out);
            outputRecorder.ResetTestOutput();
            _AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble();
            summary.ReportPassedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", stopwatch.Elapsed - testStart, outputRecorder.GetTestOutput(), System.Console.Out, tempLogSw, statsCsvSw);
        }
        catch (System.Exception ex)
        {
            summary.ReportFailedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", stopwatch.Elapsed - testStart, ex, outputRecorder.GetTestOutput(), System.Console.Out, tempLogSw, statsCsvSw);
        }
    }
    else
    {
        string reason = filter.GetTestExclusionReason("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()");
        summary.ReportSkippedTest("_AdvSimd_r::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AddDouble()", "JIT.HardwareIntrinsics.Arm._AdvSimd.Program", @"AddDouble", System.TimeSpan.Zero, reason, tempLogSw, statsCsvSw);
    }
}
where ShouldRunTest basically just does a stringToSearch.Contains(filter) check at the most basic level.
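The substring behaviour described above can be mimicked in a few lines of shell, which makes it easier to predict which filter strings will select a test. This is only an illustration of a plain "contains" check. `should_run_test` is a hypothetical name, and the real TestFilter also consults the exclusion table, which is not modelled here.

```shell
#!/bin/sh
# Hypothetical sketch of a "contains" filter: succeed when the filter is a
# substring of the fully qualified test name. An empty filter matches everything.
should_run_test() {
    # $1: fully qualified test name, $2: filter from the command line
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

Under this model, `Sve_Add_uint` and `Sve` both match a name such as `_Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()`, while a dotted spelling like `Sve.Add.uint` does not, because that exact text never appears in the name.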
Just noting such support should already exist if you invoke the underlying dll directly, this may just be something missing from the .sh file.
If that's the case, can you or @Tihan come up with the exact command line that is needed to run a particular case? I don't want engineers to hack around a test to make it work for every API.
It appears to work:
❯ $CORE_ROOT/corerun ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_Add_uint
16:34:55.071 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()
Supported ISAs:
AdvSimd: True
Aes: True
ArmBase: True
Crc32: True
Dp: True
Rdm: True
Sha1: True
Sha256: True
Sve: True
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
16:34:55.177 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_Add_uint()
I'm happy with this as a solution then!
Updated the implementation instructions with stress testing and how to write the tests.
Updated for Windows.
Updated https://github.com/dotnet/runtime/issues/99957#issuecomment-2007408474 with the meanings of the various HWIntrinsicFlag values used in the table.
Now that the encoding of all the SVE instructions is completed in https://github.com/dotnet/runtime/issues/94549, it is time to expose these instructions through .NET APIs. Here is the list of categorized APIs with links to the issue where they were approved.
.NET 9 Goal: We aim to complete SVE APIs in .NET 9. SVE2 APIs will be pushed out to .NET 10.
SVE APIs
Full list
- [x] DuplicateSelectedScalarToVector https://github.com/dotnet/runtime/pull/103228
- [x] ReverseBits https://github.com/dotnet/runtime/pull/103806
- [x] ReverseElement https://github.com/dotnet/runtime/pull/102991
- [x] ReverseElement16 https://github.com/dotnet/runtime/pull/102991
- [x] ReverseElement32 https://github.com/dotnet/runtime/pull/102991
- [x] ReverseElement8 https://github.com/dotnet/runtime/pull/102991
- [x] Splice https://github.com/dotnet/runtime/pull/103567
- [x] TransposeEven https://github.com/dotnet/runtime/pull/103068
- [x] TransposeOdd https://github.com/dotnet/runtime/pull/103068
- [x] UnzipEven https://github.com/dotnet/runtime/pull/101294
- [x] UnzipOdd https://github.com/dotnet/runtime/pull/101294
- [x] VectorTableLookup https://github.com/dotnet/runtime/pull/103989
- [x] ZipHigh #101294
- [x] ZipLow #101294
Full list
- [x] Compute16BitAddresses https://github.com/dotnet/runtime/pull/103040
- [x] Compute32BitAddresses https://github.com/dotnet/runtime/pull/103040
- [x] Compute64BitAddresses https://github.com/dotnet/runtime/pull/103040
- [x] Compute8BitAddresses https://github.com/dotnet/runtime/pull/103040
- [x] LoadVector https://github.com/dotnet/runtime/pull/98218
- [x] LoadVector128AndReplicateToVector https://github.com/dotnet/runtime/pull/103392
- [x] LoadVectorByteNonFaultingZeroExtendToInt16 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteNonFaultingZeroExtendToInt32 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteNonFaultingZeroExtendToInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteNonFaultingZeroExtendToUInt16 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteNonFaultingZeroExtendToUInt32 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteNonFaultingZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorByteZeroExtendToInt16 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorByteZeroExtendToInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorByteZeroExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorByteZeroExtendToUInt16 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorByteZeroExtendToUInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorByteZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt16NonFaultingSignExtendToInt32 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt16NonFaultingSignExtendToInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt16NonFaultingSignExtendToUInt32 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt16NonFaultingSignExtendToUInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt16SignExtendToInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt16SignExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt16SignExtendToUInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt16SignExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt32NonFaultingSignExtendToInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt32NonFaultingSignExtendToUInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorInt32SignExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorInt32SignExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorNonFaulting https://github.com/dotnet/runtime/pull/103392
- [x] LoadVectorNonTemporal https://github.com/dotnet/runtime/pull/103392
- [x] LoadVectorSByteNonFaultingSignExtendToInt16 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteNonFaultingSignExtendToInt32 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteNonFaultingSignExtendToInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteNonFaultingSignExtendToUInt16 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteNonFaultingSignExtendToUInt32 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteNonFaultingSignExtendToUInt64 https://github.com/dotnet/runtime/pull/102903
- [x] LoadVectorSByteSignExtendToInt16 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorSByteSignExtendToInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorSByteSignExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorSByteSignExtendToUInt16 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorSByteSignExtendToUInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorSByteSignExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt16NonFaultingZeroExtendToInt32 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt16NonFaultingZeroExtendToInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt16NonFaultingZeroExtendToUInt32 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt16NonFaultingZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt16ZeroExtendToInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt16ZeroExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt16ZeroExtendToUInt32 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt16ZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt32NonFaultingZeroExtendToInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt32NonFaultingZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/102860
- [x] LoadVectorUInt32ZeroExtendToInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorUInt32ZeroExtendToUInt64 https://github.com/dotnet/runtime/pull/101291
- [x] LoadVectorx2 https://github.com/dotnet/runtime/pull/102180
- [x] LoadVectorx3 https://github.com/dotnet/runtime/pull/102180
- [x] LoadVectorx4 https://github.com/dotnet/runtime/pull/102180
- [x] PrefetchBytes https://github.com/dotnet/runtime/pull/103094
- [x] PrefetchInt16 https://github.com/dotnet/runtime/pull/103094
- [x] PrefetchInt32 https://github.com/dotnet/runtime/pull/103094
- [x] PrefetchInt64 https://github.com/dotnet/runtime/pull/103094
Full list
- [x] Store https://github.com/dotnet/runtime/pull/102262
- [x] StoreNarrowing https://github.com/dotnet/runtime/pull/102605
- [x] StoreNonTemporal https://github.com/dotnet/runtime/pull/102769
Full list
- [x] Abs https://github.com/dotnet/runtime/pull/100743
- [x] AbsoluteDifference https://github.com/dotnet/runtime/pull/102170
- [x] Add https://github.com/dotnet/runtime/pull/100743
- [x] AddAcross https://github.com/dotnet/runtime/pull/101674
- [x] AddSaturate https://github.com/dotnet/runtime/pull/102170
- [x] Divide https://github.com/dotnet/runtime/pull/101578
- [x] DotProduct https://github.com/dotnet/runtime/pull/102218
- [x] DotProductBySelectedScalar https://github.com/dotnet/runtime/pull/102218
- [x] FusedMultiplyAdd https://github.com/dotnet/runtime/pull/102007
- [x] FusedMultiplyAddBySelectedScalar https://github.com/dotnet/runtime/pull/102007
- [x] FusedMultiplyAddNegated https://github.com/dotnet/runtime/pull/102007
- [x] FusedMultiplySubtract https://github.com/dotnet/runtime/pull/102007
- [x] FusedMultiplySubtractBySelectedScalar https://github.com/dotnet/runtime/pull/102007
- [x] FusedMultiplySubtractNegated https://github.com/dotnet/runtime/pull/102007
- [x] Max https://github.com/dotnet/runtime/pull/101859
- [x] MaxAcross https://github.com/dotnet/runtime/pull/101859
- [x] MaxNumber https://github.com/dotnet/runtime/pull/101859
- [x] MaxNumberAcross https://github.com/dotnet/runtime/pull/101859
- [x] Min https://github.com/dotnet/runtime/pull/101859
- [x] MinAcross https://github.com/dotnet/runtime/pull/101859
- [x] MinNumber https://github.com/dotnet/runtime/pull/101859
- [x] MinNumberAcross https://github.com/dotnet/runtime/pull/101859
- [x] Multiply https://github.com/dotnet/runtime/pull/101578
- [x] MultiplyAdd https://github.com/dotnet/runtime/pull/102007
- [x] MultiplyBySelectedScalar https://github.com/dotnet/runtime/pull/102007
- [x] MultiplyExtended https://github.com/dotnet/runtime/pull/102170
- [x] MultiplySubtract https://github.com/dotnet/runtime/pull/102007
- [x] Negate https://github.com/dotnet/runtime/pull/102170
- [x] SignExtend16 https://github.com/dotnet/runtime/pull/101702
- [x] SignExtend32 https://github.com/dotnet/runtime/pull/101702
- [x] SignExtend8 https://github.com/dotnet/runtime/pull/101702
- [x] SignExtendWideningLower https://github.com/dotnet/runtime/pull/101743
- [x] SignExtendWideningUpper https://github.com/dotnet/runtime/pull/101743
- [x] Subtract https://github.com/dotnet/runtime/pull/101578
- [x] SubtractSaturate https://github.com/dotnet/runtime/pull/102170
- [x] ZeroExtend16 https://github.com/dotnet/runtime/pull/101702
- [x] ZeroExtend32 https://github.com/dotnet/runtime/pull/101702
- [x] ZeroExtend8 https://github.com/dotnet/runtime/pull/101702
- [x] ZeroExtendWideningLower https://github.com/dotnet/runtime/pull/101743
- [x] ZeroExtendWideningUpper https://github.com/dotnet/runtime/pull/101743
Full list
- [x] Count16BitElements https://github.com/dotnet/runtime/pull/101188
- [x] Count32BitElements https://github.com/dotnet/runtime/pull/101188
- [x] Count64BitElements https://github.com/dotnet/runtime/pull/101188
- [x] Count8BitElements https://github.com/dotnet/runtime/pull/101188
- [x] GetActiveElementCount https://github.com/dotnet/runtime/pull/102813
- [x] LeadingSignCount https://github.com/dotnet/runtime/pull/102548
- [x] LeadingZeroCount https://github.com/dotnet/runtime/pull/102548
- [x] PopCount https://github.com/dotnet/runtime/pull/102548
- [x] SaturatingDecrementBy16BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingDecrementBy32BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingDecrementBy64BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingDecrementBy8BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingDecrementByActiveElementCount https://github.com/dotnet/runtime/pull/102994
- [x] SaturatingIncrementBy16BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingIncrementBy32BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingIncrementBy64BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingIncrementBy8BitElementCount https://github.com/dotnet/runtime/pull/102315
- [x] SaturatingIncrementByActiveElementCount https://github.com/dotnet/runtime/pull/102994
Full list
- [x] GatherPrefetch16Bit https://github.com/dotnet/runtime/pull/103826
- [x] GatherPrefetch32Bit https://github.com/dotnet/runtime/pull/103826
- [x] GatherPrefetch64Bit https://github.com/dotnet/runtime/pull/103826
- [x] GatherPrefetch8Bit https://github.com/dotnet/runtime/pull/103826
- [x] GatherVector https://github.com/dotnet/runtime/pull/103159
- [x] GatherVectorByteZeroExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorInt16SignExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorInt16WithByteOffsetsSignExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorInt32SignExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorInt32WithByteOffsetsSignExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorSByteSignExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorUInt16WithByteOffsetsZeroExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorUInt16ZeroExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorUInt32WithByteOffsetsZeroExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorUInt32ZeroExtend https://github.com/dotnet/runtime/pull/103370
- [x] GatherVectorWithByteOffsets https://github.com/dotnet/runtime/pull/103564
SVE2 APIs
Sve2 scatterstores
Sve2 maths
Sve2 mask
Sve2 gatherloads
Sve2 fp
Sve2 counting
Sve2 bitwise
Sve2 bitmanipulate
SveBf16
SveF32mm
SveF64mm
SveFp16
SveI8mm
Sha3
Sm4
SveAes
SveBitperm
SveSha3
SveSm4
Credits to @a74nh for populating the list and also some files in https://github.com/a74nh/runtime/tree/api_github/sve_api that will help to implement them.
Contributes to https://github.com/dotnet/runtime/issues/93095