Closed echesakov closed 6 months ago
Tagging subscribers to this area: @JulieLeeMSFT See info in area-owners.md if you want to be subscribed.
Author: | echesakovMSFT |
---|---|
Assignees: | - |
Labels: | `arch-arm64`, `area-CodeGen-coreclr`, `User Story` |
Milestone: | - |
Will it fix `div` on x64 returning result and mod in two registers? 🙂
Not this work item, which I tend to think of as "LSRA allocating spans of registers" work.
However, #64864 should lay the foundation for `DivRem` (or `MultiplyNoFlags2`) intrinsics that return pairs of values. #64857 is another relevant piece, needed to avoid unnecessary `mov`s with multi-reg intrinsics.
In #66551, I actually made multi-reg `DivRem` work on x64. @EgorBo, you can check whether any work remains.
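For readers unfamiliar with the `DivRem` shape discussed above, here is a minimal C sketch of an operation that returns the quotient and remainder as a pair from a single division. The names `divrem_result` and `divrem_i32` are hypothetical and chosen for illustration; this is not the .NET API or the JIT implementation. On x64, one `div`/`idiv` instruction leaves the quotient in `RAX` and the remainder in `RDX`, which is why a multi-reg return fits naturally.

```c
#include <stdlib.h>

/* Hypothetical pair type for illustration only (not a .NET or JIT type). */
typedef struct {
    int quotient;
    int remainder;
} divrem_result;

/* Returns both results of one truncating division.
 * The standard div() computes quotient and remainder together,
 * mirroring how a single x64 idiv yields both values. */
static divrem_result divrem_i32(int left, int right)
{
    div_t d = div(left, right);
    divrem_result r = { d.quot, d.rem };
    return r;
}
```

Calling `divrem_i32(17, 5)` yields quotient 3 and remainder 2. With C's truncating division, `divrem_i32(-17, 5)` yields quotient -3 and remainder -2.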
Moved to .NET 8.
We will continue working on https://github.com/dotnet/runtime/issues/84510 in .NET 9.
This is completed.
Overview
We achieved parity with x64 for Arm64 intrinsics support in .NET 5 for most intrinsics, with the exception of multi-register intrinsics. More work is needed to enable multi-register intrinsics for Arm64. The work is cross-cutting: it involves changes in the JIT, the libraries, and Mono before the intrinsics work end to end.
Work Items
`LoadPairVector64/128` in the libraries (see comment)
`V0-V2` (note that this is different from the `LoadPairVector64/128`, which returns its result in two independent SIMD registers). https://github.com/dotnet/runtime/issues/39457. Implemented in #80297.
`LoadVector` and `StoreVector` APIs on Arm64 (https://github.com/dotnet/runtime/issues/84510)
`LoadVector` and `StoreVector` APIs on Arm64 (such as ones that will expose `LD[1-4]`, `ST[1-4]` instructions)
`TBL`, `TBX` instructions)
Follow-up (after the JIT work is completed)
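To make the register requirement behind these work items concrete, the following C sketch gives a scalar model of two of the instructions named above: a two-register `LD2` de-interleaving load and a `TBL` lookup over a two-register table. On Arm64 both forms take a list of consecutive SIMD registers, which is what the allocator has to provide. All type and function names here (`vec128_u8`, `vec128_u8_x2`, `model_ld2_u8`, `model_tbl2_u8`) are hypothetical and exist only for this illustration; they are not the .NET APIs or JIT code.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* 16 x uint8_t models one 128-bit SIMD register */

/* Hypothetical model types: one register, and a pair of consecutive registers. */
typedef struct { uint8_t lane[LANES]; } vec128_u8;
typedef struct { vec128_u8 val[2]; } vec128_u8_x2;

/* Scalar model of LD2 {Vt.16B, Vt2.16B}, [Xn]:
 * reads 32 interleaved bytes; even-indexed bytes go to the first
 * register, odd-indexed bytes to the second. */
static vec128_u8_x2 model_ld2_u8(const uint8_t *src)
{
    vec128_u8_x2 r;
    for (size_t i = 0; i < LANES; i++) {
        r.val[0].lane[i] = src[2 * i];
        r.val[1].lane[i] = src[2 * i + 1];
    }
    return r;
}

/* Scalar model of TBL Vd.16B, {Vn.16B, Vn+1.16B}, Vm.16B:
 * each index selects a byte from the 32-byte table formed by the two
 * registers; an out-of-range index produces 0. (TBX differs in that it
 * would leave the destination lane unchanged instead.) */
static vec128_u8 model_tbl2_u8(vec128_u8_x2 table, vec128_u8 idx)
{
    vec128_u8 r;
    for (size_t i = 0; i < LANES; i++) {
        uint8_t k = idx.lane[i];
        r.lane[i] = (k < 2 * LANES) ? table.val[k / LANES].lane[k % LANES] : 0;
    }
    return r;
}
```

In both models the pair travels as one value (`vec128_u8_x2`). The hardware encodes only the first register of the list and implies the rest, so an allocator cannot place the two halves in arbitrary registers.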
Benchmarks to use
category:cq theme:register-allocator skill-level:expert cost:medium impact:medium