Open easyaspi314 opened 1 year ago
@easyaspi314 Hi, we're looking into implementing Helium/MVE (ARMv8.1-M M-profile Vector Extension), at least parts of it at first and eventually the whole extension. Your idea looks really helpful here, since a huge part of this extension overlaps with the ARMv8 A-profile Neon intrinsics. It could share both the implementation and the tests with the Neon implementation. Would you be interested in providing guidance and review for this part of the task? My initial idea was to add an ACLE (Arm C Language Extensions) directory for the common Neon and Helium parts, but seeing that there is more potential for sharing code with other ISAs, it could be done that way instead. My only concern about this approach is how the unit tests would look if we went this route. It's pretty important for us to be able to test all code paths no matter the architecture. Our main goal is to share code between ARMv8-A/ARMv8-M and x86-64 to the degree possible, but we're aware our code base might be used on RISC-V/ARMv9 (SVE) too.
One thing about SIMD implementations is that there is often a direct equivalent of each intrinsic. Basically every instruction set has add, sub, shift right, etc.
Therefore, there could be a "common" folder containing the more common intrinsics. Then, we can just reuse this in the platform-specific intrinsic polyfills and avoid any copy-paste errors/missed optimizations.
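To illustrate the "common" folder idea, here is a minimal sketch in portable C. Every name in it (`common_int32x4_t`, `common_add_i32x4`, `polyfill_vaddq_s32`, `polyfill_mm_add_epi32`) is hypothetical and not taken from the existing code base. It only shows the shape: one shared lane-wise add, with the NEON-style and SSE-style polyfills both forwarding to it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical NEON-style 128-bit vector type: four 32-bit lanes. */
typedef struct { int32_t lanes[4]; } common_int32x4_t;

/* Hypothetical shared lane-wise add that would live in the "common" folder. */
static common_int32x4_t common_add_i32x4(common_int32x4_t a, common_int32x4_t b)
{
    common_int32x4_t r;
    for (size_t i = 0; i < 4; i++) {
        /* Add as unsigned so overflow wraps instead of being undefined. */
        r.lanes[i] = (int32_t)((uint32_t)a.lanes[i] + (uint32_t)b.lanes[i]);
    }
    return r;
}

/* NEON-style polyfill (stand-in for vaddq_s32): forwards to the shared code. */
static common_int32x4_t polyfill_vaddq_s32(common_int32x4_t a, common_int32x4_t b)
{
    return common_add_i32x4(a, b);
}

/* SSE2-style polyfill (stand-in for _mm_add_epi32): same shared code. */
static common_int32x4_t polyfill_mm_add_epi32(common_int32x4_t a, common_int32x4_t b)
{
    return common_add_i32x4(a, b);
}
```

Because both polyfills reduce to the same function, a fix or optimization made once in the shared helper would reach every platform wrapper that uses it.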
I would just use an extension of NEON types since NEON has the strongest type system.
This also lets us elegantly handle differing native vector sizes by divide and conquer:
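A rough sketch of that divide-and-conquer approach, in portable C, might look like the following. The type and function names (`sketch_i32x4`, `sketch_i32x8`, `sketch_add_i32x4`, `sketch_add_i32x8`) are assumptions made up for illustration. A 256-bit vector is modelled as two 128-bit halves, and the wide operation simply applies the narrow one to each half.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit vector: four 32-bit lanes. */
typedef struct { int32_t lanes[4]; } sketch_i32x4;

/* Hypothetical 256-bit vector, represented as two 128-bit halves. */
typedef struct { sketch_i32x4 lo, hi; } sketch_i32x8;

/* Base case: the operation at the platform's native width. */
static sketch_i32x4 sketch_add_i32x4(sketch_i32x4 a, sketch_i32x4 b)
{
    sketch_i32x4 r;
    for (size_t i = 0; i < 4; i++) {
        /* Unsigned arithmetic gives wrapping behavior on overflow. */
        r.lanes[i] = (int32_t)((uint32_t)a.lanes[i] + (uint32_t)b.lanes[i]);
    }
    return r;
}

/* Wider case: split into halves and reuse the 128-bit implementation. */
static sketch_i32x8 sketch_add_i32x8(sketch_i32x8 a, sketch_i32x8 b)
{
    sketch_i32x8 r;
    r.lo = sketch_add_i32x4(a.lo, b.lo);
    r.hi = sketch_add_i32x4(a.hi, b.hi);
    return r;
}
```

On a target with native 256-bit vectors, the wide function could instead be a single instruction, and the same split would apply one level up for 512-bit types.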
I am mostly proposing this because removing MMX will need a massive rewrite anyway, so if any large changes are to be made, this would be the best time to do it, and we might as well try to reap the benefits of widening 64-bit vectors on all 128-bit-only platforms.
Obviously this can be a gradual change.