Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖
Following the discussion in #137, it would be great to reach some uniformity in feature detection on x86 and Arm. On Arm, we can't yet use SVE, and we only differentiate between "extended NEON" and serial code. Assuming broader adoption of Arm devices, we need to isolate the features we use in the different sub-generations of Armv8 and consider the bit-level operations from SVE.
https://github.com/ashvardanian/SimSIMD/blob/18d17686124ddebd9fe55eee56b2e0273a613d4b/include/simsimd/simsimd.h#L208-L228
In other libraries, like SimSIMD, I currently use the Linux API to check for those capabilities. But that is less portable than inline assembly, and we may need to detect those features on the upcoming Apple M-series chips.
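For illustration, here is a minimal sketch of what a uniform runtime check could look like. It is not the code from SimSIMD or StringZilla: every `example_*` name is hypothetical. It assumes a GCC- or Clang-compatible compiler for `__builtin_cpu_supports` on x86, and assumes that `<sys/auxv.h>` exposes the `HWCAP_*` bits on 64-bit Arm Linux. On other Arm targets, such as Apple silicon, it only falls back to the compile-time `__ARM_NEON` macro, so that branch is a placeholder rather than real runtime detection.

```c
#include <stdio.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h> // getauxval, AT_HWCAP, and usually the HWCAP_* bits
#endif

// Hypothetical capability flags, named for this example only.
enum {
    example_cap_serial_k = 0,
    example_cap_neon_k = 1 << 0,
    example_cap_sve_k = 1 << 1,
    example_cap_avx2_k = 1 << 2,
    example_cap_avx512f_k = 1 << 3
};

// Returns a bitmask of the capabilities found at runtime.
// A result of zero means "serial code only".
static unsigned example_detect_capabilities(void) {
    unsigned caps = example_cap_serial_k;

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // x86: the compiler builtins wrap CPUID for us.
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) caps |= example_cap_avx2_k;
    if (__builtin_cpu_supports("avx512f")) caps |= example_cap_avx512f_k;

#elif defined(__linux__) && defined(__aarch64__)
    // Arm on Linux: the kernel reports features through the auxiliary vector.
    unsigned long hwcap = getauxval(AT_HWCAP);
#if defined(HWCAP_ASIMD)
    if (hwcap & HWCAP_ASIMD) caps |= example_cap_neon_k;
#endif
#if defined(HWCAP_SVE)
    if (hwcap & HWCAP_SVE) caps |= example_cap_sve_k;
#endif

#elif defined(__aarch64__) || defined(__arm__)
    // Other Arm targets (e.g. Apple silicon): compile-time fallback only.
    // Assumption: a real implementation would query the OS here instead.
#if defined(__ARM_NEON)
    caps |= example_cap_neon_k;
#endif
#endif

    return caps;
}

// Prints the detected capabilities in a human-readable form.
static void example_print_capabilities(unsigned caps) {
    printf("serial: yes\n");
    printf("neon: %s\n", (caps & example_cap_neon_k) ? "yes" : "no");
    printf("sve: %s\n", (caps & example_cap_sve_k) ? "yes" : "no");
    printf("avx2: %s\n", (caps & example_cap_avx2_k) ? "yes" : "no");
    printf("avx512f: %s\n", (caps & example_cap_avx512f_k) ? "yes" : "no");
}
```

A dispatcher could call `example_detect_capabilities()` once at startup, cache the mask, and pick a backend by testing bits such as `caps & example_cap_sve_k`. `example_print_capabilities(caps)` is only there for debugging. The open question from above remains: the Linux branch does not carry over to Apple devices, which is where an inline-assembly or OS-specific query would have to slot in.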