shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0

Integrate ARM-software/optimized-routines into sleef? #542

Open helloguo opened 3 months ago

helloguo commented 3 months ago

https://github.com/ARM-software/optimized-routines/tree/master/math/aarch64 implements some math operations with NEON SIMD instructions. The performance looks good, especially for exp(). I'm wondering if it's possible to integrate ARM-software/optimized-routines into sleef? Thanks!

blapie commented 3 months ago

Hello! Thank you very much for the suggestion. It turns out there is significant overlap between the team developing Arm Optimized Routines (AOR) and the team maintaining SLEEF at the moment.

However, SLEEF and AOR rely on very different design principles, which make it complicated to connect the two libraries.

SLEEF puts emphasis on cross-architecture portability and on features such as reproducibility. This significantly limits the amount of architecture-specific optimization that can be done in SLEEF. Additionally, SLEEF is designed to rely as little as possible on table lookups, while AOR relies heavily on them.

On the other hand, AOR vector implementations are specifically optimized for AArch64, even at the micro-architecture level, and AOR algorithms rely heavily on table lookups. Finally, AOR has its own set of polynomial coefficients; re-using or redistributing them within SLEEF raises questions about licensing.

These fundamental differences make it very hard (if not impossible) to integrate AOR algorithms as is into SLEEF. That being said, it does not mean the Arm interfaces (i.e., AdvSIMD and SVE helpers) cannot be optimized, for instance by picking a better set of instructions to implement a given helper routine or by improving the way polynomial coefficients are loaded and used. Some of these changes are fairly generic and could very well have a positive impact on all architectures.

Hopefully, once we are settled into our maintainer role and the most pressing issues have been addressed, we can focus on this type of optimization work. We will document some of the potential optimizations. As always, feel free to contribute; we will be more than happy to help in code reviews and threads!

helloguo commented 3 months ago

@blapie Thanks for the detailed explanation!