shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0
665 stars 133 forks

Are SLEEF scalar math functions faster than those in math.h? #225

Open megazone87 opened 6 years ago

megazone87 commented 6 years ago

I replaced every occurrence of #include <math.h> with #include "sleef_math.h" in my project, where sleef_math.h looks like this:

#define sin Sleef_sinf_u35
#define cos Sleef_cosf_u35
...

However, I don't see a speedup in my project; it is actually a little slower. Is my usage wrong, or are the SLEEF scalar functions not as optimized as those in math.h?

shibatch commented 6 years ago

Hello,

Those scalar functions are not optimized. They are provided for easy understanding of how the vectorized version of functions work.

You can try Sleef_sinf1_u35purecfma instead of Sleef_sinf_u35. Those purecfma functions should be faster. These are only included in the git version.

megazone87 commented 6 years ago

I tried Sleef_*f_u35 with the latest master branch:

First, I found that it may need the compiler flag -march=native so that __AVX2__ is defined; otherwise a compile error happens:

error: 'Sleef_expf1_u10purecfma' was not declared in this scope
       x[j] = exp(x[j])

Second, the result is still unsatisfying: it is only slightly faster, and still not faster than <math.h>.

shibatch commented 6 years ago

That means that recent math functions in glibc are pretty fast. SLEEF is a vectorized math library, and it is not meant for scalar computation.

megazone87 commented 6 years ago

You are right, I'm trying to write vectorized code now, but I am a noob at this. Is there any tutorial or example for using SLEEF? (I did read things like src/libm-tester/tester2simdsp.c, but it is still not quite obvious to me.)

Also, there is another similar(?) library: https://github.com/QuantStack/xsimd. Could you explain the differences between SLEEF and other libraries? Having this in the README would be helpful for people like me.

Thank you!

shibatch commented 6 years ago

Could you tell me a little bit about the purpose of your code?

megazone87 commented 6 years ago

To replace the math.h math functions with faster implementations (including vectorized SIMD).

megazone87 commented 6 years ago

The project is for speech synthesis; there are a lot of sin, cos, exp, log, pow calls ..

fpetrogalli commented 5 years ago

Hi - this issue has been quiet for a while, without taking any direction. Shall we close it?

shibatch commented 5 years ago

Actually I have a plan for this issue, which is to remove sleefdp.c and sleefsp.c, and make the scalar functions aliases to the functions with purecscalar helper.

fpetrogalli commented 5 years ago

> make the scalar functions aliases to the functions with purecscalar helper.

What problem would this change solve?

shibatch commented 5 years ago

I am going to introduce a dispatcher to those functions, so they can utilize FMA if available. Then the scalar functions will be as fast as the vector functions.

blapie commented 11 months ago

I like the idea of removing the scalar implementations (in src/libm/sleef{s,d}p.c), for the sake of keeping maintenance costs low. We actually rarely touch these files, but issues have been reported in the past where the scalar routines do not always match the vector ones, which led to the design of the Sleef_<name>1_u<accuracy>purec{,fma} implementation (vector algo + scalar helper) to ensure reproducibility.

However, getting rid of src/libm/sleef{s,d}p.c is not gonna make it easy to understand the algorithms and potentially improve them. Besides, people shouldn't use these because, as stated here, they are now slower than standard implementations.

I'm wondering if maybe we could simply use the system of helpers to generate human-readable documentation or pseudo-code?
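To make the "vector algo + scalar helper" idea concrete: the algorithm is written once against an abstract vector type and a small set of helper operations, and each target supplies its own definitions of those. The sketch below is only an illustration of that layering. The names `vfloat`, `vcast`, `vmul`, `vmla` and the toy polynomial are hypothetical and do not reproduce SLEEF's actual helper headers or algorithms.

```c
/* --- "pure C scalar" helper: one lane, plain float arithmetic --- */
typedef float vfloat;

static inline vfloat vcast(float c)                     { return c; }
static inline vfloat vmul(vfloat a, vfloat b)           { return a * b; }
static inline vfloat vmla(vfloat a, vfloat b, vfloat c) { return a * b + c; }

/* --- algorithm, written only in terms of the helper operations --- */
/* Toy approximation sin(x) ~ x - x^3/6 + x^5/120, valid for small |x| only. */
vfloat toy_sin(vfloat x) {
    vfloat x2 = vmul(x, x);
    vfloat p  = vcast(1.0f / 120.0f);
    p = vmla(p, x2, vcast(-1.0f / 6.0f));   /* x^2/120 - 1/6 */
    p = vmla(p, x2, vcast(1.0f));           /* x^4/120 - x^2/6 + 1 */
    return vmul(p, x);                      /* multiply through by x */
}
```

Replacing the helper section with one that defines `vfloat` as a SIMD register type and implements the same operations with intrinsics would give the vector version of `toy_sin` from identical algorithm source. Because the algorithm section reads as straight-line arithmetic, it is also the kind of text from which pseudo-code could plausibly be generated.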