jhjourdan / SIMD-math-prims

Vectorizable implementations of some mathematical functions
MIT License

Request: combined sin/cos function #4

Closed dacmot closed 3 years ago

dacmot commented 4 years ago

Hello. Would you be interested in adding an approximate function that computes both sine and cosine and is faster than separate calls to each? Many SIMD libraries, such as vectorclass and avxmathfun, have such a function, although they are still slower than computing sin and cos separately with SIMD-math-prims. I was wondering whether extra performance could be squeezed out. Thanks!

jhjourdan commented 4 years ago

Thanks for the feature request!

However, I don't really have time to work on this for now. Have you tried simply computing sine and deducing cosine via the formula cos(x) = sqrt(1 - sin²(x)), which holds when x is in an appropriate interval (one where cos(x) ≥ 0, e.g. [-pi/2, pi/2])? Apart from that, I have no idea how to do this with a significant performance gain in SIMD-math-prims.

I believe that vectorclass and avxmathfun can gain some performance because a combined function lets them save a division (the one needed to move the input to a normalized interval close to 0). SIMD-math-prims does not do this, since it assumes the input is already normalized.

dacmot commented 4 years ago

I admit I have not tried obtaining it using the identity. I will try it and let you know whether it is faster than calling the other sin/cos approximation. Square root is fairly fast, and its argument can be computed with a single FMA instruction. The sign is a bit more of an issue. Assuming the input is in [-pi, pi], the sign of sine is simply sign(theta), which should be fast, but for cosine I don't see an easy way other than two if/blend operations.

jhjourdan commented 3 years ago

There has been no activity on this issue for a long time, so I assume this is no longer a problem and I am closing it. Feel free to complain if this is still a problem for you.