Open mcourteaux opened 1 month ago
This article seems like an amazing reference:
https://mazzo.li/posts/vectorized-atan2.html
You may assign me; I think I'll do it. I think I'm seeing bad performance due to 8 calls to glibc's atan2f, instead of something that vectorizes cleanly.
Or this one, indeed: https://github.com/boulos/syrah/blob/4ac08d54daa09fc4e7ac8424898d21deda18e103/src/include/syrah/FixedVectorMath.h#L288-L348
Tagging zvookin because he's looked into doing this for some other similar cases (e.g. tanh).
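For context, here is a minimal sketch of the kind of branch-free atan2 the linked article describes, written as plain C++ rather than as anything Halide-specific. The function names (`atan_poly`, `atan2_approx`) are hypothetical, and the polynomial coefficients are quoted from memory of the article, so treat them as an assumption and check them against the source before relying on them. The idea is to keep the atan argument in [-1, 1] by dividing the smaller magnitude by the larger one, evaluate an odd polynomial, and then fix up the quadrant with selects instead of calling libm once per lane.

```cpp
#include <cmath>
#include <cstddef>

// Odd degree-11 polynomial approximating atan(x) for x in [-1, 1].
// Coefficients are assumed (recalled from the linked article); verify them.
static inline float atan_poly(float x) {
    const float a1 = 0.99997726f;
    const float a3 = -0.33262347f;
    const float a5 = 0.19354346f;
    const float a7 = -0.11643287f;
    const float a9 = 0.05265332f;
    const float a11 = -0.01172120f;
    const float x2 = x * x;
    // Horner evaluation in x^2, then one multiply by x to keep the function odd.
    return x * (a1 + x2 * (a3 + x2 * (a5 + x2 * (a7 + x2 * (a9 + x2 * a11)))));
}

// Approximate atan2(ys[i], xs[i]) for n elements.
// Every conditional is a simple select, so the loop body has no calls or
// data-dependent branches that would block auto-vectorization.
// Note: x == 0 && y == 0 produces NaN here (0/0), unlike libm's atan2f.
void atan2_approx(const float* ys, const float* xs, float* out, std::size_t n) {
    const float pi = 3.14159265f;
    const float pi_2 = 1.57079633f;
    for (std::size_t i = 0; i < n; ++i) {
        const float y = ys[i];
        const float x = xs[i];
        // Divide the smaller magnitude by the larger so the ratio is in [-1, 1].
        const bool swap = std::fabs(x) < std::fabs(y);
        const float ratio = (swap ? x : y) / (swap ? y : x);
        float r = atan_poly(ratio);
        // If swapped, use atan(y/x) = +-pi/2 - atan(x/y).
        r = swap ? ((ratio >= 0.0f ? pi_2 : -pi_2) - r) : r;
        // Move results for x < 0 into the second or third quadrant.
        if (x < 0.0f) {
            r += (y >= 0.0f ? pi : -pi);
        }
        out[i] = r;
    }
}
```

Whether a given compiler actually vectorizes this loop depends on flags and target, so it is worth checking the generated assembly. The accuracy is far below what glibc's atan2f provides, which may or may not be acceptable depending on the use case.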