exafmm / exafmm-t

A kernel-independent fast multipole method library with Python interface.
BSD 3-Clause "New" or "Revised" License

Add SIMD intrinsics (sin, cos, exp, ...) for non-Intel compilers #10

Open tingyu66 opened 5 years ago

tingyu66 commented 5 years ago

Vectorized trigonometric functions, e.g. _mm256_sin_ps, are currently available only when the Intel compiler is used and the Intel SVML library is found. We could integrate the intrinsics from the headers below into vec.h to improve the performance of the Helmholtz kernel when building with GCC.

http://software-lisc.fbk.eu/avx_mathfun/avx_mathfun.h
http://gruntthepeon.free.fr/ssemath/
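To make the proposal concrete, here is a rough sketch of the kind of compile-time dispatch this would need. It is not exafmm-t's actual vec.h code: the names sin8 and sin_array and the macro guards are made up for illustration. The non-Intel branch just falls back to std::sin lane by lane so the block stays self-contained; that branch is where a truly vectorized routine from one of the headers above would go.

```cpp
#include <cmath>
#include <cstddef>

#if defined(__AVX__)
#include <immintrin.h>

// Hypothetical helper (not part of exafmm-t): sine of 8 packed floats.
inline __m256 sin8(__m256 x) {
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
  // Assumption: an Intel compiler implies SVML is available.
  return _mm256_sin_ps(x);
#else
  // Placeholder fallback: spill to memory and call the scalar std::sin.
  // A vectorized routine from a third-party header would replace this.
  alignas(32) float lanes[8];
  _mm256_store_ps(lanes, x);
  for (int i = 0; i < 8; ++i) lanes[i] = std::sin(lanes[i]);
  return _mm256_load_ps(lanes);
#endif
}
#endif  // __AVX__

// Hypothetical driver: out[i] = sin(in[i]) for i in [0, n).
inline void sin_array(const float* in, float* out, std::size_t n) {
  std::size_t i = 0;
#if defined(__AVX__)
  // Process full blocks of 8 floats with the AVX path.
  for (; i + 8 <= n; i += 8) {
    _mm256_storeu_ps(out + i, sin8(_mm256_loadu_ps(in + i)));
  }
#endif
  // Scalar loop handles the remainder (or everything without AVX).
  for (; i < n; ++i) out[i] = std::sin(in[i]);
}
```

With GCC this takes the AVX path only when built with -mavx (or a -march that implies it); otherwise the whole array goes through the scalar loop, so the result is the same either way and only the speed differs.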

rioyokota commented 5 years ago

This was done in the previous exafmm using Agner Fog's vectorclass. https://www.agner.org/optimize/vectorclass/read.php?i=2 It's like a more complete version of vec.h, with the math functions implemented for SSE, AVX, AVX2, and AVX-512. I stopped using it because it caused problems on some of the machines we were using. Those problems may have been fixed by now, in which case we could use it again.

rioyokota commented 5 years ago

Take a look at how it was done in the older versions of exafmm. I don't remember exactly which version it was, but I think it was an older revision of exafmm-beta.