C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
BSD 3-Clause "New" or "Revised" License
2.15k stars, 253 forks
Use fma and fms instructions when available to speed up complex multiply #1003
Closed
serge-sans-paille closed 6 months ago
This leverages the specific layout of an xsimd batch of complex numbers.
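As a rough illustration of the idea, here is a minimal scalar sketch. It assumes the complex batch keeps its real and imaginary parts in two separate lane arrays (a split layout), and it is not the code from this PR. The `complex_batch` struct and `mul_fma` function are hypothetical names introduced for the example, and `std::fma` stands in for the vector fma/fms instructions. With that layout, the product (a + bi)(c + di) = (ac - bd) + (ad + bc)i needs one plain multiply plus one fused operation per component:

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a batch of complex numbers stored in a split
// layout: all real parts in one array, all imaginary parts in another.
// This is an assumption made for illustration, not xsimd's actual type.
template <std::size_t N>
struct complex_batch {
    std::array<float, N> real;
    std::array<float, N> imag;
};

// Lane-wise complex multiply using fused operations.
// std::fma(a, b, c) computes a * b + c with a single rounding; passing a
// negated third argument plays the role of fms (a * b - c).
template <std::size_t N>
complex_batch<N> mul_fma(const complex_batch<N>& x, const complex_batch<N>& y) {
    complex_batch<N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        // real = x.re * y.re - x.im * y.im   (fms)
        out.real[i] = std::fma(x.real[i], y.real[i], -(x.imag[i] * y.imag[i]));
        // imag = x.re * y.im + x.im * y.re   (fma)
        out.imag[i] = std::fma(x.real[i], y.imag[i], x.imag[i] * y.real[i]);
    }
    return out;
}
```

In a vectorized version each loop body would become a single operation over whole registers, so the split layout lets the real and imaginary results be computed without shuffling lanes first.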