akielaries / openGPMP

Hardware Accelerated General Purpose Mathematics Package
https://akielaries.github.io/openGPMP/
MIT License
8 stars, 3 forks

intrinsics for CPUs not supporting AVX #92

Closed akielaries closed 7 months ago

akielaries commented 10 months ago

Look into SSE and AVX intrinsics to support x86 platforms.

akielaries commented 10 months ago

Since most processors support several instruction sets (MMX, SSE, SSE2, SSE3, AVX, AVX2, ...), determine how to use the highest one available. The newer sets seem to be the fastest due to increased register widths.
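One possible way to pick the highest available set at runtime is sketched below. It assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available; on any other compiler or architecture it falls back to reporting no SIMD. `IsaInfo` and `highest_isa` are hypothetical names for illustration, not part of openGPMP.

```cpp
// Hypothetical helper type, not part of openGPMP.
struct IsaInfo {
    const char *name; // human-readable ISA name
    int vector_bits;  // SIMD register width in bits, 0 when no SIMD is reported
};

// Returns the widest SIMD extension the running CPU reports, checking from
// newest to oldest. Assumes GCC/Clang builtins; anything else gets the
// scalar fallback.
IsaInfo highest_isa() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init(); // populate the CPU feature data before querying
    if (__builtin_cpu_supports("avx2")) return {"AVX2", 256};
    if (__builtin_cpu_supports("avx"))  return {"AVX", 256};
    if (__builtin_cpu_supports("sse3")) return {"SSE3", 128};
    if (__builtin_cpu_supports("sse2")) return {"SSE2", 128};
    if (__builtin_cpu_supports("sse"))  return {"SSE", 128};
    if (__builtin_cpu_supports("mmx"))  return {"MMX", 64};
#endif
    return {"scalar", 0};
}
```

A caller could query this once at startup and select a kernel based on `vector_bits`. Whether runtime dispatch or compile-time selection fits the library better is still an open question here.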

akielaries commented 10 months ago

Include the individual per-ISA headers (e.g. `xmmintrin.h` for SSE, `emmintrin.h` for SSE2) instead of `immintrin.h` as a whole?
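A rough sketch of what guarded per-ISA includes could look like, assuming the GCC/Clang predefined macros `__AVX__` and `__SSE__` (set by flags such as `-mavx` or `-msse`). Note that GCC and Clang reject a direct include of `avxintrin.h`, so the AVX path still goes through `immintrin.h`. `add_f32` is a hypothetical function for illustration, not an openGPMP routine.

```cpp
#include <cstddef>

// Include only the header for the widest ISA the compiler is targeting.
#if defined(__AVX__)
#include <immintrin.h> // AVX intrinsics are only exposed through immintrin.h
#elif defined(__SSE__)
#include <xmmintrin.h> // SSE single-precision intrinsics
#endif

// Hypothetical example kernel: out[i] = a[i] + b[i] for i in [0, n).
void add_f32(const float *a, const float *b, float *out, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__)
    // 256-bit registers: 8 floats per iteration, unaligned loads/stores
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#elif defined(__SSE__)
    // 128-bit registers: 4 floats per iteration, unaligned loads/stores
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar loop handles the tail, or everything when no SIMD is enabled
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With this layout a build without AVX never includes `immintrin.h`, and a build with no SIMD flags at all still compiles to the plain scalar loop.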

akielaries commented 10 months ago

Benchmark against OpenBLAS. openGPMP outperforms it on Skylake but not on Xeon, except for the gpmp Fortran routines... dig into this.