SeisSol / PSpaMM

BSD 3-Clause "New" or "Revised" License

Add AVX2+FMA backend #6

Closed davschneller closed 1 year ago

davschneller commented 1 year ago

Add an AVX2+FMA backend, i.e. one that uses only 256-bit registers. One caveat: AVX2 cannot broadcast a memory operand inside the arithmetic instruction itself, as AVX512 (and maybe soon AVX10) can. We therefore need extra registers to hold the broadcast B values, as is done in the ARM backend.
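To illustrate the caveat, here is a minimal sketch of the instructions a generator might emit for one update `C += A * broadcast(B[k, n])` on each ISA. The function `fma_update`, its parameters, and the register numbering are hypothetical and not taken from the PSpaMM code; only the instruction mnemonics (`vfmadd231pd`, `vbroadcastsd`) and the `{1to8}` embedded-broadcast syntax are standard x86 (AT&T operand order).

```python
def fma_update(isa, c_reg, a_reg, b_addr, b_reg=None):
    """Return AT&T-syntax instructions for C += A * broadcast(scalar at b_addr).

    Hypothetical helper for illustration only; not part of PSpaMM.
    """
    if isa == "avx512":
        # Embedded broadcast: the FMA itself replicates the 64-bit scalar
        # at b_addr across all 8 double lanes, so no B register is needed.
        return [f"vfmadd231pd {b_addr}{{1to8}}, %zmm{a_reg}, %zmm{c_reg}"]
    if isa == "avx2":
        # AVX2+FMA has no embedded broadcast: the scalar must first be
        # broadcast into a dedicated ymm register, which then feeds the FMA.
        if b_reg is None:
            raise ValueError("AVX2 needs a register for the broadcast B value")
        return [
            f"vbroadcastsd {b_addr}, %ymm{b_reg}",
            f"vfmadd231pd %ymm{b_reg}, %ymm{a_reg}, %ymm{c_reg}",
        ]
    raise ValueError(f"unknown ISA: {isa}")


if __name__ == "__main__":
    for isa, b_reg in (("avx512", None), ("avx2", 2)):
        print(isa)
        for line in fma_update(isa, c_reg=0, a_reg=1, b_addr="0(%rsi)", b_reg=b_reg):
            print("   ", line)
```

The AVX2 path costs one additional instruction and one additional vector register per distinct B value that is live at the same time, which is why the register blocking has to differ from the AVX512 one.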

This PR also adds some tests, all of which pass at least on my local machine.

(The code is mostly taken from the KNL and ARM implementations; optimizations specific to AVX2+FMA are not implemented yet.)

(Also, the alpha and beta broadcast registers are probably used suboptimally: it may be better to keep these values in other, smaller registers and broadcast them only when needed, so that the vector registers remain available for A and B values the rest of the time.)
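A rough way to see what is at stake is a register-budget estimate. The sketch below assumes a simplified micro-kernel model that is not taken from the PSpaMM source: `bm_vecs` vector registers hold one column of the A block, `bn` registers hold broadcast B values, `bm_vecs * bn` registers hold the C accumulators, and `reserved` registers are permanently occupied by alpha and beta. The only hard fact used is that AVX2 provides 16 `ymm` registers; the function name `max_bn` is hypothetical.

```python
YMM_REGISTERS = 16  # AVX2 exposes ymm0..ymm15 in 64-bit mode


def max_bn(bm_vecs, reserved):
    """Largest block width bn that fits the register file (assumed model).

    Registers needed: bm_vecs * bn (C) + bm_vecs (A) + bn (B) + reserved.
    """
    bn = 0
    while bm_vecs * (bn + 1) + bm_vecs + (bn + 1) + reserved <= YMM_REGISTERS:
        bn += 1
    return bn


if __name__ == "__main__":
    # Alpha and beta kept broadcast in two ymm registers for the whole kernel.
    print("resident alpha/beta:", max_bn(bm_vecs=3, reserved=2))  # prints 2
    # Alpha and beta broadcast only right before they are used.
    print("lazy alpha/beta:    ", max_bn(bm_vecs=3, reserved=0))  # prints 3
```

Under this model, freeing the two reserved registers allows a wider block for a three-vector-tall tile. Whether that translates into a real speedup depends on the actual blocking logic and would have to be measured.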

krenzland commented 1 year ago

Thanks @montrie, @davschneller!