This PR adds a wide range of NEON, SVE2, and SME2 instructions with regression tests. These support a subset of some internal SME-based GEMM and GEMV codes.
There is also some prototype BF16 instruction support, which is disabled by default (using a new build option and an if statement in each relevant switch statement case). It is disabled because it uses `__bf16`, which is not compiler agnostic, relies on some hacky usage of `memcpy` to re-interpret `uint16_t`, and lacks regression tests for the BF16 instructions in question.
These BF16 instructions can be enabled through a new CMake option, `-DSIMENG_ENABLE_BF16=ON`. I have deliberately not included this in the documentation, given the possible instability of the BF16 implementation and to keep it for (mainly) internal usage only.
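For reviewers unfamiliar with the pattern, here is a minimal sketch of a BF16 case guarded by a flag and using `memcpy` for re-interpretation. It is not the code in this PR: `kBf16Enabled`, `Opcode`, `executeExample`, and both helper functions are hypothetical names for illustration, and the sketch widens the BF16 bit pattern to `float` rather than using `__bf16`, so it stays compiler agnostic.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the build option; assumed here to reach the
// code as a compile-time constant. The real wiring may differ.
constexpr bool kBf16Enabled = true;

// A BF16 value is the upper 16 bits of an IEEE-754 binary32 value, so
// widening is a shift followed by a bit-exact copy into a float.
inline float bf16BitsToFloat(uint16_t bits) {
  const uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));  // re-interpret without UB
  return out;
}

// Narrowing by truncation (drops the low 16 mantissa bits). Real hardware
// rounding behaviour is more involved; this is a simplification.
inline uint16_t floatToBf16Bits(float value) {
  uint32_t wide;
  std::memcpy(&wide, &value, sizeof(wide));
  return static_cast<uint16_t>(wide >> 16);
}

// Hypothetical opcode set, only to show the guarded switch case.
enum class Opcode { BF16_MUL_EXAMPLE, OTHER };

// Returns false when the instruction is unsupported or BF16 is disabled.
inline bool executeExample(Opcode op, uint16_t a, uint16_t b, float& result) {
  switch (op) {
    case Opcode::BF16_MUL_EXAMPLE: {
      if (!kBf16Enabled) return false;  // guard inside the case
      result = bf16BitsToFloat(a) * bf16BitsToFloat(b);
      return true;
    }
    default:
      return false;
  }
}
```

With the flag set, calling `executeExample(Opcode::BF16_MUL_EXAMPLE, 0x4000, 0x3FC0, r)` multiplies the BF16 encodings of 2.0 and 1.5. With the flag cleared, the case returns `false` before touching any BF16 logic.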
This branch is based on `sme2-support` (PR #429) and so should be merged after that branch has been merged into `dev`.
Some SME2 instructions which use multi-vector operands can be non-trivial to read or understand. Please ask for clarification and suggest any additional comments that may help future understanding.