Open valadaptive opened 5 months ago
While x86's `_mm_fmsub_*` FMA intrinsics subtract the last operand from the product (`a * b - c`), ARM's `vfmsq_*` intrinsics subtract the product from the accumulator, which is their first argument (`a - b * c`). This means we can use them to implement a one-instruction `neg_mul_add`.
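To make the operand order concrete, here is a rough scalar sketch of the per-lane semantics. The function names are hypothetical stand-ins for the intrinsics, not this crate's API, and it assumes `neg_mul_add(a, b, c)` is meant to compute `c - a * b`.

```rust
/// Scalar model of one lane of x86 `_mm_fmsub_ps(a, b, c)`: a * b - c, fused.
fn x86_fmsub(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, -c)
}

/// Scalar model of one lane of NEON `vfmsq_f32(acc, b, c)`: acc - b * c, fused.
fn neon_vfms(acc: f32, b: f32, c: f32) -> f32 {
    (-b).mul_add(c, acc)
}

/// Hypothetical `neg_mul_add(a, b, c)`, assumed to mean c - a * b.
/// On NEON this maps onto a single fused multiply-subtract with the
/// addend passed as the accumulator.
fn neg_mul_add(a: f32, b: f32, c: f32) -> f32 {
    neon_vfms(c, a, b)
}
```

With these definitions, `neg_mul_add(2.0, 3.0, 10.0)` gives `4.0`, while `x86_fmsub(2.0, 3.0, 10.0)` gives `-4.0`. The two operations differ only in sign, which is why the x86 `fmsub` form cannot be used directly for `neg_mul_add`.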