arduano / simdeez

easy simd
MIT License

Use vfmsq intrinsic for NEON neg_mul_add #69

Open valadaptive opened 5 months ago

valadaptive commented 5 months ago

While AVX2's _mm_fmsub_* intrinsics subtract the last operand from the product (a * b - c), ARM's vfmsq_* intrinsics subtract the product from the accumulator, which they take as their first argument (a - b * c). This means we can use them to implement a one-instruction neg_mul_add.
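To make the operand order concrete, here is a minimal sketch of the idea, not the actual change in this PR. It assumes neg_mul_add(a, b, c) means c - a * b per lane. The free function and its array-based signature are hypothetical and are not simdeez's API. On aarch64 it calls the real vfmsq_f32 intrinsic; elsewhere it falls back to a scalar fused multiply-add so the snippet still compiles.

```rust
// Hypothetical helper (not simdeez's API): computes c - a * b for each of 4 lanes.
#[cfg(target_arch = "aarch64")]
pub fn neg_mul_add(a: [f32; 4], b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    use core::arch::aarch64::{vfmsq_f32, vld1q_f32, vst1q_f32};

    let mut out = [0.0f32; 4];
    // SAFETY: NEON is part of the baseline for standard aarch64 Rust targets,
    // and every pointer below refers to a valid 4-element f32 array.
    unsafe {
        let va = vld1q_f32(a.as_ptr());
        let vb = vld1q_f32(b.as_ptr());
        let vc = vld1q_f32(c.as_ptr());
        // vfmsq_f32(acc, x, y) computes acc - x * y, so the addend goes first.
        let r = vfmsq_f32(vc, va, vb);
        vst1q_f32(out.as_mut_ptr(), r);
    }
    out
}

// Scalar fallback for non-aarch64 targets, with the same semantics.
#[cfg(not(target_arch = "aarch64"))]
pub fn neg_mul_add(a: [f32; 4], b: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        // (-a) * b + c, fused into a single rounding step.
        out[i] = (-a[i]).mul_add(b[i], c[i]);
    }
    out
}
```

Note that the addend c is passed as the first argument to vfmsq_f32, whereas _mm_fmsub_ps takes its subtrahend last. That difference in argument position is easy to get wrong when porting between the two.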