Adds `nfloat_addmul`, `nfloat_submul`, `_nfloat_vec_addmul_scalar` and `_nfloat_vec_submul_scalar`, with partially inlined code for speed.
The accuracy is identical to doing a `mul` followed by an `add`. It would be nice to have a variant with increased accuracy, or with better speed when the product is much smaller than the sum, but I can't be bothered with all the extra addition code to implement something like that right now.
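To illustrate what "identical to a mul followed by an add" means for accuracy, here is a minimal sketch in plain C using `float` as a stand-in for an `nfloat` value. It does not use the FLINT API: the names `addmul_unfused` and `addmul_wide` are hypothetical and exist only for this illustration. The first function rounds the product and then rounds the sum, which is the behaviour described above. The second keeps the product exactly in a wider type before adding, which is roughly what an increased-accuracy variant would have to do.

```c
/* Hypothetical illustration only: float stands in for an nfloat value.
   None of these names are part of FLINT. */

/* Unfused scheme: the product is rounded to working precision,
   then the sum is rounded again (two roundings). */
static float addmul_unfused(float res, float x, float y)
{
    /* volatile forces the product to be stored as a float, so the
       compiler cannot contract the expression into a fused multiply-add */
    volatile float prod = x * y;
    return res + prod;
}

/* Wider-intermediate scheme: the product of two floats (24-bit
   significands) is exact in double (53-bit significand), so no
   information is lost before the addition. */
static float addmul_wide(float res, float x, float y)
{
    double prod = (double) x * (double) y;  /* exact product */
    return (float) (prod + (double) res);
}
```

With `x = y = 1 + 2^-12` and `res = -(1 + 2^-11)`, the exact value of `res + x*y` is `2^-24`. The unfused version loses that term when the product is rounded and returns `0`, while the wider version returns `2^-24`. The functions added in this PR behave like the unfused version in this respect.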