Open EvanBalster opened 4 years ago
(not a maintainer of this repo)
Nice! By the way, this confused me for a second:
RMS: 0.0220738 mean: 8.36081e-06 min: -0.0287746 @ 2.88496 max: 0.0573099 @ 3.99998
Wouldn't it make sense to (also) calculate a mean error value with only absolute values?
Hello —
I'm an audio/DSP programmer... Depending on the application, different error metrics can matter more or carry different significance, and the optimal constants change depending on which type of error you're trying to minimize. Fastapprox minimizes "infinity-norm" error, i.e. max(abs(approx-actual)).
In signal processing, the mean error-value could represent a zero-frequency "DC offset" introduced into the resulting signal, or a small bias in a measurement. The RMS could predict harmonic interference. Even when optimizing against absolute max error, it's useful to know the minimum as well, in case additional functions are applied that magnify one more than the other.
Lastly, when searching for optimal constants, evaluating the min and max together permits a nifty heuristic: when the function's infinity-norm error is close to optimal, the minimum and maximum errors are very similar in magnitude.
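(For illustration, a rough sketch of how those statistics can be gathered when tuning; the function names are placeholders rather than anything from fastapprox:)

#include <math.h>
#include <stdio.h>

/* Sketch: gather RMS, mean, min and max error for an approximation over
   [lo, hi].  "approx" and "exact" are whatever pair of functions is being
   compared; samples should be >= 2. */
void report_error(float (*approx)(float), double (*exact)(double),
                  double lo, double hi, int samples)
{
    double sum = 0, sum2 = 0, minErr = INFINITY, maxErr = -INFINITY;
    double minAt = lo, maxAt = lo;
    for (int i = 0; i < samples; ++i)
    {
        double x   = lo + (hi - lo) * i / (samples - 1);
        double err = (double)approx((float)x) - exact(x);
        sum  += err;
        sum2 += err * err;
        if (err < minErr) { minErr = err; minAt = x; }
        if (err > maxErr) { maxErr = err; maxAt = x; }
    }
    /* When the constants are near-optimal for infinity-norm error,
       fabs(minErr) and fabs(maxErr) come out nearly equal. */
    printf("RMS: %g mean: %g min: %g @ %g max: %g @ %g\n",
           sqrt(sum2 / samples), sum / samples, minErr, minAt, maxErr, maxAt);
}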
[Tagging @pmineiro in case he has any thoughts about this PR.]
It's kinda easier to reason about the exponent like this, don't you think?
// 0x4b000000 is the bit pattern of 2^23; OR in the biased exponent, then
// subtract 8388735 (2^23 + 127) to leave floor(log2(x)) as a float:
log2exponent.i = 0x4b000000u | (input.i >> 23u);
log2exponent.f -= 8388735.f;
Also, the log2 curve can be approximated really well with a polynomial, so there's no need for division.
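To make that concrete, here's a rough sketch combining the exponent trick above with a polynomial on the mantissa. It's illustrative only: the quadratic just interpolates log2(m) at m = 1, 1.5 and 2 (worst-case error around 0.01), so a minimax fit from lolremez would beat it at the same degree.

#include <stdint.h>

// Sketch: exponent via the magic-constant trick, mantissa forced into [1,2),
// then a small polynomial for log2 of the mantissa.  Assumes x > 0 and normal.
float sketch_log2(float x)
{
    union { float f; uint32_t i; } in = { x }, exponent, mantissa;

    exponent.i = 0x4b000000u | (in.i >> 23u);   // 2^23 + biased exponent
    exponent.f -= 8388735.f;                    // 8388735 = 2^23 + 127

    mantissa.i = 0x3f800000u | (in.i & 0x007fffffu);   // mantissa in [1,2)
    float t = mantissa.f - 1.0f;

    return exponent.f + (1.33985f - 0.33985f * t) * t; // ~ log2(1 + t)
}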
Here's what I did for exp2, if anyone is interested: polynomial curve, exact fractional part.
#include <stdint.h>

// ./lolremez --float -d 4 -r "1:2" "exp2(x-1)"
// rms error 0.000003
// max error 0.000004
float exp2xminus1(float x){
    float u = 1.3697664e-2f;
    u = u * x + -3.1002997e-3f;
    u = u * x + 1.6875336e-1f;
    u = u * x + 3.0996965e-1f;
    return u * x + 5.1068333e-1f;
}

float fastexp2 (float p){
    // exp2(floor(p)) * exp2(fract(p)) == exp2(p)
    // 383 is 2^8 + 127
    union {float f; uint32_t i;} u = {p + 383.f}, fract;
    // shove the mantissa bits into the exponent
    u.i <<= 8u;
    // the remaining mantissa bits are the fract of p
    fract.i = 0x3f800000u | (u.i & 0x007fff00u);
    // optional: fix the precision lost in the p + 383 addition
    fract.f += p - ((p + 383.f) - 383.f);
    //fract.f -= 1.;
    // only take the exponent
    u.i = u.i & 0x7F800000;
    return u.f * exp2xminus1(fract.f);
}
A log1p implementation would be nice to have too.
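(One possible shape for it, sketched with the classic compensation trick rather than anything from this library; logf below just stands in for whatever log approximation would actually be used:)

#include <math.h>

// Sketch of a log1p built on an existing log: when x is tiny, 1 + x rounds
// to 1 and log1p(x) ~= x; otherwise correct for that rounding with the
// classic x/(u-1) factor.
float sketch_log1p(float x)
{
    float u = 1.0f + x;
    if (u == 1.0f)
        return x;                       // x is below float precision of 1.0
    return logf(u) * (x / (u - 1.0f));  // compensates the error in (1 + x)
}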
I've been experimenting with fast approximation functions as an evening hobby lately. This started with a project to generalize the fast-inverse-square-root hack, and I may create another pull request to add a collection of fast roots and inverse roots to fastapprox.
With this PR, I've modified the scalar functions to use fast float-to-int conversion, avoiding int/float casts. This results in an appreciable improvement in speed and appears to improve accuracy as well (I did a little tuning). The modified functions are also a little more fault tolerant, with log functions acting as log(abs(x)) and exp functions flushing to zero for very small arguments.

These changes should be simple to generalize to SSE, but I've decided to submit this PR without those changes. I believe small further improvements in the accuracy of fastlog2 and fastpow2 are possible by experimenting with the coefficients, but in any case all modified functions exhibit a reduction in worst-case error in my tests.
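For anyone unfamiliar with the trick being referred to, here's a rough sketch of the general magic-constant float-to-int idea (illustrative only, not the PR's exact code): adding a large constant pushes the value's integer part into the low mantissa bits, where it can be read straight out of the bit pattern.

#include <stdint.h>

// Adding 1.5 * 2^23 forces the sum into [2^23, 2^24), so it gets rounded to
// an integer and that integer lands in the low mantissa bits.  Valid roughly
// for |x| < 2^22; rounds to nearest rather than truncating.
int32_t sketch_float_to_int(float x)
{
    union { float f; uint32_t i; } u = { x + 12582912.f };   // 1.5 * 2^23
    return (int32_t)(u.i & 0x007fffffu) - 0x00400000;        // remove the bias
}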
Error and performance statistics on the modified functions, here designated "abs". I tested on an x86_64 MacBook Pro. My testbed tries every float within an exponential range rather than sampling randomly.
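(In case the "every float" part sounds expensive: consecutive positive floats have consecutive bit patterns, so covering a range exhaustively is just a loop over the integer representation. A rough sketch, with test_one as a hypothetical per-value check:)

#include <stdint.h>
#include <string.h>

// Visit every float in [lo, hi] (both positive) by stepping the bit pattern.
void for_each_float(float lo, float hi, void (*test_one)(float))
{
    uint32_t i, end;
    memcpy(&i,   &lo, sizeof i);
    memcpy(&end, &hi, sizeof end);
    for (; i <= end; ++i)
    {
        float x;
        memcpy(&x, &i, sizeof x);
        test_one(x);
    }
}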