switch out fpuPow variants for fpuExp2

logicomacorp / WaveSabre

Official WaveSabre repository

MIT License

246 stars 34 forks source link

switch out fpuPow variants for fpuExp2 #37

Closed kusma closed 3 years ago

kusma commented 4 years ago

It turns out, we're don't really need the full flexibility of Pow(x, y). Instead, what we need is these:

Pow(x, 2)
Pow(x, 4)
Pow(2, x)
Pow(10, x)

The first two are trivially expressible as normal C++ floating-point multiplies. This seems to end up with exactly the same code-size, both before and after compression.

Number four can also trivially be expressed in terms of number three. This means we're left with an exp2-variant. And Exp2 is a bit easier to handle than Pow, especially due to less needed special-cases.

The end result is a saving of 36 bytes.

A follow-up question to this series might become: do we really need both float and double variants of these? But I guess @yupferris knows more about the rationale here...

kusma commented 4 years ago

RE precision, I think we can address that in a separate patch, as it requires some exploration. The short answer is "I don't really know if we need extra precision" - I think my rule of thumb has been anything relating to phase/frequency should generally have double precision, but that may have been misguided and could certainly be revisited.

Right. It would be interesting to see if we could try to completely butcher the precision and see how bad it gets. I have a hunch that fp32 is plenty, but if we only compare float do double, it could be hard to tell in a proper way. I often try to remove bits until I can clearly tell the difference, and then add back a few bits for good measure. In this case we only want either 32-bit or keeping 64-bits, but perhaps knowing how many bits are actually useful would be good information?

yupferris commented 4 years ago

I think you're right for the most part. One area of concern is phase precision which I believe can be offset by global song time in seconds, and I think even after a couple minutes there can be noticeable precision loss in 32 bits. For LFO's/oscillators I imagine this being significant.

Otherwise, for pitch (and most areas where we actually use pow) it would probably be fine to do that part of the calculation in 32 bit precision and convert to 64 before accumulating or something.

kusma commented 4 years ago

I think you're right for the most part. One area of concern is phase precision which I believe can be offset by global song time in seconds, and I think even after a couple minutes there can be noticeable precision loss in 32 bits. For LFO's/oscillators I imagine this being significant.

Right, but you just need to do the bias in 64-bit, calculating the pitch in 64-bit shouldn't be needed for that, I think.

kusma commented 4 years ago

Hmm, fails now with a missing symbol "___libm_sse2_log", which I can't reproduce locally..

kusma commented 4 years ago

I guess this means log(10.0) / log(2.0) for some reason didn't constant fold... Hmm..

kusma commented 3 years ago

OK, let's try with some math-constants instead, should be less fragile.

kusma commented 3 years ago

Yeah, that did the trick :)