Closed mdragon159 closed 3 years ago
There are certain operations (like multiplication and division) that could overflow in the intermediate result, even though the result would be representable in the BaseType. You can look at the implementation of e.g. operator*
.
The background is as follows: a fixed-point number can be thought of as the "real" number multiplied by a constant (e.g. for F16.16, that would be 2^16, or 65536). So the number 1
is stored as 65536 and 1.25
is stored as 81920. Let's call this constant F from now on.
So multiplying two fixed-point numbers A (A_real * F
) and B (B_real * F
) results in A_real * B_real * F * F
. But we want it to be A_real * B_real * F
, so we have to divide the intermediate result by F (and deal with rounding). But depending on A and B, that intermediate result could overflow the 32-bit BaseType, even though the result (A_real * B_real * F
) would not. Hence the need for a larger IntermediateType. There's some other places where this is useful, too (e.g. printing numbers). Of course, for specific inputs, this may not be necessary.
About your use case, let's look at your example of a range of [-20000,+20000] (sidenote: 2km is 200,000cm, not 20,000cm).
So, F16.16 would fit that range, and give a resolution of 153nm. I think for a normal game that resolution is a bit overkill. If your physics update runs at 60Hz, a resolution in the order of 1/120cm should be sufficient, so you could even pick F24.8 as type and get 1/256cm.
As you pointed out, the moment you start doing stuff with these numbers, you may have to convert to another type.
You're right about the range required when squaring. A max range of 400,000,000 doesn't fit in F16.16 (or F24.8), so I would indeed move to a 64-bit BaseType. As explained above, multiplication requires an extra F
in the range, so for F16.16, squaring 20,000 results in an intermediate (raw) value of 20000 * 20000 * F * F
, which fits in 2^61. So you should be fine with a 64-bit IntermediateType, in practice.
However, if your limit is actually 2km (or 200000cm) it no longer fits and you need a 128-bit IntermediateType. Unless you use F24.8 as base type, because the max intermediate value is then 200000*200000*256*256
, which fits in 2^52.
Note that if you do use F24.8 as base type, the moment you do trigonometry on these numbers, you probably want to convert it to F16.16 first for better accuracy.
This is unfortunately the downside of fixed-point numbers. You have to be really mindful of range everywhere, and this may require copious conversions between different fixed-point types.
Please let me know if this explanation helped.
Thank you for the in depth explanation, this really helped! I see what I've been missing with the underlying calculations and the fraction portion - quite frankly - is far less intimidating and mysterious than it was before. You even went so far as to explain the resolution in meters as well as an approach for dealing with lower accuracy with, say, F24.8- both concerns I would have pondered in the future and have had to slowly figure out =)
To emphasize, before your answer the limitations and worries of fixed points seemed extremely vague and abstract to me (even after reading a few relevant articles and such). However, now I have a far better grasp on them such that I can better design around their limitations as well as build better tests and such to detect + handle overflows and too large of a drop in accuracy.
Once again, thank you very much for all of this ^_^
(And tyvm for pointing out that 2km == 200k units! Thankfully I'm not creating an open world game and thus can better keep in mind the technical limitations with the game design. Definitely doable and feeling far more comfortable!)
Question for your expertise for what seems to be an uncommon-but-not-rare situation:
What are the risks and limitations exactly of using the same int type for BaseType and IntermediateType? Eg,
fpm::fixed<std::int64_t, std::int64_t, 16>
I've looked into the fpm code, ran some unit tests, and otherwise messed around with the above for a while, and it seems like the above 64bit with 16 fractional bits type works pretty well. However, I'm not familiar with how the fractional portion works exactly and overall not sure what the limitations and risks of this would be (other than needing to restrict myself to some sub-range of numbers at the risk of very easy underflows and overflows).
Problem Context: Earlier I attempted to a BaseType of int64_t and IntermediateType of boost's int128_t, but unfortunately that's far too slow for my needs (10x slower than fixed_16_16 and far higher than my performance budget as mentioned in #21). Thus, I'm looking for workarounds for my size constraints
Notably, I don't really need a very large working number range- I'm creating a game where typically one unit == 1cm (Unreal units), and as long as I get at least 2km (so 20000) for range of world units then I can easily fit all my needs.
However, that 20,000 max is before any underlying calculations. For example, one of my more problematic operations is getting a length of a vector, which involves squaring and adding 3x. Thus, at the very least my range needs to support 20000^2, and given that's above the range of fixed_16_16 and fixed_24_8, I need to go into the realm of 64bit BaseType numbers.
And now we're unfortunately back to the beginning where a BaseType of int64_t needs a size above it, and boost's int128_t is unfortunately extremely slow.
However... my numeric range isn't that large. Thus, if I drop the need for lossless operations and heavily restrict myself to a very limited range (which will definitely be error-prone but doable), then I can use the same IntermediateType as BaseType. In other words, it should be viable to use
fpm::fixed<std::int64_t, std::int64_t, 16>
(or something akin to it).I've tested this (after commenting out the sizeof static assert), and after some tweaking with number of fractional bits, it's surprisingly working well. On the other hand, I'm not sure exactly what the risks and limitations are.
For example, I'd expect that I'd need a working range that fits the following: [sqrt of range<TOTALBITS - FRACTIONALBITS>]/3
Thus, with
fixed<int64_t, int64_t, 20>
, I'd expect the safe numeric working range to be about... sqrt(2^(64-20)-1)/3 = ~1,398,101.3However, even 45000 * 45000 overflows with that fixed point type, while
fixed<int64_t, int64_t, 16>
does not overflow here.All in all, I'm largely guess and checking with the limitations of this approach at the moment, and thus would greatly appreciate any insight you can provide here.
In addition, I hope this write up will also help someone in the future who faces similar issues as well. Cuz dang, this was a surprising amount of work to even get this far. Makes me hella glad this solid library exists again- yay for well-written and flexible yet easy to understand library!