johnmcfarlane / fixed_point

C++ Binary Fixed-Point Arithmetic
http://johnmcfarlane.github.io/fixed_point/
Boost Software License 1.0
231 stars 37 forks source link

Named arithmetic operator semantics #125

Closed timoalho closed 8 years ago

timoalho commented 8 years ago

The named arithmetic operator semantics aren't very intuitive. I'll use divide as an example, since the problem is sharpest there. Consider

using fp = fixed_point<int32_t, -30>;
fp x{1.0};
fp y{0.5};
auto result1 = y/x;
auto result2 = divide<fp>(y, x);

Intuitively, the expectation would be that result1 == result2 == 0.5. However, since divide is implemented as

return Result::from_data(
            _fixed_point_impl::shift_left<
                    (Lhs::exponent-Rhs::exponent-Result::exponent),
                    result_rep>(lhs.data()/rhs.data()));

the divide flushes and result2 == 0. Therefore, with this implementation, even dividing by one is not necessarily safe. Note also that the only way to have a non-zero result is to cast lhs to a type where its contents have non-zero bits at higher positions than rhs, which requires the user to know implementation details.

The named operators are supposed to provide complete control over the calculation, and I suppose this is why they're implemented such that they carry out the operation on the internal representations of Lhs and Rhs, and then cast the result as the desired output type. This specific implementation makes the most sense for multiply, where it's analogous to multiplying the corresponding integers and if it overflows, then it overflows.

I believe the issue highlighted here is that complete control over the calculation would involve control over not just the output type, but also the intermediate type used in the calculation. One straightforward but slightly ugly solution, would be to define something along the lines of

template<class Result, int IntermediateShift, class Lhs, class Rhs>
constexpr
typename std::enable_if<_fixed_point_impl::is_fixed_point<Lhs>::value &&
                _fixed_point_impl::is_fixed_point<Rhs>::value, Result>::type
divide(const Lhs& lhs, const Rhs& rhs)
{
...
}

where the representation of of lhs is left-shifted by IntermediateShift before carrying out the divide, with the necessary upcast to keep it from overflowing.

This is quite inelegant though, as it exposes the interface through an implementation detail, so I'm not suggesting that this is what should be done, but rather just illustrating the kind of direction that it may be necessary to think in.

Another solution, which would be optimal from an interface standpoint, but would remove the optimization advantages that are a part of the motivation of the named operators (?), would be to just define that divide<fpout>(lhs, rhs) produces the bit accurate result in terms of fpout, up to rounding, and if the final result does not overflow.

johnmcfarlane commented 8 years ago

The main purpose of the named functions - and more generally fixed_point is to keep track of the scale of the value. Nothing else really.

What if the named functions don't take a Result template parameter? What if the return type is the intermediate type? The intermediate type involves zero performance overhead compared to the underlying integer operation. So why perform a cast inside the function that could just as easily be performed inside it?

timoalho commented 8 years ago

That might work, yes. The slight inelegance is that that exposes implementation details (in principle, promoting to a larger type as an intermediate is not the only way to calculate, although that is what all implementations would probably do in practice). On the other hand, it's not too dissimilar to how the native types work, in the sense that an expression changes type according to the promotion rules.

Let me write out a couple of use cases where I'd concretely need these:

using dsp_t = fixed_point<int32_t, -30>; //The type used in internal calculations: 
using sample_t = fixed-point<int32_t, -14>; //The pre-quantization fixed point type for samples going to a DAC
dsp_t result = do_complicated_calculation(...);
//Now, need to convert first to sample_t, and eventually just int32_t in the range [0, 65536]
sample_t out1 = 65536 *  result; //overflows, since the result is cast to fixed_point<int32_t, -30>
sample_t out2 = multiply<sample_t>(65536, result); //overflows in the internal calculation
sample_t out3 = static_cast<sample_t>(multiply(65536, result)); //With the suggested implementation, multiply returns fixed_point<int64_t, -30>. Should work!
//Clamp, round if necessary and quantize down to int16_t
...

Currently I'm doing this with a hack that extracts rep and does the promotion and so on explicitly, this of course breaks the abstraction in fixed_point entirely.

The next one is not exactly what I'm talking about, but I believe this is related: constructing a fixed_point representing a fraction from two integers, which would introduce a new overload of divide:

int32_t value = ... //input from somewhere, always between 0 and range
const int32_t range = 65536; //Just an example of a range
using control_t = fixed_point<int32_t, -30>;
control_t c1 = value/range; //Always results in 0, or 1 if value == range.
control_t c2 = control_t{value}/range; //overflows if value > 1
control_t c3 = fixed_point<int32_t, -15>{value}/range; //should work, but wastes 15 bits of accuracy
control_t c4 = static_cast<control_t>(fixed_point<int64_t, -30>{value}/range); //Works, but requires knowledge of the promotion sequence, or using the internal widening casts in _fixed_point_impl
control_t c5 = divide<control_t>(value, range); //Imaginary overload, would be equivalent to above
johnmcfarlane commented 8 years ago

So correct me if I'm wrong but the first case seems to favor a revised multiply/divide API.

In the second case, I would initialize the value using:

auto c1 = control_t{ fixed_point<int32_t, -16>::from_data(value); };

c2 seems like incorrect usage because the temporary control_t object is initialized with a literal value in range [0..65536] when it's correct range is [0..1]. It is then corrected with the division. You'd get this problem with other scalar types, e.g. if you wanted to initialize a char with an int but it's a worse problem for fixed_point - mostly because users tend to saturate the range much more than they would with a raw integer and this leads to surprising overflow.

c4 is actually a on the right tracks. With both c4 and c5, if you want to avoid overflow, you need to manually widen the operand. With native integers, promotion saves the user from a lot of trouble this way but when the range is saturated and the types are already int, there is always a problem.

So bear in mind that my proposal for changing divide would replace - not overload - the existing function:

template<Lhs, Rhs>
auto divide(Lhs lhs, Rhs rhs) {
  using result = fixed_point<decltype(lhs.data() / rhs.data()), Lhs::exponent - Rhs::exponent>;
  return result::from_data(lhs.data() / rhs.data());
}

Given that API,

auto c4 = control_t{ set_width_t<control_t, 64>{value} / range };
auto c5 = divide(set_width_t<control_t, 64>{value}, range);

It's still not great but I'm liking the new solution so much, I think that I might apply it to arithmetic operators:

template <LhsRep, LhsExponent, RhsRep, RhsExponent>
auto operator*(fixed_point<LhsRep, LhsExponent> lhs, fixed_point<RhsRep, RhsExponent> rhs) {
  using result = fixed_point<decltype(lhs.data() / rhs.data()), LhsExponent - RhsExponent>;
  return result::from_data(lhs.data() / rhs.data());
}

And for multiple, divide etc. I perform the widening a lot like before so that they are the (potentially) slower, more fool-proof versions which take care of the set_width_t<control_t, 64> work.

Sorry to be moving so quickly on this but figuring out the correct return type for operator* and operator/ has been a huge pain-point and I feel like I'm starting to see the light at the end of the tunnel here.

timoalho commented 8 years ago

A couple of quick comments, I'll write more later: yes, the first case favors the revised API, at least over the current implementation. That was my point.

Absolutely, c2 is an abuse of the type, my point was just to kind of show the alternatives that might come to mind, and why each of them fails.

Your suggestion to initialize by

auto c1 = control_t{ fixed_point<int32_t, -16>::from_data(value); };

has some downsides. First of all, picking 65536 as the range was just an example, in my code I'm using a lot of ranges that are not powers of two. Second, and this is admittedly to some extent an aesthetic choice and therefore opinion-based, but I would argue that the less the user has to use from_data and .data(), the better, since whenever that happens the abstraction of fixed_point as a representation a subset of the reals leaks a bit. I realize that it's inevitable for that abstraction to leak at some points (it leaks for the integers too, when overflowing), I'm just saying it should be minimized.

Continuing from the above, I would say that making the operator* and operator/ -versions not promote the intermediate value would be a big mistake (if I understand correctly what you're suggesting): while I appreciate the spirit that the default choice is the zero-cost abstraction, it would become extremely taxing to reason about overflows. Consider that for fixed_point<uint32_t, -16>, multiplying two numbers that are both 1 (or larger) would overflow internally, which is very unintuitive for a type with a maximum value (just shy) of 65536. For fixed_point<int32_t, -30>, every pair of numbers larger than about 0.00006 (more precisely, any pair of numbers whose geometric average exceeds 2^(-14)) will overflow in multiplication. In my code, that would mean replacing every * with multiply and the set_width_t casts, making the library more verbose and probably about as error prone than just doing manual fixed point the old school way. For divide, it would mean that x/y = 0 whenever y > x, and would typically have a large error when x is about the same as y. For example, 1.5/1 would be 1.

Sorry to be pushing you on this, I totally understand your pain. I actually think the current version is right about all the other promotions (fp with int, fp with float, two identical fp's), and the difficult point is mixed exponent fixed point. And I see that it is a genuinely difficult problem.

johnmcfarlane commented 8 years ago

Re. the aesthetic choice, I rather like the from_data() choice for exactly the same reason. Consider the first two lines:

int32_t value = ... //input from somewhere, always between 0 and range
const int32_t range = 65536; //Just an example of a range

What is this in fixed-point terms? Not so much a leaky abstractions as a puddle!

Just as static_cast, const_cast were made deliberately ugly to deter use and telegraph risk so data and from_data does a similar job a the boundary of this abstraction. But yes, that's of little use if the scale isn't a power of two.

As for arithmetic operators, you've talked me out of it for now at least. But since pondering this choice, I noticed that a similar library that I mention in the proposal, @mizvekov's fp, does exactly the same thing!

johnmcfarlane commented 8 years ago

Actually, a little more context is required here. I'm not sure how familiar you are with the elastic<> type but I've been thinking about ways to represent a more general-purpose type using fixed_point<> and a custom integer type which counts its own bits. Consider:

template <typename Archetype, int NumBits>
class fine_grained_integer;
...
auto digit = find_grained_integer<int8_t, 4>{9};  // s4:0 number represented with a byte
auto double_digit = digit * digit;  // s8:0 number represented with an short

Here we have an integer type with bit-by-bit widening that doesn't follow native integer promotion rules.

Now when I compose,

template <typename Archetype, int NumBits, int Exponent>
using elastic = fixed_point<fine_grained_integer<Archetype, NumBits>, Exponent>;

my operator* has some nice properties:

auto n = elastic<int8_t, 7, -4>{7.9375};  // s3:4 number represented with a byte
auto nn = n * n;  // s6:8 number represented with a short

If you run your examples through a (theoretical) type like this, I think you'll find that most of the problems with fixed_point<int> go away. However, I'm not sure how I'd do this with the arithmetic operators as they currently stand.

johnmcfarlane commented 8 years ago

Two more thoughts I've had about my suggested change to the operator overloads:

  1. Currently, arithmetic is partially broken for 64-bit integers because it requires the intermediate to be 128-bits wide.
  2. Elasticity isn't achieved in the above example for division. I think division requires widening no matter what. And it's not just widening; it's widening of the fractional digits. Hmm...
timoalho commented 8 years ago

Regarding elastic: I'm not too familiar with it, but if I understand correctly it's a bit like a "compile time floating point" type, in the sense that it adjusts the internal representation and the number of integer and fractional bits based on the operations done?

The point on 64 bit integers being broken is a good one. Further, as long as there's a largest native type, you will have a similar problem. However, I don't think it outweighs the disadvantages of non-widening multiply. One possible way to go around that is to have the widest native type use a software emulated widening, i.e. a multiply will then cost two explicit multiplies, a number of shifts, and so on, and division will have to do long division. That would be quite inefficient, but on the other hand, I don't see 64-bit base types as the primary use case for fixed point? Also, it's quite laborious to implement, I guess.

Yet another point about the non-widening version: it makes it virtually impossible to write generic algorithms where the arithmetic type is a fixed_point, while having any sensible guarantee on correctness of results. What I mean is that for many algorithms, say FIR or IIR filters, it's possible to give guarantees that the output is correct as long as the input is smaller than the maximum representable number of the type, by some given factor. However, for fixed_point with non-widening multiply this would not hold true, but the specific exponents used would have an effect too (more than just the obvious effect on the representable integer range). Obviously a generic algorithm cannot used the named operators.

timoalho commented 8 years ago

I actually have a specific suggestion for the semantics of the named operators, although I'm not sure if you'll agree. There would be both a templated and non-templated version:

  1. The template version, for example multiply<fixed_point<Rep, Exp>>(a, b) will calculate the product of the fixed point numbers a and b and return the result as fixed_point<Rep, Exp>, doing whatever is necessary (in practice, widening the intermediate type) to guarantee that the product is accurate up to rounding (i.e. it will not round to closest unless the underlying type does so), unless the result overflows. This would be kind of a casting arithmetic operator, if you will. Similar semantics, i.e. template parameter defines return type and the result is accurate up to rounding, would hold for other operators.
  2. the non-template version would be multiply(a, b), which returns whatever fixed_point is the intermediate type used in the calculation. So for example fixed_point<int32_t, -15> * fixed_point<int32_t, -15> would give fixed_point<int64_t, -30>. This would basically have the idea of letting the internals leak a little, in order to allow more control.

I have a specific use case for 1. For this, assume that the default * -operator will keep the larger integer range when multiplying heterogenous exponents, which is what I would advocate and how my current private version behaves:

Now, while estimating the rate of an exponential decay (this is for controlling an analog envelope generator from an embedded system), at one point I end up with a number that is in the range 0 to about 40, so it is stored in fixed_point<int32_t, -25>. To get the final exponent, I multiply this with a compile time constant which is about 0.005, which is stored in fixed_point<int32_t, -38>. When I multiply them, the result will fit in fixed_point<int32_t, -30>. However, the * -operator will give me fixed_point<int32_t, -25>, losing five bits of accuracy. The current named operator won't help, as it will overflow in the intermediate stage.

With the current default for *, the use case would be for the opposite situation, where you have to first cast to a larger integer part to keep from overflowing.

For case 2. I don't have a clear use case, but the point is that that would allow flexibility to get the precise internal result and then do with it what you would like.

Admittedly, this does not provide a way to implement a non-widening multiply, expect by perhaps having a second template parameter as a tag or something. However, to be honest, I'm not quite sure of a use case for it anyway. Maybe if you really need the speed and only have a few fractional bits, which makes it easier to keep track of the point where the internal overflow happens.

johnmcfarlane commented 8 years ago

I took a look at providing overloads with two and three template parameters and it proved problematic. If you specify the first template parameter, the compiler has no way of knowing which of the two overloads you're calling. However, we could always give these more elaborate names.

More generally, there are two very different use patterns for fixed-point and they underpin a lot of the dilemmas I face in designing the API. Your case I call the 'saturated' use pattern because you are using the full capacity of the underlying type at any time. For you, the risk of overflow is ever-present when using int. When you are using raw integers, you must be having to manually widen your results in order to avoid overflow. In that respect, I don't see how you lose much functionality with a non-widening type. E.g.:

uint32 x, y;  // u16:16 fixed-point values
uint32 intermediate = x * y;  // causes overflow if sizeof(int)<=32, making intermediate uint64 doesn't help
uint32 product = intermediate >> 16;

Now consider a helper which widens for you:

template <typename Lhs, typename Rhs = Lhs>
using widen_t = set_width_t<Lhs, width<Lhs>::value + width<Rhs>::value>;

Now you can write:

uint32 x, y;  // u16:16 fixed-point values
auto intermediate = (widen_t<decltype(x)>{x} * y);  // avoids overflow on any architecture
uint32 product = intermediate >> 16;

This code should work if you replace uint32 with fixed_point<uint32, -16> and skip the >>16.

The other type of user will be operating on values with much smaller range where it is far easier to seamlessly avoid overflow. For them, elasticity is very attractive. It is also a more user-friendly type of number. Take a look at the examples in the "Extensible" section here. It also works in your case:

using elastic = fixed_point<fine_grained_integer<uint32, 32>, -16>;
elastic x, y;  // u16:16 fixed-point values
auto intermediate = x * y;  // fixed_point<fine_grained_integer<uint64, 64>, -32>
elastic narrow = intermediate;  // no overflow

Side note: x + y would be a fixed_point<fine_grained_integer<uint64, 33>, -16>. Here you are using a 64-bit integer to store 33 bits. However, you might consider this an advantage because it does avoid a real risk of overflow there too.

I appreciate that this might not convince you. I'm happy to stick with the existing API and maybe experiment with this new approach in a long-term branch. A major problem is that a non-widening type is pretty useless for division - as you have pointed out. I really don't know what to do about that.

timoalho commented 8 years ago

Indeed, I don't lose much (any) functionality with the non-widening type, if I'm willing to explicitly write widen_t everywhere. However, I would see the main benefit of a templated fixed point type being exactly that it frees me from having to specify such low level details, when they are not relevant. With the non-widening type, the only thing it would really help with would be not having to explicitly write/keep track of the shifts.

I'd also like to reiterate my point about generic algorithms, with a non-widening fixed point type there are practically no generic algorithms that do any arithmetic which could remain ignorant of the fact that they're being used with fixed point.

As for the operators with variable number of template arguments, I'm starting to like the term "arithmetic cast" for a combination of an arithmetic operation and changing the type. So how about multiply_cast<somereturntype>(a, b) for the casting version, and then multiply(a, b) would be the version that returns the wider intermediate type, and similarly of course for the other operations?

Am I correct in thinking that operations with elastic are almost the same as if I just worked with fixed_point and the multiply (and similar for other operations) suggested above, except that by declaring, for example, elastic<4, 4> I'm promising to use at most a limited number of bits for the integer part, therefore potentially saving some unneeded widenings?

johnmcfarlane commented 8 years ago

With the non-widening type, the only thing it would really help with would be not having to explicitly write/keep track of the shifts.

Again, that's the primary purpose of fixed_point<>. If you had to manually widen before when using raw integers, you'd have to manually widen now, but at least you don't need to keep track of the radix point. And for many users the non-widening version would support generic algorithms better because x * y * z does not cause a compiler error but instead overflows. Again, that is what happens currently if they use raw integers to do fixed-point arithmetic.

Note that it's only for types using >16 bits of precision that widen becomes absolutely necessary. And then only when using built-in integers - rather than elastic integers. However, that's still a lot of use cases. And elastic integers probably work well when operators are auto-widening.

I've been at work redsigning both operators and the named functions as you suggest. Take a look at the changes to the tests for a good idea of how usage would differ. Note that in this branch, operators do auto-widen. There are upsides and downsides to this approach so I'm certainly not completely sold yet but some of the results are better with either solution, e.g. the increased precision here.

As for multiply_cast<somereturntype>(a, b), or similar suggestions. This becomes a non-issue in my branch. The return type is determined entirely by the inputs for widening versions. All you can do is cast the result in such a way that you lose upper or lower bits.

Am I correct in thinking that operations with elastic are almost the same as if I just worked with fixed_point and the multiply (and similar for other operations) suggested above, except that by declaring, for example, elastic<4, 4> I'm promising to use at most a limited number of bits for the integer part, therefore potentially saving some unneeded widenings?

More like the opposite of this. Firstly, note that I'm going to replace elastic<IntegerDigits, FractionalDigits, Archetype> with fixed_point<elastic_integer<Archetype, Width>, Exponent> but they both behave very similarly.

For fixed_point<elastic_integer<int, 4>, -4> the type guarantees capacity for s4:4 and will use at least an int to store it. But auto-widening always happens. elastic_integer<int, 4>*elastic_integer<int, 4> --> elastic_integer<int, 8> and therefore fixed_point<elastic_integer<int, 4>, -4>*fixed_point<elastic_integer<int, 4>, -4> --> fixed_point<elastic_integer<int, 8>, -8>. This is true no matter whether named function and operator do widening and non-widening or vice versa because elastic_integer promotion doesn't stop at 4 bytes - unlike int. Also, if width > 31, then int64 becomes the underlying type but the Archetype is still int.

You still hit a compiler error when you exceed width of 64 bits. At that point either elastic_integer starts stringing machine words together or there needs to be another type, ever_growing_integer, which has >64-bit specialization. Boost.Multiprecision has such types. They could be used as elastic_integer except that they don't auto widen. So maybe then fixed_point<elastic_integer<ever_growing_integer<int>, Width>, Exponent> becomes the every-widening fixed-point type.

To summarize, there are three concerns necessary to achieve the generic, lossless solution you want:

  1. fixed-point,
  2. elasticity and
  3. unlimited capacity.

I would like for fixed_point to address the first of these concerns. I wish to propose elastic_integer to address the second. I'm not sure whether the third should be addressed also by elastic_integer or a third type but that is orthogonal to the design of fixed_point which is my current focus.

Auto-widening applied to operators is what I'm currently investigating. It has pros and cons when specializing with built-in types. It only gets you part of what you're after.

johnmcfarlane commented 8 years ago

This expression in your exp2 implementation is a very good example of code which doesn't do so great when arithmetic operators are auto-widening. I'm really not sure what to do about this code (especially as I don't fully understand it)! It's a great example because there is plenty of addition and multiplication combined together.

timoalho commented 8 years ago

Just to clarify, in case that we're having a misunderstanding here: I'm not suggesting that the default operators should widen the result type, only that they use a widened intermediate type (just as they do now!) for multiplies and divides. The only case where I'm suggesting more widening than now is the named operators (which currently never widen their intermediate type, to my understanding).

In this case, I don't see the problem with the expression you linked to, each of the multiplies widens to an intermediate int64, and returns an int32. That's of course not as fast as simply doing 32 -bit multiplies, but in this case every one of those would also overflow, so there's no going around it.

As a matter of fact, the named operator I've suggested would allow making use of those bits by using 32-bit multiplies but keeping them in a 64-bit accumulator, which depending on the platform would be marginally faster or slower (shift vs. add with carry and one more register in use), but would probably allow a bit accurate result.

The expression, by the way, is just evaluating a polynomial using Horner's rule, the polynomial being a minimax approximation of exp2 in the interval [0, 1].

I'll return to the rest later, hopefully I'll have a bit more time to look at this next week.

johnmcfarlane commented 8 years ago

I can easily swap the strategy for named functions and operator overloads. That is my preference. Currently neither strategy involves casting to a different exponent, e.g. multiply returns a value with summed exponents.

timoalho commented 8 years ago

Uh, no, what I'm advocating here is that both the named operators and the operator overloads should widen the intermediate type, with the named operators, as suggested above, also allowing either a specified return type (multiply_cast<ret_type>(a, b) and so on) or returning the widened type (just multiply and so on), while the overloads would return the input type if they have the same exponent.

In my suggestions, I haven't actually specified any method to do a calculation which doesn't widen the intermediate. However, as you seem to think that's also important, that could be a third form, multiply_fast or something, which may overflow even if the correct value would fit in the return type.

I haven't looked at the branch you linked to yet, sorry. Maybe I need to wait until I find the time to do so before continuing this discussion, so that I'll now what I'm talking about :)

johnmcfarlane commented 8 years ago

No worries. Right now, I'm trying to get the branch into a state where it's easy to switch between different approaches easily in order to weigh up the pros and cons of each. I'll let you know when it's in a stable state.

Re. non-widening arithmetic: I am keen to provide an API in which the user does not pay for what they don't use. For this reason, a non-widening variant is vital. (I feel that this should be the default choice but right now it is the named function behavior.) Consider:

fixed_point<uint8, -4> a, b;
auto c = a * b;  // fixed_point<uint32, -8>
auto d = c * c;  // fixed_point<???, -16>

What I am proposing is that on 32-bit machines, that ??? remains register-sized. If widening always happens, this last line increases from a single multiply into a multi-instruction, multi-register operation which does nothing to avoid overflow.

Re. widening arithmetic: In the widening case, multiply and divide are always lossless (as illustrated in the arithmetic operator tests in _fixed_pointcommon.h here). Therefore, there is little point specifying the return type because the user can cast or assign as they see fit.

fixed_point<uint8, -4> a, b;

// user choses to store in least memory possible
auto c = static_cast<fixed_point<uint16, -8>>(multiply(a, b));

// user chose to lose precision and risk overflow
auto d = static_cast<fixed_point<uint8, -4>>(multiply(c, c));

I don't see a compelling reason to reintroduce a third choice.

Re. returning the same exponent: This strategy works in limited cases and is disastrous in others:

fixed_point<uint32, 32> a, b;

// fixed_point<uint32, 32> more work is preformed and the result is completely flushed
auto c = a * b;

or

fixed_point<uint32, -64> a, b;
auto c = a * b;  // fixed_point<uint32, -64> same thing

These are relatively rare cases, with 100% data loss. But many common cases still incur some data loss.

This strategy also works badly with integers:

fixed_point<uint32, -32> a;
fixed_point<uint32, 32> b;

// one of these values is completely flushed
auto c = a * 1, d = b * 1;

In all of these cases the user has no way to avoid losing their data. Where widening can ease the problem, it doesn't make it go away. And either way, the computation is at least as costly if not more.

johnmcfarlane commented 8 years ago

Wow, has it been a month? Trying to find a good set of arithmetic operators and a new strategy for the elastic type has proved very challenging but with a little luck, I'm getting close to something acceptable. A brief summary:

  1. The named operators literally perform the given operation on the rep values and return it as whatever fixed_point instantiation is correct for that result.
  2. The plus and minus operators first promote to the type (if any) which fits the range (rather like the equality strategy I suggested) but they do not increment the range by one bit.
  3. The multiply operator adds the exponents - just like multiply - but also widens if necessary to fit the result.
  4. The divide operator widens the LHS to avoid that major underflow problem and the result is the same width as the sum of the operands.
  5. I've knocked up an elastic_integer type which works reasonably well with fixed_point with the caveat that it currently uses too much precision when scaling between rep and value. For this reason, it doesn't work well with 64-bit types unless __int128 is used.

I'm afraid exp2 also requires excessive width to complete currently. I expect this can be fixed but have not got completely familiar with this code to the necessary level yet.

Compared to my previous reply, this more-or-less reverses the role of named functions and operator overloads. The named functions do the minimum work possible and never widen beyond the implicit promotion rules of the rep type but also underflow and overflow quite easily. The arithmetic operators conversely are less prone to overflow and generally don't underflow but tend to use wider types more often.

I understand this isn't your preferred solution but it's relatively safe in general use (avoiding the pitfalls I mentioned last time), has the fall-back of named functions for super-lean computation (not to mention @= compound assignment operators) and starts to be usable with types that avoid the common causes over overflow and data loss altogether.

If you get time to take a look, I would very much appreciate your feedback: https://github.com/johnmcfarlane/fixed_point/pull/141

johnmcfarlane commented 8 years ago

Branch, #141 was merged a while ago. While this doesn't address the original issue, it provides an operator/ which does and makes it (hopefully) far less likely to experience the kind of gotchas mentioned, e.g. the .5 / 1. == 0.

I'd still be very interested to hear how you get on with this change if you decide to use it. There is always room for change - especially when based on feedback from user experience.

johnmcfarlane commented 8 years ago

Hi Timo, Just a quick update. I've tried seeing how the current API sits with me and I just cannot get comfortable with the idea of widening beyond int as the default. I've suspect that elastic_integer specialization seems to fit the bill for a lot of people, named functions can offer the middle ground where auto-widening is performed with built-in integer types and then finally for the performance junkie the default behavior with all its overflow problems.

With the possible exception of divide I'm not even convinced that there is a need for the named functions.

I'd still be interested in your thoughts on the elastic approach. (Also, I need a better name!)