Refactor BigDecimal - Githubissues

straight-shoota commented 3 years ago

While looking into current issues around BigDecimal (#10502, #5714, #10599, #9547, #9578) it became apparent that the current implementation is lacking.

Our implementation is based on https://github.com/akubera/bigdecimal-rs but worse. And even that is IMO not the best role model.

A pretty extensive implementation with great documentation is Python's decimal library. Julia's Decimals.jl is another good example. There is actually a technical standard for decimal floating point numbers in IEEE 754.

A few observations about the data format in comparison with Python/Julia implementations:

BigDecimal#scale is an unsigned integer and represents the amount of digits the decimal point is shifted to the left, i.e. a negative exponent to the base of 10. The number is described by the formula value * 10 ** -scale. In the other definitions, the scale is a signed exponent and allows shifting the decimal point in both directions. Large numbers could already be expressed by a large enough BigInt value for the coefficient, but a signed exponent expresses precision. The formula is value * 10 ** exponent. So scale is the same value as exponent but with opposite sign.
In the other implementations the coefficient is always positive and a separate sign flag determines wether the value is positive or negative. The practical effect is that there are signed zero values (like in binary floating point types).
Similarly, special values Nan, +Infinity, -Infinity are missing in our implementation.

The specs for BigDecimal are pretty scarce for most methods, so I wouldn't be surprised to discover more bugs like (#10502).

The other Big* types are mostly wrappers for libgmp, so there's less chance for error on our side. BigDecimal is the only type implemented in Crystal.

asterite commented 3 years ago

It would be really great to have decimal implemented exclusively in Crystal. If that were the case, we could even introduce a literal for them. They are very useful, for example for representing money amounts. C# has a built-in decimal type.

HertzDevil commented 3 years ago

Finite precision, or arbitrary precision? (C#'s is the former, but it isn't one of the IEEE decimal formats either)

asterite commented 3 years ago

I don't know

straight-shoota commented 3 years ago

Finite precision floating-pount decimal numbers should be relatively trivial to implement.

For representing monetary values however, you would better use a fixed-point data type than floating-point (when precision is finite). There's no point in being able to represent values in larger orders of magnitudes, when the primary concern is to be exact down to a certain number of decimal digits (usually 4 in monetary applications). I don't see too much value for such a data type, at least not in stdlib. In the end, it's just the same as using an integer data type and interpreting the value 1 as a tenthousand's of the respective monetary base unit.

Implementing arbitrary precision floating-point decimal numbers essentially means we have to implement arbitrary precision integers (BigInt) in Crystal, too. That might be nice as a long term goal, but I don't think there's much value in that. Reusing an existing library like libgmp is fine and saves us a lot of trouble.

Sija commented 3 years ago

There's no point in being able to represent values in larger orders of magnitudes, when the primary concern is to be exact down to a certain number of decimal digits (usually 4 in monetary applications).

That's only correct for FIAT currencies. Cryptocurrencies can have as much as 18 decimals (see Ether for example).

HertzDevil commented 3 years ago

Apart from making scale signed we should also make it so that all BigDecimals are normalized, i.e. powers of 10 are always bookkept by scale, whereas value (which is really the mantissa or significand) is never divisible by 10, as if every constructor invokes factor_powers_of_ten. Other than that I think the current BigDecimal is actually fine.

I don't think BigDecimal needs to port all the features of IEEE 754 floats either.

straight-shoota commented 3 years ago

Technically, non-normalized values could be used to express precision. But we don't offer access to that and many implementations already apply normalization, so making it the default sounds good.

seyerian commented 3 years ago

I just created #11076 which is related to how BigRational is converted to BigDecimal using core methods. That's not a huge issue but I thought it was worth a mention here.

Are there any plans for BigDecimal beyond this GitHub issue? I'm interested in contributing here, if this is an appropriate place to start.

crystal-lang / crystal

Refactor BigDecimal #10641