ionspin / kotlin-multiplatform-bignum

A Kotlin multiplatform library for arbitrary-precision arithmetic
Apache License 2.0

BigDecimal floatValue and fromFloat have same bug as doubles had in issue #130 #134

Closed skolson closed 3 years ago

skolson commented 3 years ago

See the description of issue #130; the exact same bugs are in floatValue and fromFloat. Sorry, I should have caught this during the issue #130 work. Anyway, while unit testing the fix to the two float functions I hit a different bug/feature and am unsure which it is. Given this code snippet:

        f = BigDecimal.fromFloat(Float.MIN_VALUE)
        assertEquals(Float.MIN_VALUE, f.floatValue(true))
        val f2 = f.divide(BigDecimal.TEN)

The value of Float.MIN_VALUE is 1.4E-45. As expected f2 is 1.4E-46. But here is the state inside f2:

f2 = {BigDecimal@2111} "1.4E-46"
 precision = 11
 precisionLimit = 0
 roundingMode = {RoundingMode@1966} "NONE"
 significand = {BigInteger@1937} "14000000000"
 exponent = -46
 scale = -1
 usingScale = false
 decimalMode = {DecimalMode@1903} "DecimalMode(decimalPrecision=0, roundingMode=NONE, scale=-1)"

Are the precision and the significand correct in this case? Or should all those trailing zeroes have been trimmed and the precision set to 2? If this is correct behavior, let me know, because the current code in the narrowing logic expected this result to have precision = 2 and significand = 14. Because it doesn't, the precision-limit check in floatValue() is incorrect, and it is likely that there are cases where the doubleValue() precision-limit check is incorrect for the same reason; both use the precision value in the same way as a limit check.

Let me know what the correct behavior of precision and significand is in the above case, and I'll proceed from there to get fixes and unit tests developed for the above.

ionspin commented 3 years ago

Hi Steven,

First, to answer your question: what you are seeing in your example is correct behavior. Precision is always the number of digits in the significand.

Off topic:

Also, let me explain where that 14000000000 significand after division actually comes from. It's not too relevant for this issue; I'm just using the opportunity to put the algorithm into words, so you can skip it completely if you want. I'll also put it in a wiki.

This only applies to unlimited precision, which is the default behavior when no decimal mode is specified.

In all operations apart from division we don't really need to prepare the operands, because whatever combination of significands we have, the result of applying subtraction, addition, or multiplication is going to be correct.

i.e.:

val a = "123".toBigDecimal() // significand = 123, precision = 3, exponent = 2 so 1.23E+2
val b = "456".toBigDecimal() // significand = 456, precision = 3, exponent = 2 so 4.56E+2

the multiplication is then executed as:

1. Take the significands and multiply them: 123 * 456 = 56088 (5 digits, so the precision is going to be 5 in the end)
2. Take the exponents and add them: 2 + 2 = 4
3. Resulting BigDecimal: 5.6088E+4, significand = 56088, exponent = 4, precision = 5

There's a bit more code to handle the exponent in some corner cases, but I'll skip that here.
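The multiplication steps above can be replayed with plain integers. Here is a toy Python model of the (significand, exponent) arithmetic; it is illustrative only, not the library's API, and it skips the exponent corner cases just mentioned:

```python
# Toy model of unlimited-precision multiplication on (significand, exponent)
# pairs; e.g. "123" is (123, 2), meaning 1.23E+2. Illustrative sketch only.

def multiply(a, b):
    sig = a[0] * b[0]               # 123 * 456 = 56088
    exp = a[1] + b[1]               # 2 + 2 = 4
    precision = len(str(abs(sig)))  # number of digits in the result: 5
    return sig, exp, precision

print(multiply((123, 2), (456, 2)))  # (56088, 4, 5), i.e. 5.6088E+4
```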

The exception is division: because significands are integers, they only support integer division, which creates two cases we need to worry about.

Case 1, division resulting in a remainder:

val a = Float.MIN_VALUE.toBigDecimal() // Creates a big decimal with significand 14
val divided = a / 10

If we didn't do any preparation of the operands in the division step and just let BigInteger do the division, the result would be 14 / 10 = 1 with remainder 4, so we would get 1 as an answer and then need special handling to continue dividing using the remainder.

So to simplify the case when division results in a remainder, we increase the precision of the dividend by multiplying its significand by a power of ten before dividing.

Currently the formula for that preparation is:

    val power = (other.precision * 2 + 6)
    val thisPrepared = this.significand * BigInteger.TEN.pow(power)

So what we created in the end is a prepared number that has a precision of firstOperand + (secondOperand * 2 + 6). In our example we had a precision of 2 for the first operand and 2 for the second operand, so the prepared operand had a precision of 12 (the number of digits in 140000000000). If we then divide by 10 we get 14000000000, which is the number you are seeing in your tests. The result is always correct; it's just that the precision has been increased.
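As a sanity check, the preparation formula can be replayed with plain integers. This is a hedged Python sketch; `divide_significands` is a hypothetical helper, not the library's code:

```python
# Toy model of the dividend preparation described above: scale the dividend's
# significand up by 10**(divisor.precision * 2 + 6) before integer division.

def divide_significands(dividend_sig, divisor_sig, divisor_prec):
    power = divisor_prec * 2 + 6          # 2 * 2 + 6 = 10 in the example
    prepared = dividend_sig * 10 ** power
    quotient, remainder = divmod(prepared, divisor_sig)
    return prepared, quotient, remainder

# Float.MIN_VALUE as a BigDecimal has significand 14 (precision 2);
# dividing by 10 (significand 10, precision 2):
prepared, q, r = divide_significands(14, 10, 2)
print(prepared)  # 140000000000 (precision 12)
print(q)         # 14000000000  (precision 11), the significand seen in the issue
```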

Case 2, infinite division:

This one is actually more important, consider the following case:

val a = 1.toBigDecimal()
val b = a / 3

The result of that division is mathematically 0.333... repeating to infinity, and we cannot really calculate that or represent it in memory. Because we are in unlimited precision mode, we are claiming to the user that we will return a correct result no matter what the operands are. To handle these situations we use the same approach as in case 1: we expand the precision of the dividend using the aforementioned formula and execute the division. In this case it would look like this:

100000000 / 3 = 33333333 with remainder 1

And that remainder tells us that we are in a situation where we cannot guarantee that the operation will end in finite time, so we throw an exception and ask the user to specify a decimal mode to their own satisfaction. Here is the snippet that does that:

if (divRem.remainder != BigInteger.ZERO) {
    throw ArithmeticException("Non-terminating result of division operation " +
        "(i.e. 1/3 = 0.3333... library needs to know when to stop and how to round up at that point). " +
        "Specify decimalPrecision inside your decimal mode.")
}
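Putting both cases together, the unlimited-precision division flow can be modeled roughly like this in Python. The function name and message are illustrative, mirroring the Kotlin snippet above:

```python
# Toy model of the unlimited-precision division flow: prepare the dividend,
# do integer division, and raise if a remainder is left (i.e. the decimal
# expansion never terminates). Illustrative sketch, not the library's code.

def unlimited_divide(dividend_sig, divisor_sig, divisor_prec):
    power = divisor_prec * 2 + 6
    prepared = dividend_sig * 10 ** power
    quotient, remainder = divmod(prepared, divisor_sig)
    if remainder != 0:
        raise ArithmeticError(
            "Non-terminating result of division operation; "
            "specify decimalPrecision inside your decimal mode.")
    return quotient

print(unlimited_divide(14, 10, 2))  # 14000000000, terminates cleanly
try:
    unlimited_divide(1, 3, 1)        # 100000000 / 3 leaves remainder 1
except ArithmeticError as e:
    print("raised:", e)
```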
skolson commented 3 years ago

Cool. I was thinking of precision as the number of "significant digits" in the mantissa, irrespective of how the mantissa got calculated, and in a BigDecimal the trailing zeroes are not significant (unlike in BigInteger, or in operations on the significand the BigDecimal owns). I'm guessing other people could make the same assumption, so this might be an ongoing source of questions :-)
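The distinction is easy to show concretely. A small Python illustration of "total digit count" versus "significant digits" for the significand from the example above:

```python
# The library counts precision as the total number of digits in the significand,
# trailing zeros included; "significant digits" in the usual sense would strip them.
sig = 14000000000                        # significand from the example above
precision = len(str(sig))                # 11, as the library reports
significant = len(str(sig).rstrip("0"))  # 2, the assumption that broke the check
print(precision, significant)  # 11 2
```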

But since precision can do this, I'll come up with a different way to detect loss of precision during narrowing operations.

ionspin commented 3 years ago

That is a good point and I agree, I'll clarify that in the development wiki.

skolson commented 3 years ago

Oops, didn't get to this today, and am gonna be gone for a while, will do a pull request in a couple weeks.

ionspin commented 3 years ago

No rush, I'll have a crack at it next weekend if I get some free time.

skolson commented 3 years ago

Hey, I'm back from my trip and later today will get up to speed with the work you've done. Thanks!!

ionspin commented 3 years ago

Hey Steven, welcome back! I've added a check for whether the number can be exactly represented, by converting the significand to IEEE 754 binary format; I then count the number of bits in the resulting number and fail if it's more than 23 for float or 53 for double.
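A rough Python model of that kind of representability check (the helper name and the exact rule are assumptions; the library converts to IEEE 754 and counts bits, while this sketch reasons about the value as a reduced fraction):

```python
# Hypothetical helper, not the library's code: a value significand * 10**-scale
# is exactly representable in binary floating point only if, as a reduced
# fraction, its denominator is a power of two and its odd part fits in the
# mantissa (here 24 bits for float: 23 stored plus the implicit leading bit).
from fractions import Fraction

def fits_in_binary_float(significand, scale, mantissa_bits):
    f = Fraction(significand, 10 ** scale)
    d = f.denominator
    while d % 2 == 0:   # strip factors of two from the denominator
        d //= 2
    if d != 1:          # a leftover factor of five means no exact binary form
        return False
    n = f.numerator
    while n > 0 and n % 2 == 0:
        n //= 2
    return n.bit_length() <= mantissa_bits

print(fits_in_binary_float(5, 1, 24))    # 0.5 = 1/2 -> True
print(fits_in_binary_float(14, 46, 24))  # 1.4E-45 -> False, no exact binary form
```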

I think that that will not cover all the cases though. I am fairly certain that we need to add another check for integer precision, because looking at the wiki article here https://en.wikipedia.org/wiki/Single-precision_floating-point_format at the bottom it refers to integer precision, and currently we don't actually check that at all.

Also, I found out that the toString value of MIN_VALUE for both double and float actually cannot be represented exactly in binary format, but that's not immediately apparent because toString does rounding automatically. You can see it discussed here: https://stackoverflow.com/a/12128867/196263

For example MIN_VALUE for float is not 1.4E-45, but 1.40129846432481707092372958328991613128026194187651577175706828388979108268586060148663818836212158203125E-45
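This is easy to verify with Python's `decimal` module: Float.MIN_VALUE is 2^-149, a double holds that value exactly, and `Decimal(float)` conversion is exact, so it reveals the full decimal expansion that Float.toString rounds to 1.4E-45:

```python
# Float.MIN_VALUE is 2**-149. Converting the exactly-held double to Decimal
# shows the full decimal expansion behind the rounded toString value 1.4E-45.
from decimal import Decimal

exact = Decimal(2.0 ** -149)
print(str(exact)[:24] + "...")  # starts with 1.40129846432481707092...
```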

On the other hand, when a BigDecimal is created from a float, it actually parses the float's toString output, so the BigDecimal will be precisely 1.4E-45, and when narrowing, the test to see if it can be converted to float will fail (because it really can't). So I removed those cases from the tests.

ionspin commented 3 years ago

> I think that that will not cover all the cases though. I am fairly certain that we need to add another check for integer precision, because looking at the wiki article here https://en.wikipedia.org/wiki/Single-precision_floating-point_format at the bottom it refers to integer precision, and currently we don't actually check that at all.

Although on the other hand, maybe the current check with bit counting actually covers that as well; I'll need to add some test cases to check for that.