Incorrect result from to_f64 on 0.000875

Tim-Evans-Seequent commented 1 year ago

The first assert here passes, the second fails:

assert_eq!(
    BigDecimal::from_str("0.000875")
        .unwrap()
        .to_string()
        .parse::<f64>()
        .unwrap(),
    0.000875
);
assert_eq!(
    BigDecimal::from_str("0.000875").unwrap().to_f64().unwrap(),
    0.000875
);

I would expect to_f64 to give the same results as to_string followed by parse.

akubera commented 1 year ago

Offending line is: self.int_val.to_f64().map(|x| x * powi(10f64, self.scale.neg().as_()))

I suppose I shouldn't be surprised that a floating point operation shouldn't be used here.

To be explicit, this assertion passes as the second arg is equivalent to the closure above:

assert_eq!(
    BigDecimal::from_str("0.000875").unwrap().to_f64().unwrap(),
    875.0 * 1e-6
);

and this one fails, as you wrote above, as 875.0 * 1e-6 != 875e-6 in floating point

assert_eq!(
    BigDecimal::from_str("0.000875").unwrap().to_f64().unwrap(),
    875e-6
);

I guess more bit-manipulation is required to fix this, as floating-ops should not be trusted.

akubera commented 1 year ago

Starting with BigInt(875).to_f64().to_bits() => 0 10000001000 1011010110000000000000000000000000000000000000000000

The goal is 0.00875f64.to_bits() => 0 01111110100 1100101011000000100000110001001001101110100101111001

Break into floating point components: 0b10000001000 = 1032 0b1011010110000000000000000000000000000000000000000000 | (1<<52) = 7696581394432000

7696581394432000 * 2^(1032 - 1075) = 875.0 (as expected)

To multiply by 10^-6 = 5^-6 * 2*-6

7696581394432000 * 2^{1032 - 1075} * (5^{-6} * 2^{-6}) = 7696581394432000 * 5^{-6} * 2^{1026 - 1075}

Take our risk on floating point math again....

7696581394432000 * 5^{-6} = 492581209243.64794921875

492581209243.64794921875 => 0 10000100101 1100101011000000100000110001001001101110100101111000

and the mantissa part is one bit away from what we wanted ....

1100101011000000100000110001001001101110100101111000
1100101011000000100000110001001001101110100101111001

So that's a bummer

akubera commented 1 year ago

Ah:

BigDecimal::from_str("0.000875").unwrap().to_f64().unwrap().to_bits()
   => 0 011111101001 100101011000000100000110001001001101110100101111000

Which is the same number as the math above.

Yeah, current implementation is off by one bit.

akubera commented 1 year ago

7696581394432000 = 117440512000 << 16 + 0

117440512000 * 5^-6 = 7516192.767999999225139617919921875
  => 0 10000010101 1100101011000000100000110001001001101110100101111000

Shoot, I was hoping the smaller number would allow more digits in the product, but we get the same mantissa. But I guess that can't happen if we "just" shift by 16 bits, as that will only change the exponent (by 16).

akubera commented 1 year ago

If we have a 53 bit number we want to split into two smaller numbers, I think naturally we'd divmod(x, 1<<26), but we can't use power of two, so do we choose a number near $2^{26} = 67108864$, maybe add one 67108865?

>>> divmod(7696581394432000, 1+ (1<<26))
(114687998, 19529730)

or round up to 68000000?

>>> divmod(7696581394432000, 68_000_000)
(113185020, 34432000)

oh! or a power of 5? $5^{12}=244140625 ~= 2^{27.86}$

7696581394432000 = 31525197 * 5^{12} + 95603875

7696581394432000*5^{-6} =  31525197 *5^{6} + 95603875 * 5^{-6}

= 492581203125 + 6118.6480000000001382431946694850921630859375

492581203125 => 0 10000100101 1100101011000000100000101011001011010100000000000000
6118.6480 => 0 10000001011 0111111001101010010111100011010100111111011111001111

Finally, some new mantissas.

Actually, just adding those:

492581203125 + 6118.6480 = 492581209243.64801025390625
  => 0 10000100101 1100101011000000100000110001001001101110100101111001

gives the number we've been looking for (with the 1 bit at the end).

It's too late at night to figure out what to do with these, but I think I'm close.

akubera commented 1 year ago

For completeness.... the 10000100101 exponent in that result means $2^{1061-1075} = 2^{-14}$

And we have our $2^{-6}$ from the $10^{-6}$ of our bigdecimal scale.

So, with the original exponent of 1032 - 1075 = -43 the final exponent is -43-14-6 => -63

-63 + 1075 = 1012 = 0b01111110100

and with that we have reached our goal of f64 with bits: 0 01111110100 1100101011000000100000110001001001101110100101111001

akubera commented 1 year ago

Tracking fix in bugfix/to-f64

Works for normal values but needs more work to detect subnormal (exp-bits==0) or potentially out-of-bounds cases.

akubera / bigdecimal-rs

Incorrect result from to_f64 on 0.000875 #118