Float printing is wrong for large values.

13419596 commented 2 years ago

Context

Floating point printing is incorrect for large values of floats. The values are well within the range of the floats, however an extraneous character is added to the beginning of the float result.

Issue Work-around

The issue seems to affect large floating point numbers. So if the floating point value for f16 is less than 2^11 and less than 2^53 for f32 and f64 the printed result will probably be accurate.

I have also supplied a function to facilitate printing somewhat accurate floats until this issue is resolved. The mantissa result will be somewhat close, but not totally correct because of rounding in the lower digits.

Odin report:

    Odin: dev-2022-09:1e595f2e
    OS:   macOS Monterey 12.5 (build: 21G115, kernel: 21.6.0)
    CPU:  ARM64
    RAM:  16384 MiB

Expected/Current Behavior

Here's a quick snippet from the unit test I wrote. For the larger value floats, the printed float is incorrect.

expected           : 2.344e3
odin printf        : 1.234e+03
expected          : 5.618e306
odin printf        : 4.579e+305
expected           : -5.618e306
odin printf        : -4.579e+305
expected           : 1.234e-300
odin printf        : 1.234e-300
expected           : -1.234e-300
odin printf        : -1.234e-300
work around result : -1.234e-300

Steps to Reproduce

Attached is a unit test to test float printing for the float types. This test relies on the common testing code from the odin repo (Odin/tests/common). It is possible to determine the leading digit for a power of two. The tests here use that to test that the floating point output is correct (up to the first digit) for the different floating point types.

// Must be run with `-collection:tests=` flag
package test_printf_float

import "core:fmt"
import "core:math"
import "core:testing"
import tc "tests:common"

log10_2 :: math.LN2 / math.LN10

get_float_parts :: proc(x: $T) -> (T, int) {
    // This will get you a close result, but may be off a bit because of rounding
    num_digits := int(math.log10(abs(x))+1)
    tens_power := num_digits -1 + (0 if num_digits>=0 else -1)
    tens_mantissa := x / math.pow(10., T(tens_power))
    return tens_mantissa, tens_power
}

get_leading_digits_powers_of_two :: proc(power: int, num_digits: int=1, $F:typeid) -> int {
    //  https://math.stackexchange.com/q/2006181
    x := F(power) * log10_2
  d := int(1. + x)
    exponent := x - F(d) + F(num_digits)
    out := math.pow(10., exponent)
  return int(out)
}

main :: proc() {
  t := testing.T{}
    test_printf_float(&t, 5, f16)
    test_printf_float(&t, 5, f16be)
    test_printf_float(&t, 5, f16le)
    test_printf_float(&t, 7, f32)
    test_printf_float(&t, 7, f32be)
    test_printf_float(&t, 7, f32le)
    test_printf_float(&t, 11, f64)
    test_printf_float(&t, 11, f64be)
    test_printf_float(&t, 11, f64le)

    fmt.printf("\n\nFloat printing workaround:\n")
    {
        f := f16(2344)
        m, e := get_float_parts(f)
        fmt.printf("expected           : 2.344e3\n")
        fmt.printf("odin printf        : %e\n", f)
        fmt.printf("work around result : %ve%d\n", m,e)
    }
    {
        f := f64(1<<1019)
        m, e := get_float_parts(f)
        fmt.printf("expected          : 5.618e306\n")
        fmt.printf("odin printf        : %e\n", f)
        fmt.printf("work around result : %ve%d\n", m,e)
    }
    {
        f := -f64(1<<1019)
        m, e := get_float_parts(f)
        fmt.printf("expected           : -5.618e306\n")
        fmt.printf("odin printf        : %e\n", f)
        fmt.printf("work around result : %ve%d\n", m,e)
    }
    {
        f := 1.234e-300
        m, e := get_float_parts(f)
        fmt.printf("expected           : 1.234e-300\n")
        fmt.printf("odin printf        : %e\n", f)
        fmt.printf("work around result : %ve%d\n", m,e)
    }
    {
        f := -1.234e-300
        m, e := get_float_parts(f)
        fmt.printf("expected           : -1.234e-300\n")
        fmt.printf("odin printf        : %e\n", f)
        fmt.printf("work around result : %ve%d\n", m,e)
    }
    fmt.printf("\n\n")
    tc.report(&t)
}

@test
test_printf_float :: proc(t: ^testing.T, num_exponent_bits:uint, $F:typeid) {
    // Tests positive powers of 2, when formatting floats. 
    // The leading digit can be determined analytically and thus compared to the printf'd result
    buf : [400]byte = {}
    buf2 : [400]byte = {}
    int2rune := [?]rune{'0','1','2','3','4','5','6','7','8','9',}
    start_value := 8.
    max_exponent := (1<<(num_exponent_bits-1))-1
    for n in 4..=max_exponent {
        expected_first_digit : int = get_leading_digits_powers_of_two(power=n, num_digits=1, F=F)
        result_string := fmt.bprintf(buf[:], "%v", math.pow(2., F(n)))
        result_first_digit := (int(result_string[0])-48)
        comparison := expected_first_digit == result_first_digit
        tc.expect(t, comparison, fmt.bprintf(buf2[:], "%T 2^(% 4d) expected:%v got:%v - string:\"%v\"", F{}, n, expected_first_digit, result_first_digit, result_string))
    }
}

graphitemaster commented 2 years ago

Odin's handling of floating point printing is very inaccurate around pathological values and is likely a consequence of just getting something that works for the common case and will likely need a new implementation if it wants to be round-trip accurate and conformant with IEC 60559 / IEEE 754 recommendations for conversion of strings to and from float (can contain no more than 1 ULP (units in last place) of error.) The minimum amount of precision needed to round results correctly for a 32-bit float (in base one billion representation) is 53 bits, and 106 bits for 64-bit which we obviously do not do.

Kelimion commented 2 years ago

I wonder how well Jeff Roberts' sprintf fares in this, and whether we could borrow its %f code.

13419596 commented 2 years ago

Odin's handling of floating point printing is very inaccurate around pathological values a

It's not really an accuracy thing. There's just an extra character at front of certain printed float values. The results would be close enough if not for the extra character.

graphitemaster commented 2 years ago

I wonder how well Jeff Roberts' sprintf fares in this, and whether we could borrow its %f code.

Definitely does not pass the requirements. To be fair, neither does MSVC's or glibc. The only libc that I know of that actually passes the requirements of being round-trip accurate is musl's which uses the b1b (base one billion) approach with excess precision for rounding. Oh and ARMs implementation in their C compiler but that's because nsz wrote them both and is like the only person qualified from what I can tell to get all the inane details correct.

Dial0 commented 2 years ago

Odin's handling of floating point printing is very inaccurate around pathological values a

It's not really an accuracy thing. There's just an extra character at front of certain printed float values. The results would be close enough if not for the extra character.

After 2^112 I think the whole number is wrong as the mant overflows here: https://github.com/odin-lang/Odin/blob/6fe1825db9350813a41627273268dc41983eb1f7/core/strconv/generic_float.odin#L43

numbers up to and including 2^112 can be corrected by using the correct value for capacity here: https://github.com/odin-lang/Odin/blob/6fe1825db9350813a41627273268dc41983eb1f7/core/strconv/decimal/decimal.odin#L138

odin-lang / Odin