lib/util: Implement %f to print floating point numbers

titzer commented 2 years ago

Since adding floating point back in 2020, the Virgil utility libraries have not yet added support for printing/rendering floating point numbers (in decimal). In C, the printf format string supports %f for specifying an argument is a floating point number. The analogous place to add code to handle this is in StringBuilder, which can deal with floating point strings.

Slight differences with C printf and Virgil StringBuffer:

StringBuilder uses %d (for decimal) for specifying decimal output of integer values. I think it'd be nice to just extend
StringBuilder doesn't yet support specifying the width (in characters) of the output item, nor left or right-justifying
Performance of printing floats isn't highly important; as long as the routine doesn't allocate memory, it should be fine to do the naive algorithm

I could use some help on this, and it might make a good starter project for someone new wanting to kick the tires with Virgil and contribute.

srackham commented 2 years ago

I've got an implementation working but need some input:

printf's %f renders the fractional part rounded to 6 decimal places with trailing zeros. I think a more readable default format would be rendering to 6 places but truncating trailing zeros after the first decimal place? For example: this 1.0 and not this 1.000000.
Floating point integer parts become unsafe after these values:
```
def MAX_SAFE_DOUBLE = 9007199254740991.0;   // 2^53 - 1
def MAX_SAFE_FLOAT = 16777215.0f;       // 2^24 - 1
```
See, for example, https://stackoverflow.com/a/1848762/1136455

The question becomes how to report unsafe conversions? My current solution is to render the string !UnsafeConversion instead of the number.

titzer commented 2 years ago

Great that you are working in this!

I haven't quite thought through all of the issues here, so happy to have your input.

I was thinking that %f would format floating point values according to the default Java rules which generate the shortest literal string that when parsed and rounded according to the existing string -> float/double rules will produce the exact same floating point bits as the input. Thus (modulo NaNs), parse(print(x)) == x.
Right. Virgil's rules for floating point cast/query from int -> float already respect those limits, so it's built into a check like float.?(x) that it will do a range comparison as well as a trial rounding to make sure the input isn't changed. For small enough integer types, such casts or queries can't fail. Vice versa, a query int.?(x) will check a floating point number if it is in range and can be converted without rounding. That means a cascade of type queries will "do the right thing" if the queries are ordered from the more specific (small ints) to the more general (larger ints, floats). Incidentally, it means that floating point values that happen to represent integer values will already work properly today, as int.?(float) will succeed.

Given that, I think it'd be great if

%d (decimal) also works for floats. Two options there would be to always use a decimal point (e.g. 1.00078, 100093928.092, -0.00000000938) but enough digits to get the property in (1) above, or to allow scientific notation 3e466, because always having a decimal might be hundreds of digits long.
%f did the "Java thing" and always produced the shortest string that gives the right bits.

IIUC your question (2) is what to do if %f is fed an actual integer that could be out of range? If we just relied on the type query cascade, that would fall off the end, because float.? and double.? would fail. That already will create a dynamic error. That'd be OK. We could also just say that those unrepresentable integers will just be printed as decimal, and if the user used that string as a floating point literal, it'd be rounded according to the existing rules. That's probably a little more user-friendly.

titzer commented 2 years ago

Another thought is that having control over the number of decimal places printed is very useful too, so it might make sense to support the printf-style %.Nf format-modifier. (Maybe in this case it'd be %.Nd?)

srackham commented 2 years ago

Yes, that makes sense (the devil is always in the details though).
I was unaware of that functionality, it's very clever. There's a lot more to Virgil's number types than first meets the eye.

In the light of your comments there's something I just don't understand: if putd is called with a float argument 16777217f (MAX_SAFE_FLOAT+2) it prints the incorrect value 16777216 and does not throw a compile-time error:

$ cat t.v3
def main() {
        var b = StringBuilder.new();
        b.putd(16777217f).ln(); System.puts(b.extract());
}

$ v3i t.v3 $VIRGIL_LOC/lib/util/*.v3
16777216

titzer commented 2 years ago

What's going on with 16777217f is that that literal is being rounded at compile time when parsed from a string literal to a floating point value. The same thing happens in Java:

class Put {
    public static void main(String[] args) {
    float f = 16777217f;
    System.out.println("" + f);
    }
}

And the output:

% java -cp . Put
1.6777216E7

srackham commented 2 years ago

Thanks for clearing that up.

titzer commented 2 years ago

For reference on how this might be implemented, here is the Scala port of the relatively-new Ryu algorithm:

https://github.com/scala-native/scala-native/pull/1436

For anyone willing to pitch in on this issue, that's the "swinging for the fences" version. I think we could settle for something that is shorter and easier to read that gives the same results but might be ~4x slower, thus leaving the Ryu port for future work.

srackham commented 2 years ago

Just after my last post around a month ago I dug a little deeper and realised that my "implementation" was naive to the point of embarrassment, then the World intruded and I laid to to one side.

A real implementation is above my pay-grade and I just don't have the motivation or time to wrangle IEEE 754.

Here's my code plus tests plus Makefile.

component RenderFloats {

    def MAX_SAFE_DOUBLE = 9007199254740991.0;   // 2^53 - 1
    def MAX_SAFE_FLOAT = 16777215.0f;           // 2^24 - 1

    // Render double to rounded fixed precision decimal number.
    def renderDoubleFixed(val: double, precision: int) -> string {
        def buf = StringBuilder.new();
        if(val.sign == 1) buf.putc('-');
        val = double.abs(val);
        def fixed = double.floor(val);
        if (fixed > MAX_SAFE_DOUBLE) return "!UnsafeConversion";
        var rounding = 0.5;
        for (i < precision) { rounding /= 10; }
        val += rounding;
        buf.putd(fixed);
        if (precision > 0) buf.putc('.');
        var fract = val - fixed;
        for (i=0; i < precision; i++) {
            var digit = double.floor(fract *= 10);
            buf.putd(int.!(digit));
            fract -= digit;
        }
        return buf.toString();
    }

    // Render double rounded to 6 decimal places.
    // Trailing zeros after the first decimal place are truncated.
    def renderDouble(val: double) -> string {
        def s = renderDoubleFixed(val, 6);
        if (s[0] == '!') return s;
        var i = s.length - 1;
        if (s[i] != '0') return s;
        while (s[i-1] != '.' && s[i] == '0') i--;
        def b = StringBuilder.new();
        for (j < i+1) b.putc(s[j]);
        return b.toString();
    }

    def renderFloatFixed(val: float, precision: int) -> string {
        def fixed = float.floor(float.abs(val));
        if (fixed > MAX_SAFE_FLOAT) return "!UnsafeConversion";
        return renderDoubleFixed(double.!(val), precision);
    }

    def renderFloat(val: float) -> string {
        def fixed = float.floor(float.abs(val));
        if (fixed > MAX_SAFE_FLOAT) return "!UnsafeConversion";
        return renderDouble(double.!(val));
    }

}

The tests:

def ERRMSG = StringBuilder.new();

def main() -> int {
    testRenderDoubleFixed();
    testRenderDouble();
    testRenderFloatFixed();
    testRenderFloat();
    def msg = ERRMSG.extract();
    System.puts(msg);
    return if(msg.length == 0, 0, 1);
}

def testRenderDoubleFixed() {
    def cases: Array<(double, int, string)> = [
        (0.0, 0, "0"),
        (0.0, 1, "0.0"),
        (3.141592653589793238, 0, "3"),
        (3.141592653589793238, 1, "3.1"),
        (3.141592653589793238, 3, "3.142"),
        (3.141592653589793238, 6, "3.141593"),
        (RenderFloats.MAX_SAFE_DOUBLE, 2, "9007199254740991.00"),
        (RenderFloats.MAX_SAFE_DOUBLE+1, 6, "!UnsafeConversion"),
        (-0.0, 0, "-0"),
        (-0.0, 1, "-0.0"),
        (-3.141592653589793238, 0, "-3"),
        (-3.141592653589793238, 1, "-3.1"),
        (-3.141592653589793238, 3, "-3.142"),
        (-3.141592653589793238, 6, "-3.141593"),
        (-RenderFloats.MAX_SAFE_DOUBLE, 2, "-9007199254740991.00"),
        (-(RenderFloats.MAX_SAFE_DOUBLE+1), 6, "!UnsafeConversion")
    ];
    for (t in cases) {
        testEqual(t.2, RenderFloats.renderDoubleFixed(t.0, t.1));
    }
}

def testRenderDouble() {
    def cases: Array<(double, string)> = [
        (0.0, "0.0"),
        (1, "1.0"),
        (10, "10.0"),
        (0.1, "0.1"),
        (1.12, "1.12"),
        (1.1234567, "1.123457"),
        (RenderFloats.MAX_SAFE_DOUBLE, "9007199254740991.0"),
        (RenderFloats.MAX_SAFE_DOUBLE+1, "!UnsafeConversion"),
        (-0.0, "-0.0"),
        (-1, "-1.0"),
        (-10, "-10.0"),
        (-0.1, "-0.1"),
        (-1.12, "-1.12"),
        (-1.1234567, "-1.123457"),
        (-RenderFloats.MAX_SAFE_DOUBLE, "-9007199254740991.0"),
        (-(RenderFloats.MAX_SAFE_DOUBLE+1), "!UnsafeConversion")
    ];
    for (t in cases) {
        testEqual(t.1, RenderFloats.renderDouble(t.0));
    }
}

def testRenderFloatFixed() {
    def cases: Array<(float, int, string)> = [
        (0.0f, 0, "0"),
        (0.0f, 1, "0.0"),
        (3.141592653589793238f, 0, "3"),
        (3.141592653589793238f, 1, "3.1"),
        (3.141592653589793238f, 3, "3.142"),
        (3.141592653589793238f, 6, "3.141593"),
        (RenderFloats.MAX_SAFE_FLOAT, 2, "16777215.00"),
        (RenderFloats.MAX_SAFE_FLOAT+1, 6, "!UnsafeConversion"),
        (-0.0f, 0, "-0"),
        (-0.0f, 1, "-0.0"),
        (-3.141592653589793238f, 0, "-3"),
        (-3.141592653589793238f, 1, "-3.1"),
        (-3.141592653589793238f, 3, "-3.142"),
        (-3.141592653589793238f, 6, "-3.141593"),
        (-RenderFloats.MAX_SAFE_FLOAT, 2, "-16777215.00"),
        (-(RenderFloats.MAX_SAFE_FLOAT+1), 6, "!UnsafeConversion")
    ];
    for (t in cases) {
        testEqual(t.2, RenderFloats.renderFloatFixed(t.0, t.1));
    }
}

def testRenderFloat() {
    def cases: Array<(float, string)> = [
        (0.0f, "0.0"),
        (1f, "1.0"),
        (10f, "10.0"),
        (0.1f, "0.1"),
        (1.12f, "1.12"),
        (1.1234567f, "1.123457"),
        (RenderFloats.MAX_SAFE_FLOAT, "16777215.0"),
        (RenderFloats.MAX_SAFE_FLOAT+1, "!UnsafeConversion"),
        (-0.0f, "-0.0"),
        (-1f, "-1.0"),
        (-10f, "-10.0"),
        (-0.1f, "-0.1"),
        (-1.12f, "-1.12"),
        (-1.1234567f, "-1.123457"),
        (-RenderFloats.MAX_SAFE_FLOAT, "-16777215.0"),
        (-(RenderFloats.MAX_SAFE_FLOAT+1), "!UnsafeConversion")
    ];
    for (t in cases) {
        testEqual(t.1, RenderFloats.renderFloat(t.0));
    }
}

def testEqual(expected: string, got: string) {
    if (!Strings.equal(expected, got)) {
        ERRMSG
            .puts("FAILED: expected: ")
            .puts(expected)
            .puts(": got: ")
            .puts(got)
            .ln();
    }
}

Makefile:

MAKEFLAGS += --warn-undefined-variables
SHELL := bash
.SHELLFLAGS := -eu -o pipefail -c
.DEFAULT_GOAL := all
.DELETE_ON_ERROR:
.SUFFIXES:
.ONESHELL:
# .SILENT:
.PHONEY: run build tags

SRC = FloatToString*.v3 /home/srackham/local/bin/virgil/lib/util/*.v3

run:
    virgil run $(SRC)

build:
    v3c-x86-linux $(SRC)

tags:
    vctags $(SRC)

titzer / virgil

lib/util: Implement %f to print floating point numbers #74