mgc8 opened this issue 1 year ago
This issue also affects printing stuff via print()
Related: #76573 (and other similar issues linked there).
I was able to track this issue down to the function num_scientific in /core/string/ustring.cpp.
We have the big write function in /core/variant/variant_parser.cpp, which does this:
Error VariantWriter::write(const Variant &p_variant, StoreStringFunc p_store_string_func, void *p_store_string_ud, EncodeResourceFunc p_encode_res_func, void *p_encode_res_ud, int recursion_count) {
    switch (p_variant.get_type()) {
        (...)
        case Variant::FLOAT: {
            String s = rtos_fix(p_variant.operator double());
            if (s != "inf" && s != "inf_neg" && s != "nan") {
                if (!s.contains(".") && !s.contains("e")) {
                    s += ".0";
                }
            }
            p_store_string_func(p_store_string_ud, s);
        } break;
        (...)
rtos_fix calls rtoss, which does nothing but call String::num_scientific. Up to this point, the value is always passed as a double, so we shouldn't have any problems. However, this is the function in question:
String String::num_scientific(double p_num) {
    if (Math::is_nan(p_num)) {
        return "nan";
    }
    if (Math::is_inf(p_num)) {
        if (signbit(p_num)) {
            return "-inf";
        } else {
            return "inf";
        }
    }
    char buf[256];
#if defined(__GNUC__) || defined(_MSC_VER)
#if defined(__MINGW32__) && defined(_TWO_DIGIT_EXPONENT) && !defined(_UCRT)
    // MinGW requires _set_output_format() to conform to C99 output for printf
    unsigned int old_exponent_format = _set_output_format(_TWO_DIGIT_EXPONENT);
#endif
    snprintf(buf, 256, "%lg", p_num);
#if defined(__MINGW32__) && defined(_TWO_DIGIT_EXPONENT) && !defined(_UCRT)
    _set_output_format(old_exponent_format);
#endif
#else
    sprintf(buf, "%.16lg", p_num);
#endif
    buf[255] = 0;
    return buf;
}
Note the two snprintf/sprintf calls, both using the rather odd format %lg. According to the reference, this "converts floating-point number to decimal or decimal exponent notation depending on the value and the precision", and the l in front does absolutely nothing (the argument is a double in both cases).
My compiler went down the first #if branch, thus using the snprintf version. Unfortunately, that call does not specify any precision, so the result is exactly the problem in the bug:
printf("%lg\n", 123456789.54321);
=> result: 1.23457e+08
This can be mitigated somewhat by adding the precision, like in the sprintf call:
printf("%.16lg\n", 123456789.54321);
=> result: 123456789.54321
However, even this is inferior to the pure and simple %.16f for larger values, where it loses precision earlier, or even to the "normal" scientific notation specifier, %e; compare:
printf("%.16f\n", 123456789123456789.987654321);
=> result: 123456789123456784.0000000000000000
printf("%.16e\n", 123456789123456789.987654321);
=> result: 1.2345678912345678e+17
printf("%.16lg\n", 123456789123456789.987654321);
=> result: 1.234567891234568e+17
printf("%.16f\n", 1.23456789123456789);
=> result: 1.2345678912345679
printf("%.16e\n", 1.23456789123456789);
=> result: 1.2345678912345679e+00
printf("%.16lg\n", 1.23456789123456789);
=> result: 1.234567891234568
Notice how the %lg format "gives up" early and at least a digit of precision is lost with both large and small numbers, for no discernible benefit.
I would say the solution would be either to add precision to the existing call or, better yet, to replace it with %f (or at least %e, if scientific notation is really important for other parts of the code); as far as var_to_str is concerned, however, a simple to_string would seem to be enough... But I don't know the code well enough to make that call; that's for the developers to decide, of course.
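For illustration only, here is a minimal standalone sketch (not Godot code, and just one possible approach) of formatting with the shortest precision that still round-trips, which would keep small values readable while preserving every digit of large ones; nan/inf would still need to be handled beforehand, as num_scientific already does:

#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of the engine: format a finite double with the
// fewest significant digits that still parse back to the exact same value.
std::string format_double_roundtrip(double p_num) {
    char buf[64];
    for (int precision = 6; precision <= 17; precision++) {
        snprintf(buf, sizeof(buf), "%.*g", precision, p_num);
        if (strtod(buf, nullptr) == p_num) {
            break; // Shortest representation that survives a round trip.
        }
    }
    return buf;
}

With this kind of approach, 123456789.5 would come out as 123456789.5 instead of 1.23457e+08, while values that genuinely need all 17 digits would keep them.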
CC @bruvzg @aaronfranke
I've been working around this issue with JSON.stringify() and full_precision = true.
This issue also affects printing stuff via print()
Print should not have the same level of precision as serialization methods. If the user writes 0.1 + 0.2, they would expect to get 0.3. However, due to floating-point error, the result is actually one bit above the closest double value to 0.3, so in some languages this prints as 0.30000000000000004. That is a mistake, because too many digits are printed; the user should not see the unreliable digits. However, we still need to serialize them in order to ensure round-tripping (if this value were saved as 0.3 and read back, the float you get would not be identical to the original).
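A small standalone illustration of that distinction (not Godot code; the precisions are just examples):

#include <cstdio>
#include <cstdlib>

int main() {
    double x = 0.1 + 0.2;
    char display[64], serialized[64];

    snprintf(display, sizeof(display), "%.15g", x);       // "0.3" -- fine for print()
    snprintf(serialized, sizeof(serialized), "%.17g", x); // "0.30000000000000004" -- needed for saving

    printf("display form round-trips:    %s\n", strtod(display, nullptr) == x ? "yes" : "no");    // no
    printf("serialized form round-trips: %s\n", strtod(serialized, nullptr) == x ? "yes" : "no"); // yes
    return 0;
}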
Sure, but that is something that should be configurable at runtime. If the default is to "print" with one decimal, then it should output "0.3". If then the developer wants to print with maximum precision, it should print "0.30000000004" or whatever it is. At the current time, this bug causes the internal value to always be truncated and lost, even when the developer wants to use all the precision! And it's not just about decimal values after the point, it also affects things like timestamps, where the whole (integer) part gets massively truncated -- see "1686693128" becoming "1686690000" in my example. I can imagine no possible scenario in which that kind of behaviour would be desirable...
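For reference, that timestamp truncation is simply the default six significant digits of %g at work; a standalone snippet (not engine code) reproduces it:

#include <cstdio>

int main() {
    double timestamp = 1686693128.0;
    printf("%lg\n", timestamp);    // 1.68669e+09 -> reads back as 1686690000
    printf("%.16lg\n", timestamp); // 1686693128
    return 0;
}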
@mgc8 Yes, I was just responding to that one comment. It's a bug that 1686693128 gets turned into 1686690000. It's also a bug that var_to_str does not save with the unreliable digits as it should.
@aaronfranke Ah, ok, all agreed then. It should be a pretty simple fix to just add .16 to the first snprintf call. It's a bit concerning that the behaviour currently varies so drastically based on the compiler used -- which is probably the reason this is not a bigger problem, as most people use versions of Godot compiled with something other than GCC, where the issue is not as drastic?
In the longer term, it would be good to evaluate the need for that %lg in this code vs. the more capable %f or %e, but as a quick fix at least the above should be quick to implement and test.
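Concretely, the quick fix being suggested would amount to roughly this one-line change in String::num_scientific (illustrative only, not a merged patch):

// Before (branch taken by GCC/MSVC, defaults to 6 significant digits):
snprintf(buf, 256, "%lg", p_num);
// After (same 16-digit precision the other branch already uses):
snprintf(buf, 256, "%.16lg", p_num);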
I took a shot at a PR that fixes this, but... well, you can check the PR directly, but essentially the unit tests and other parts of the engine were written with the assumption that num_scientific was low precision to begin with, so the PR's unit tests are failing. If we can't make num_scientific more precise, I'm not sure what the alternative options are. var_to_string uses num_scientific so that floats can still be seen as floats even in text format.
Still seeing this in Godot v4.4.dev1, both directly using str_to_var and also when setting floats via @export properties (I'm assuming this is because that's how they're parsed back to floats from the editor).
As an example, 123456789 becomes 123457000 in both cases.
Fixed it here: https://github.com/godotengine/godot/pull/96676
This required extensive changes to Variant, the unit tests, and the docs to ensure it works everywhere.
I came across this paper with a summarized history of this topic and several algorithms. Might be of interest to Godot. The float printing right now is bad, but at least it is not an unsolved problem.
https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf
@evanrinehart Thank you for linking that paper! I started to read it but then realized we can just grab an existing implementation. I opened a PR here: #98750, which is much better than my old PR.
Godot version
4.0.3.stable.official.5222a99f5
System information
MacOS Ventura 13.4, M1 Max
Issue description
When converting a float variable to string for serialisation using var_to_str, the expectation would be that the resulting string is simply the exact same value as a string, e.g. 1.15 becoming "1.15". This happens correctly for small values, but as soon as the value goes above four digits, the result is rounded up/down. For values with more than six digits, even trailing digits of the integer part are reduced to zero:
A certain amount of precision loss would be expected with floating point numbers, but not at these small values and so egregiously.
This is particularly problematic when trying to serialise time values as unix timestamps, which are by default all floats; it can be worked around by casting everything to "int" first, but that loses the sub-second precision.
Steps to reproduce
This exemplifies the problem by comparing integer and float serialisation/de-serialisation for the same values:
Result:
Minimal reproduction project
test-var-to-str.zip