godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.17k stars 20.21k forks source link

Bug in `fmod` when `a / b` is integer and `b` is small #49354

Open dalexeev opened 3 years ago

dalexeev commented 3 years ago

Godot version: 3.3.2, 4.0.dev.calinou.94c31ba30

OS/device including version: Kubuntu 21.04 (not related)

Issue description: In the Russian-speaking Godot community, one person complained about the "low accuracy" of fmod and gave the following example:

fmod(55.5, 0.1) != 0.0

I checked and it looks like the error only happens when a / b is integer and b is small. A number close to b is returned instead of 0.0. Perhaps this is due to the compensation for the float error, since in some cases fmod gives a more correct result than x_fmod (see example with fmod(6.5, 1.5)).

Thanks to Alexey Zotov for pointing out the error.

Steps to reproduce:

tool
extends EditorScript

func x_fmod(a: float, b: float) -> float:
    return a - int(a / b) * b

func test(a: float, b: float) -> void:
    var r1 := fmod(a, b)
    var r2 := x_fmod(a, b)
    print("%6.2f %6.2f %20.16f %20.16f %-5s %-5s" % [a, b, r1, r2,
            str(r1 == r2), str(is_equal_approx(r1, r2))])

func _run() -> void:
    test(55.5, 0.1)
    print()

    test(6.0, 1.5)
    test(6.2, 1.5)
    test(-6.0, 1.5)
    test(-6.2, 1.5)
    test(6.0, -1.5)
    test(6.2, -1.5)
    test(-6.0, -1.5)
    test(-6.2, -1.5)
    print()

    test(6.0, 0.5)
    test(6.2, 0.5)
    test(-6.0, 0.5)
    test(-6.2, 0.5)
    test(6.0, -0.5)
    test(6.2, -0.5)
    test(-6.0, -0.5)
    test(-6.2, -0.5)
    print()

    test(6.0, 0.05)
    test(-6.0, 0.05)
    test(6.0, -0.05)
    test(-6.0, -0.05)
    print()

    test(6.21, 0.05)
    test(-6.21, 0.05)
    test(6.21, -0.05)
    test(-6.21, -0.05)
    print()

    test(0.05, 6.0)
    print()

    test(6.5, 1.5)
    print()

    test(6.554_634_545_343, 1.567_684_556_575)
 55.50   0.10   0.0999999999999969   0.0000000000000000 False False

  6.00   1.50   0.0000000000000000   0.0000000000000000 True  True 
  6.20   1.50   0.2000000000000002   0.2000000000000002 True  True 
 -6.00   1.50  -0.0000000000000000   0.0000000000000000 True  True 
 -6.20   1.50  -0.2000000000000002  -0.2000000000000002 True  True 
  6.00  -1.50   0.0000000000000000   0.0000000000000000 True  True 
  6.20  -1.50   0.2000000000000002   0.2000000000000002 True  True 
 -6.00  -1.50  -0.0000000000000000   0.0000000000000000 True  True 
 -6.20  -1.50  -0.2000000000000002  -0.2000000000000002 True  True 

  6.00   0.50   0.0000000000000000   0.0000000000000000 True  True 
  6.20   0.50   0.2000000000000002   0.2000000000000002 True  True 
 -6.00   0.50  -0.0000000000000000   0.0000000000000000 True  True 
 -6.20   0.50  -0.2000000000000002  -0.2000000000000002 True  True 
  6.00  -0.50   0.0000000000000000   0.0000000000000000 True  True 
  6.20  -0.50   0.2000000000000002   0.2000000000000002 True  True 
 -6.00  -0.50  -0.0000000000000000   0.0000000000000000 True  True 
 -6.20  -0.50  -0.2000000000000002  -0.2000000000000002 True  True 

  6.00   0.05   0.0499999999999997   0.0000000000000000 False False
 -6.00   0.05  -0.0499999999999997   0.0000000000000000 False False
  6.00  -0.05   0.0499999999999997   0.0000000000000000 False False
 -6.00  -0.05  -0.0499999999999997   0.0000000000000000 False False

  6.21   0.05   0.0099999999999996   0.0099999999999998 False True 
 -6.21   0.05  -0.0099999999999996  -0.0099999999999998 False True 
  6.21  -0.05   0.0099999999999996   0.0099999999999998 False True 
 -6.21  -0.05  -0.0099999999999996  -0.0099999999999998 False True 

  0.05   6.00   0.0500000000000000   0.0500000000000000 True  True 

  6.50   1.50   0.5000000000000000   0.5000000000000000 True  True 

  6.55   1.57   0.2838963190430004   0.2838963190430004 True  True 

Minimal reproduction project:

dalexeev commented 3 years ago

Likewise with fposmod:

print("%.16f" % fposmod(55.5, 0.1)) # 0.0999999999999969
lawnjelly commented 3 years ago

GDScript fmod simply calls fmod in the C runtime library provided by the compiler / chip instructions, and the results are as you say. So this is not a Godot bug, this is pretty standard in programming.

This comes up periodically and seems to be a mismatch of expectation, due to the difference between maths taught at schools, and the maths that computers use internally for floating point. This is something every programmer has to learn, or at least appreciate.

Computers using floating point math is not guaranteed to be correct, in fact it is more often incorrect than correct. If you want correct results, you should confine your operations to integers.

https://en.wikipedia.org/wiki/Floating-point_arithmetic

dalexeev commented 3 years ago

And why do I need your fpos function then, if my code gives a more correct result (within is_equal_approx)?

If this gross error (compared to 0.1 + 0.2 != 0.3) is a standard, then I suggest adding an alternative function or parameter that will switch between the "standard" and the expected, more mathematically correct, result.

This is something every programmer has to learn, or at least appreciate.

At a minimum, this case should be mentioned in the documentation. All Godot developers cannot be familiar with all C bugs, whether it is considered a standard or not.

lawnjelly commented 3 years ago

Yes we should probably have a section in the docs about it, as it comes up quite often.

https://www.phys.uconn.edu/~rozman/Courses/P2200_15F/downloads/floating-point-guide-2015-10-15.pdf

Section 2.1 : Why don’t my numbers, like 0.1 + 0.2 add up to 0.3?

YuriSizov commented 3 years ago

I don’t think we need a section in the docs, but we can provide additional links there somewhere, since this is not strictly about GDScript or Godot in general, but an issue in all programming languages to some degree.

dalexeev commented 3 years ago

The first step is to add a note to the fmod and fposmod documentation.

dalexeev commented 3 years ago

Agree that the difference between 0.0999999999999969 and 0.0 is much greater than the difference between 0.30000000000000004 and 0.3. High-level languages should be more consistent with expectations, even if they have to fix errors in the lower layers.

Is there a practical use for this case? When exactly the wrong result is required, 0.0999999999999969 instead of 0.0? If never, what is the problem with fixing this error?

It's just a very strange spike when the standard function doesn't work correctly:

test(55.4999999999, 0.1)
test(55.5, 0.1)
test(55.5000000001, 0.1)
 55.50   0.10   0.0999999998999951   0.0999999998999925 False True 
 55.50   0.10   0.0999999999999969   0.0000000000000000 False False
 55.50   0.10   0.0000000000999987   0.0000000001000018 False True 
YuriSizov commented 3 years ago

Is there a practical use for this case? When exactly the wrong result is required, 0.0999999999999969 instead of 0.0? If never, what is the problem with fixing this error?

There is no practical use, it's a limitation of computers, they are notoriously bad with floating point math. I think you know that, so I'm not sure how would you propose we fix that.

aaronfranke commented 3 years ago

@dalexeev @pycbouh Floating point math is actually very precise in binary, but the problem is that decimal numbers cannot be exactly represented in binary (just like how 1/3 is 0.3333... forever in our system).

55.5 is treated as 32+16+4+2+1+0.5, so it can be represented as exactly 55.5. On the other hand, 0.1 has recurring digits when converted to binary, it's treated as 0.0625+0.03125+0.00390625+0.001953125+... and it ends up truncated to exactly 0.1000000000000000055511151231257827021181583404541015625. If you run the fmod operation on 55.5 and that number you will get exactly the result you did.

There is no bug here that Godot can fix, except maybe documentation. If you are working with decimal numbers, you have to just expect that they will not always be represented perfectly. Use is_equal_approx which checks floats for approximate equality (handling slight errors). In this case, you will need is_equal_approx(x, b) && is_zero_approx(x).

With 0.1 + 0.2 != 0.3, the problem is that you're not adding 0.1 and 0.2, you're adding 0.1000000000000000055511151231257827021181583404541015625 and 0.200000000000000011102230246251565404236316680908203125 which gives 0.3000000000000000166533453693773481063544750213623046875 but the most accurate representation of 0.3 is actually 0.299999999999999988897769753748434595763683319091796875. There is no way to "switch between the 'standard' and the expected, more mathematically correct, result" because the inputs are different from the expected 0.1 and 0.2. To clarify, the problem occurs before the addition operation even happens, it's a problem with the decimal numbers not being exactly representable in binary.

dalexeev commented 3 years ago

Use is_equal_approx which checks floats for approximate equality (handling slight errors). In this case

Do you realize that the received value is very different from the expected one?

print(is_equal_approx(0.30000000000000004, 0.3)) # True, ok, 0.1 + 0.2 works correctly
print(is_equal_approx(0.0999999999999969, 0.0))  # False, not ok, fmod(55.5, 0.1) works wrong

What I expect when I see this: fmod(6.0, 3.0), fmod(55.5, 0.5), fmod(55.5, 0.1)

6.0  / 3.0 = 2.0,   the remainder is 0 ( 6.0 =   2.0 * 3.0 + 0.0)
55.5 / 0.5 = 111.0, the remainder is 0 (55.5 = 111.0 * 0.5 + 0.0)
55.5 / 0.1 = 555.0, the remainder is 0 (55.5 = 555.0 * 0.1 + 0.0)

It would be ok if I received 0.00000000000000004 or something similar. But

  6.00   3.00   0.0000000000000000   0.0000000000000000 True  True 
 55.50   0.50   0.0000000000000000   0.0000000000000000 True  True 
 55.50   0.10   0.0999999999999969   0.0000000000000000 False False

While my code (a - int(a / b) * b) is working.

You are just too much fixed at the following lines:

fmod(55.5, 0.1) != 0.0

If this gross error (compared to 0.1 + 0.2 != 0.3) is a standard

This is not the problem, I do not propose to fix it, is_equal_approx solves this problem. I understand that direct comparison may not work in the case of floats. The problem is that even is_equal_approx ceases to consider these values equal, because they are really different. The resulting value is significantly different from what is expected.

print(fmod(55.5, 0.1) == 0.0)                # False, ok
print(is_equal_approx(fmod(55.5, 0.1), 0.0)) # False, not ok
lawnjelly commented 3 years ago

You can see here even in a standard compiler fmod(55.5, 0.1) the result is 0.099999: https://godbolt.org/z/KezGPTfzz

Perhaps you can investigate by asking some experts - runtime library authors or ask on stack overflow etc, and ask them why they do not use your version (a - int(a / b) * b).

See also e.g.: https://cs.stackexchange.com/questions/13810/what-is-the-reason-of-inaccuracy-of-operations-on-float-numbers

aaronfranke commented 3 years ago

You are just too much fixed at the following lines:

@dalexeev Please re-read my comment more closely.

In this case, you will need is_equal_approx(x, b) && is_zero_approx(x).

kleonc commented 3 years ago

@dalexeev The main problem here is not the floating point (in)precision but the math done wrong. You conclusions are invalid. Result of fmod(a, b) function is not in the real numbers space (I doubt I'm using proper formal math language here), it's in the modulo-b space. So if the correct result of fmod(a, 1.0) is 0.0 but you're getting 0.99 then it doesn't mean that absolute error is |0.0 - 0.99|. The 'real' absolute error is in the modulo-1.0 space too and thus it equals to min(|0.0 - 0.99|, |1.0 + 0.0 - 0.99|) = 0.01. And is_equal_approx works in the real numbers space so you can't just verify the result of fmod the way you do it.

In the similar manner you could wrongly conclude that e.g. Vector2.angle() method also doesn't work properly:

print("%.30f" % Vector2(-1, 0.01).angle())  #  3.131592988967895500000000000000
print("%.30f" % Vector2(-1, 0.00).angle())  #  3.141592741012573200000000000000
print("%.30f" % Vector2(-1, -0.01).angle()) # -3.131592988967895500000000000000

But again, such conclusion would be based on the wrong assumptions/expectations.

dalexeev commented 3 years ago

I understand that the wrong (in some cases) result is the standard. But this behavior greatly devalues ​​this function. Ultimately, all that matters is that the result is wrong. Not imprecise, but highly imprecise.

We can replace fmod with fmod_std and fmod_math. I would rather be able to use an even slower option than worry about the function behaving incorrectly at certain values.

And, as already said, we need to add a note to the documentation anyway.

So if the correct result of fmod(a, 1.0) is 0.0 but you're getting 0.99 then it doesn't mean that absolute error is |0.0 - 0.99|.

No, this is called abstraction flow. When we say that the result can differ significantly from the mathematical one, this is slyness. You can say that 0.1 + 0.2 = 0.300...004, but you cannot say that 0.1 + 0.2 = 0.4. It is normal for calculations to stop being accurate when values become too large numbers or too small fractions, when it simply cannot be represented exactly. But it’s very bad when expectations break down on such mundane values. We can represent the numbers 55, 0.1 and 0 with sufficient accuracy, so it makes sense that the user would expect the operation to work correctly for these values.

Epilogue. float is one of the few things in programming that I deeply dislike. You need to be at least a mathematician with a good knowledge of float standards in order not to fall into its traps. At the same time, all programmers are forced to use float, regardless of their field of activity and skill level. This is one of those things that is actually much more difficult than it sounds. Everyone complains about problems with null, recursion, regular expressions, etc., and few people point out how difficult float is and how to avoid problems associated with it.

aaronfranke commented 3 years ago

@dalexeev Again though, when you write fmod(55.5, 0.1), you are actually writing fmod(55.5, 0.1000000000000000055511151231257827021181583404541015625). Even though in math 55.5 ÷ 0.1 = 555 and therefore the remainder is zero, in programming 55.5 / 0.1 becomes 55.5 / 0.1000000000000000055511151231257827021181583404541015625 which is actually 554.999something. If we do the truncated division and remainder, that's 554 + 0.999something. This also means that this problem is not isolated to fmod, since casting the result of that division to an integer will give 554.

If you want to reliably check if something is a multiple of 0.1, you need to use the code snippet I posted.

dalexeev commented 3 years ago

I understood.

One of the main problems with float calculations is that it does not take into account the current precision of a number. Actions on the mantissa and exponent are performed with all available digits, without dividing them into exact and inaccurate.

In the case of fmod, this reliance on all bits of the imperfect representation is fatal.

dalexeev commented 3 years ago

If you want to reliably check if something is a multiple of 0.1, you need to use the code snippet I posted.

Okay, so why can't the GDScript implementation do this check?

func fmod(a: float, b: float) -> float:
    var x := Math.fmod(a, b)
    if is_equal_approx(x, b) and is_zero_approx(x):
        return 0.0
    return x

Or at least add an alternative function (fmod_safe) to do this check? Even if it is slower, I am ready to accept it.

timothyqiu commented 3 years ago

@dalexeev The current precision is honored, but 0.1 has an infinitely repeating fraction in binary.

Like the result of decimal 1 ÷ 7 = 0.142857142857... is repeating the 142857 part. The result of binary 1 ÷ 1010 (decimal 1 ÷ 10) = 0.000110011... is repeating the 0011 part. The float64 format provides 52 significant figures, so the best you can store is 0.0001100110011001100110011001100110011001100110011001101 (which is 0.1000000000000000055511151231257827021181583404541015625 if you convert it to decimal, and still prints 0.1 as the print function shows about 6 digits by default).

Your a - int(a / b) * b version is "mathematically correct" by accident because it introduces extra rounding errors during the calculation: the value has to be truncated to 52 significant figures after evaluating each operator. These rounding errors are what fmod is trying to avoid.

a / b: 110111.1 ÷ 0.0001100110011001100110011001100110011001100110011001101

1000101010.11111111111111111111111111111111111111111111011101010100000 ... also infinitely repeating
1000101011.000000000000000000000000000000000000000000 up to 52 digits, extra digits rounded

int(a / b) * b: 1000101011 * 0.0001100110011001100110011001100110011001100110011001101

110111.1000000000000000000000000000000000000000000000001101111 not decimal 555
110111.1000000000000000000000000000000000000000000000 up to 52 digits, extra digits rounded
aaronfranke commented 3 years ago

@dalexeev This very likely shouldn't be done in GDScript's fmod, as it makes things slower and may not even be desired in some cases. An alternative function could be considered, but I think the best option is just to define this function in projects that need it. Maybe you could propose adding this function to Goost or some other project?

dalexeev commented 3 years ago

Maybe you could propose adding this function to Goost or some other project?

If the official documentation will have a link to the correct solution to the problem, I agree.

reduz commented 3 years ago

I suggest we add a warning (that can be disabled) in GDScript when users do float == 0 or float != 0. Maybe even against just int. This happens often and is unexpected enough that users end up confused.

Something like "Comparison of floating point against integer value may fail due to natural inaccuracy of decimals. Suggest is_equal_approx(a,b) instead." or similar. Tagging @vnen

dalexeev commented 3 years ago

Let's just add a warning to the documentation and close this issue. Since this problem really arises even at the stage of representing the literal of a number in binary form, before that fmod starts computation. This is a float issue, not Godot.

But, I believe that this really should be a "WARNING:", not a "Note:", because this behavior can cause not just small inaccuracies, but give qualitatively different, not expected, wrong results.