Make division operator less error prone without a breaking change

bluenote10 commented 3 years ago

Describe the project you are working on:

Not so relevant in this case.

Describe the problem or limitation you are having in your project:

Encountering bugs caused by ambiguity of float vs integer division.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:

This is a follow up proposal to #1866. Overall I think @dalexeev's proposal would be a much cleaner solution purely from a language design perspective, but has the downside of:

the breaking change,
people coming from statically typed languages expect a single division operator (which however can be turned around to: people coming from dynamically typed languages expect float and integer operators to be different operators).

The idea of this proposal is:

Make no change of the behavior of /, i.e., keep a single division operator.
Provide some linting rules that help to catch the majority of the typical division bugs arising in dynamic contexts.

The fundamental question is if GDScript is closer to a dynamically or a statically typed language.

Python 3 and GDScript both support type annotations, but there is an important difference. In GDScript type annotations actually lead to type conversions, whereas in Python they don't:

Python

def f(x: float):
    print(type(x))

f(42)
# prints: <class 'int'>

GDScript

func f(x: float):
    print(typeof(x) == TYPE_REAL)   # should be called TYPE_FLOAT

f(42)
# prints: True

This also means that in Python having the classic single / division behavior would be a bug even with type annotations:

def f(a: float, b: float):
    # type annotations don't convert potential ints to float, i.e., the division would not necessarily
    # be a float division
    assumed_float_division_result = a / b

f(5, 2)

Compared to GDScript with type annotations:

func f(a: float, b: float):
    # type annotations do convert anything to float, i.e., the division is guaranteed to be a float division
    var assumed_float_division_result = a / b

f(5, 2)

Thus, the way type annotations work in GDScript bring it closer to a statically typed language. In general I would conclude that the division semantics are sane if the types of the operands are known -- like it is the case in statically typed languages.

This leaves the case when operands have unknown types. Omitting type annotations opens the door for the classic division bug of dynamically typed language: Integer values creep into a place where they aren't expected and thus the wrong return type is obtained:

func f(a, b):
    # the classic division bug
    var assumed_float_division_result = a / b

f(5, 2)

The idea of this proposal is to at least highlight / catch such cases via a linter rule producing something like:

var assumed_float_division_result = a / b
# Encountered error-prone division with unknown operand types. Consider giving the operands an explicit type.

There would be no linter warning if:

both operand types are known (which includes the common int / int case)
only one of the operands is known and it is a float. This case should be fine, because integer division can already be excluded.

The latter would allow to silence the warning in the line above via the typical:

var assumed_float_division_result = float(a) / b

There are some use cases where users really want to have full dynamic typing (see this as an example), in which case they really cannot annotate the types of the operands at all. In such a case the user could explicitly silence the linter to express "trust me, I know what I'm doing here". To illustrate on the generic normalize_values function:

func normalize_values(values, reference_value):
    var new_values = []
    for value in values:
        # Deliberately apply the "multiply by 1.0" trick to convert integers to float
        # in a duck typing compatible way. Requires to tell the linter that we know
        # that this untyped division is fine.
        new_values.append(value * 1.0 / reference_value) # nolint
    return new_values

(I don't know if there is already a mechanism to silence the linter on a per line basis. Just using the # nolint as an example here.)

The feasibility of this proposal depends on whether GDScript's type inference system is already strong enough to infer the type in non-trivial cases, i.e., could it see that function_returning_float() * 1.0 * local_int * local_float is of type float?

This basically leads back to the question: Is GDScript statically typed "enough" to afford a / operator like in true statically typed languages, or is its nature still more dynamic?

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

See above.

If this enhancement will not be used often, can it be worked around with a few lines of script?:

Language property.

Is there a reason why this should be core and not an add-on in the asset library?:

Language property.

dalexeev commented 3 years ago

Note that the proposed change to the behavior of the division operator does not actually break compatibility as much as it might seem at first glance. See comment for details.

akien-mga commented 3 years ago

That's a good proposal, but it is actually partially implemented already:

    var hours := 1337
    var days = hours / 24
    print(days)

raises:

W 0:00:00.657   Integer division, decimal part will be discarded.
  <C++ Error>   INTEGER_DIVISION
  <Source>      level.gd:13

I say partially because it seems to only happen when at least one of the operands has a type hint (here inferred to int). This doesn't trigger any warning:

    var hours = 1337
    var days = hours / 24
    print(days)

This might be considered a bug.

Aside from that I agree with your analysis about the (increasingly) statically typed nature of GDScript (even more so GDScirpt 2.0) which makes it more appropriate for it to behave like a statically typed scripting language (like C#) than a fully dynamically typed scripting language (like Python 3).

Reflecting upon type hints and their different behavior in GDScript and Python 3, it's also worth noting that type hints were added in Python 3.6 so they were not part of the decision making for introducing the // operator. And your further explanations on how type hints don't coerce types in Python definitely make the need for separate operators very clear.

dalexeev commented 3 years ago

I say partially because it seems to only happen when at least one of the operands has a type hint (here inferred to int).

This might be considered a bug.

@akien-mga That is, after fixing the bug, the following code should produce Integer Division warning?

var a = 5 / 2

Thus, the purpose of Integer Division warning is to familiarize the user with the mathematically unexpected behavior of the division operator and then be disabled in the project settings?

bluenote10 commented 3 years ago

Thus, the purpose of Integer Division warning is to familiarize the user with the mathematically unexpected behavior of the division operator and then be disabled in the project settings?

I don't fully understand the current Integer Division warning yet, but it doesn't make so much sense to me. In general, the linter shouldn't be a tool to teach the language, but rather a tool to hint for potential bugs. If we fully commit to the current / division, we at some point have to assume that users mean what they have written. Then it doesn't make sense to hint every int / int division to remind the user that this is an integer division. Basically such a lint would merely tell the user that the language has a design bug ;).

In any case, under my proposal 5 / 2 would be no warning because both types are statically known, and then there can be at least be no bugs from dynamic typing -- only from lack of language behaviour knowledge.

The current Integer Division warning also indicates that perhaps 5 // 2 would be clearer ;)

bluenote10 commented 3 years ago

@reduz

Having seen your thoughts on IRC, I'd really like to reach out to you. While I fully trust you in so many things, I feel like you may be missing an important aspect in this particular case.

It's the same thing either way, you can hit a bug dividing two integers and getting a float unexpectedly, or dividing two integers and getting an integer unexpectedly.

As far as I can see, this really isn't the case. The source of bugs is not just about a developer expecting one behavior, while it is the other way around. The core issue is the "integer-finding-their-way-into-expressions-bug", and that only exists with the single division operator. This bug cannot occur with the two division solution! So it is wrong to conclude you can go either way, and you just get the bugs from mixing up the behavior. The source of bugs really is asymmetric. There is entire class of bugs that can only occur in one direction, but not in the other. This is not a "tabs vs spaces" discussion as many mistakenly view it.

Incidentally the original issue triggering the proposal was exactly such a "integer-finding-their-way-into-expressions-bug", in this case in disguise of replacing an internal type having changed from Vector2 to Vector2i (the viewport) which had messed up the semantics of an internal division (dividing the x / y of the vector). A nasty time-wasting bug that would have been entirely preventable. Note that with the current division operator we would have to write float(vector.x) / vector.y everywhere to be safe w.r.t. to any potential Vector2 => Vector2i refactoring, which is totally crazy.

Since you haven't touched the "integer creeping in problem" in your reasoning, it makes sense to arrive at the conclusion:

I also dislike this from python 3 and feel it was completely unecesary

I tried my best to illustrate this problem in the example above, but perhaps PEP 238 does a better job:

Python doesn't have argument type declarations, so integer arguments can easily find their way into an expression.

The problem is particularly pernicious since ints are perfect substitutes for floats in all other circumstances: math.sqrt(2) returns the same value as math.sqrt(2.0), 3.14100 and 3.14100.0 return the same value, and so on. Thus, the author of a numerical routine may only use floating point numbers to test his code, and believe that it works correctly, and a user may accidentally pass in an integer input value and get incorrect results.

After using Python for 15 years, I am absolute certain that this has prevented many bugs, and it was a wise decision. In the early days it was a typical issue to run some_algorithm(epsilon=0.001) successfully, but running some_algorithm(epsilon=1) suddenly lead to weird/buggy behavior due to the integer creeping into an internal division which had switched the division semantics. Note that these bugs are silent, and were quite tedious to identify. It also often raised the question who is supposed to solve the problem, because library authors would argue that you are not supposed to pass in integers. PEP 238 has solved these issues for good.

As long as you view the Python 3 decision as "completely unnecessary", you do not seem to have a good foundation to make an informed decision.

That being said, what the example also illustrates however, is why it is much less of an issue in GDScript if type annotations are involved thanks to type conversions.

The question you may have to answer for yourself: Would you like to support writing numerical routines without type annotations?

If so, splitting the division would be highly recommended to prevent the "integer-creeping-in-problem".
Otherwise the above proposal should be sufficient to catch many bugs and force users more towards using GDScript in a statically typed style.

dalexeev commented 3 years ago

Having seen your thoughts on IRC, I'd really like to reach out to you. While I fully trust you in so many things, I feel like you may be missing an important aspect in this particular case.

Yes it would be nice. Unfortunately, I am not very accurate in expressing my thoughts in English, and much of this is unclear to those who do not yet see the problem. But since you understand what I wanted to say and you also have your own arguments, you could better explain this.

After using Python for 15 years, I am absolute certain that this has prevented many bugs, and it was a wise decision.

Yes, this decision should be made by several people, not looking back at the shouts of the short-sighted crowd.

aaronfranke commented 3 years ago

After using Python for 15 years, I am absolute certain that this has prevented many bugs, and it was a wise decision. In the early days it was a typical issue to run some_algorithm(epsilon=0.001) successfully, but running some_algorithm(epsilon=1) suddenly lead to weird/buggy behavior due to the integer creeping into an internal division which had switched the division semantics. Note that these bugs are silent, and were quite tedious to identify. It also often raised the question who is supposed to solve the problem, because library authors would argue that you are not supposed to pass in integers. PEP 238 has solved these issues for good.

To me, the problem is entirely that this happens in Python:

def f(x: float):
    print(type(x))

f(42)
# prints: <class 'int'>

The problem as you described does not exist if the types are actually what was expected. If a user specifies a static type, the inputs should always be of that type, or else the program should either coerce the value or throw an error instead of silently failing. This behavior in Python is really dumb (to me, anyway) and I don't know why they would do this.

As described by you in the OP, in GDScript, this doesn't happen. If users specify floats, they will get floats, and vice versa, and then they will get the expected division behavior.

bluenote10 commented 3 years ago

@aaronfranke

This behavior in Python is really dumb (to me, anyway) and I don't know why they would do this.

Various reasons, but mainly because it has to support union types. Imagine a function taking an x which can either be type A or B (type annotation would be Union[A, B]). There is no meaningful way to invoke a type conversion on a union type. It is exactly the same in TypeScript. I was actually surprised by this behavior in GDScript because it would make it tough to get union types into the language.

The problem as you described does not exist if the types are actually what was expected.

That is only partly correct, because it in a dynamically typed language it cannot always be clear what is expected.

Look at the issue how @fracteed run into this. His code was basically:

var aspect_ratio = get_some_vector().x / get_some_vector().y

This code worked fine as long as get_some_vector() returned a Vector2, but broke when it changes internally to returning a Vector2i. Could he have have avoided running into this bug in the first place? The only way to safely avoid trouble from Vector2 to Vector2i would have been to write it as:

var aspect_ratio = float(get_some_vector().x) / get_some_vector().y

But it is also understandable why he didn't write the float conversion as long as his type expectation was still correct. You only need it when it's too late ;).

A similar issue could arise if you have a helper function:

func some_helper_function(some_vector: Vector2) -> Vector2:
    # ...

and at some point you figure that this helper function is actually useful for Vector2i as well. Depending on what you want to achieve, you may have to drop the Vector2 type annotation to enable full duck-typing (the use case of union type annotation). If there is a division involved in the function you may have introduced a bug in doing so.

aaronfranke commented 3 years ago

There is no meaningful way to invoke a type conversion on a union type.

If it were up to me, I would have it throw an error in this case.

I was actually surprised by this behavior in GDScript because it would make it tough to get union types into the language.

I don't see how. If something is of one of the two types in the union, it works. If not, throw an error. It's a vastly better idea to throw an error to let the user know something's gone wrong than doing what Python does and just let the invalid value through.

The only way to safely avoid trouble from Vector2 to Vector2i would have been to write it as:

This is a one-time compatibility breaking change. The solution isn't to cover your code in casts just in case something changes, the solution is to update your code when something does change. (But in this particular case, he should use .aspect()).

bluenote10 commented 3 years ago

If it were up to me, I would have it throw an error in this case.

I don't see how. If something is of one of the two types in the union, it works. If not, throw an error.

In general I fully agree that this would be a sane runtime behavior, but such an approach is tricky when it comes to collections. Perhaps proposal https://github.com/godotengine/godot-proposals/issues/192 gains some traction and we will support typing for arrays and dicts in the future. Now assume we have a function func f(array: Array<X>), and this function gets called with a huge array of million of elements. Do you now do a runtime check (or even element-wise type conversions), i.e., iterate over millions of elements to verify/ensure they really are an X? This requires a smart solution to avoid huge runtime overheads.

Perhaps I should have said: The reasons why Python and TypeScript are as they are is mainly defined by backwards compatibility -- they have to behave exactly the same with and without type annotations. GDScript certainly has the potential to do better, but these implicit runtime type conversions/checks in a dynamically typed language are somewhat uncharted territory with interesting implications.

This is a one-time compatibility breaking change. [...] the solution is to update your code when something does change.

You can always argue in favor of an error prone behavior that the solution is to simply update your code, i.e., fix bugs when they occur. Good design tries to avoid situations like "if you change this, don't forget to also change that". The whole point of #1866 was that such changes are not breaking, such bugs cannot occur in the first place, and no code updates are necessary.

YuriSizov commented 3 years ago

I have to agree with @aaronfranke, that the issue that started it all was nothing more than a compatibility break between version, and the problem would've been the same if they used a function, that used to return 1 by default in 3.x but started to return 0 by default in 4.x.

In this regard you cannot create code that is 100% safe from future compatibility changes. It's entirely incidental that in this case the break is related to ints vs floats, and their division behavior.

bluenote10 commented 3 years ago

It's entirely incidental that in this case the break is related to ints vs floats, and their division behavior.

No it is not. It is a textbook example of the "integer creeping into a division expression which changes division semantics" bug.

dalexeev commented 3 years ago

Please also look at these comments: one, two.

1. Some GDScript functions return int or float (depending on the type of the argument), although the documentation says they return float.

# sign, abs, clamp, min, max
print(var2str(abs(2)))   # 2
print(var2str(abs(2.0))) # 2.0

# floor, ceil, round, pow
print(var2str(floor(8)))   # 8.0
print(var2str(floor(8.0))) # 8.0

2. The print() function, which many people use for debugging, does not distinguish between int and float with a zero fraction.

print(1)   # 1
print(1.0) # 1

3. Currently, the == operator forbids comparing values of different types (see godotengine/godot#26375). However, an exception is made for the int – float pair. However, new types with the i suffix cannot be compared.

print(2 == 2.0) # True
print(Vector2() == Vector2i()) # Error

dalexeev commented 3 years ago

With regard to this proposal. Unfortunately, my proposal met with resistance (not criticism, but resistance). I'm not sure if we can prevent all errors while keeping the current behavior of the division operator. Yes, if we cannot achieve a better alternative, then we need to at least minimize the damage that the current counter-intuitive behavior of the division operator does. However, I still believe that my proposal should be considered. I do not understand the position taken by core developers: if many people are against, then the proposal should not even be considered (therefore the proposal is closed and the discussion is blocked).

First, the power gap is not so unconditional: 58% versus 42% (not 95% versus 5%, for example). Second, the sample was not representative. Only a small part of the community voted: people interested in the development of the engine, registered on Twitter (992 people voted against). Third, it is unlikely that most people have deeply researched this problem.

ghost commented 3 years ago

Python has // which is divide and round down. Is quite handy and easy to write and forces the result to always be an int in any operation.

dalexeev commented 3 years ago

@filipworksdev This proposal is not about that ("Make division operator less error prone without a breaking change"). See #1866.

aaronfranke commented 3 years ago

@filipworksdev Keep in mind that there is a difference between truncated division (the behavior of / integer division in GDScript and C++ and C# and most languages) and floored division (// in Python). The behavior is different with negative numbers. The moral of the story is that adding something like Python's // is a separate proposal that's independent of this one.

godotengine / godot-proposals

Make division operator less error prone without a breaking change #1888