Open ator-dev opened 1 year ago
This is also useful with respect to godotengine/godot#76843, because narrowing a nullable type (e.g. int?
) to its non-nullable equivalent (e.g. int
) is also a form of narrowing for which all the above considerations apply.
I feel like each option has a different strength:
is String
, which IMO is a really clear and intuitive way of type-checking the matched expression:
match foo:
is String:
pass # foo is a String, at least initially!
var bar as String
, which IMO does a better job at conveying the idea that "bar" is a (shallow) copy of the matched expression with a hard type:
match foo:
var bar as String:
pass # bar is a String and nothing else!
So... why not introduce both? Type-checking pattern with is String
and Type-narrowing on an ad-hoc variable with var bar as String
.
Edit: I would actually like option B much more if we just left out the is
. Is that token needed for the parser?
Looking over the options again, I think is String
doesn't read as well as var bar as String
.
The match syntax in GDScript is
match expression:
pattern:
# block
One of the nice things about the available match patterns in GDScript is that for the purposes of understanding code, they can all be read aloud as essentially
if expression == pattern:
# block
Even patterns like [42, ..]
can still be read aloud as expression == [42, ..]
, even if it isn't technically valid GDScript. The existing var new_var
pattern is the biggest exception in my opinion because the phrase expression == var new_var
doesn't really make much sense when read aloud. Given that it is already a part of the language, it probably makes the most sense to read it as expression == <anything, and I'll call it "new_var">
.
With that in mind, I read option A's
match expression:
var new_var as String:
as if expression == <any string, and I'll call it "new_var">:
On the other hand, option B's
match expression:
is String
doesn't read the same way because if expression == is String
doesn't make any sense.
I am now in favour of option B (with a modification), after reading the comments left here and giving the issue much more consideration. I initially advocated for option A in the belief that it would be more consistent, as it aligns more with the way we declare variables.
As shown below, a syntax which requires prepending an arbitrary variable name confuses the overall semantics, and makes it difficult to see at a glance what is being checked against. I am including potential future extensions of the syntax. In introducing this significant new feature, it is important that we consider how future adaptations would interact with it.
match foo:
as int:
pass
var vector3 as Vector3: # Difficult to compare; offset; add/remove var name from the start
pass
as Node2D:
pass
var some_node as MeshInstance3D | CollisionShape3D | PhysicsBody3D: # Possible future syntax
pass
var node as Node3D:
pass
as Control:
pass
var vector2 as Vector2:
pass
var str = "string which is long and easy to mistype": # Possible future syntax
pass # Messy. We are now comparing values, but it is difficult to tell the difference.
"a string":
pass
var attitude = "hostile", "neutral", "friendly": # Possible future syntax
pass
"A", "B":
pass
The below variation, in comparison, is beautiful and readable - in the mode of GDScript.
match foo:
is int:
pass
is Vector3 var vector3:
pass
is Node2D:
pass
is MeshInstance3D | CollisionShape3D | PhysicsBody3D var some_node: # Possible future syntax
pass
is Node3D var node:
pass
is Control:
pass
is Vector2:
pass
"string which is long and easy to mistype" var str: # Possible future syntax
pass
"a string":
pass
"hostile", "neutral", "friendly" var attitude: # Possible future syntax
pass
"A", "B":
pass
Crucially, the case statements begin with the indicator of the checking operation. Instead of having the analyser/interpreter look past a var ...
every time just to see what type of equality check is used (and determine what is valid), the type of statement can be inferred just from what it starts with, making it easy to parse for machine and human alike. Further, when a line has been partially typed or is being edited, it is much more likely that the analyser can infer the intent.
match foo:
var bar█ # Is it a type check? A value check? A fallthrough? Who knows...
match foo:
var b█is Vector2: # The user goes to the line start to add/remove a var, confusing the analyser
The fact is, the var ...
is incidental to the case statement. It has no effect on the execution condition for the case block, yet by putting it at the start, we imply greater importance and compel users to type it first. Much easier to trivially add and remove var ...
s at the end of the lines, having no effect on the "look and feel" of the match. This way we also do not encourage users to "align" their variable name lengths for clarity.
With is ...
, we always place the operative part at the beginning.
You may notice that I omitted as
before var <name>
. Is there any situation in which we would place some other keyword before var
? Besides, as
already has a meaning which is distinct from variable assignment. I think that var
alone should be sufficient, and is less verbose.
@nlupugla
if expression == is String
doesn't make any sense
You could also view each case as having an implicit operator at the beginning. This makes is
the natural choice.
match foo:
== "a string": # Implied by lack of anything else
pass
is String: # Explicit
pass
dict_matches { "key": "value" }: # Implied by {
pass
array_matches [ "item", .. ]: # Implied by [
pass
You could also view each case as having an implicit operator at the beginning.
Agreed. I like the way you wrote out the operator implied by the string, dict, and array example above.
This makes is the natural choice.
You lost me here. If all other match patterns start with an implied operator, why should this new pattern be the only one that starts with an explicit operator? That's essentially my complaint about starting with is
.
It's about existing language semantics. All of the other cases use an implied equality/match operator (==
, dict_matches
, array_matches
). as
is not an equality operator, it is a function that coerces the type of a value into a narrower type. I believe is
is a better fit, because it is an equality operator, just one that works on type.
It's about existing language semantics. All of the other cases use an implied equality/match operator (
==
,dict_matches
,array_matches
).as
is not an equality operator, it is a function that coerces the type of a value into a narrower type. I believeis
is a better fit, because it is an equality operator, just one that works on type.
If the choice is between starting the line with is
versus as
, then I definitely agree it makes more sense to start with is
. I'm arguing that to be consistent with other match patterns, the line shouldn't start with any kind of operator. That being said, our conversation on RocketChat is beginning to convince me that starting the line with is
is important due to concerns with readability and variable shadowing.
I'd like to bring up an alternative that @SlugFiller, @ator-dev, and I discussed recently on RocketChat.
Instead of extendingmatch
to enable safe type narrowing, another approach would be to introduce an expression that both returns a boolean and binds a variable.
For example, consider a new expression of the form
x is var new_var : Type
This expression returns the result of x is Type
and creates a new local variable of type Type
.
If x is Type == true
, then new var
is initialized exactly as if it had been declared using var new_var : Type = x
in the same block.
Otherwise, it is declared as if the user had written var new_var : Type
in the same block.
As code, it would be roughly equivalent to
var new_var : Type
if x is Type:
new_var = x
x is Type
Here's an example of how this expression could enable safe type narrowing.
func _input(event: InputEvent) -> void:
if event is var event_mouse : InputEventMouse :
pass # react to mouse using "event_mouse" local variable
else if event is var event_key : InputEventKey :
pass # react to keyboard using "event_key" local variable
else:
pass # react to any other event using "event" parameter
Alternatively, using match
:
func _input(event: InputEvent) -> void:
match event:
var event_mouse : InputEventMouse : # implied expression is "event is var event_mouse : InputEventMouse"
pass # react to mouse using "event_mouse" local variable
var event_key : InputEventKey : # implied expression is "event is var event_key : InputEventKey"
pass # react to keyboard using "event_key" local variable
var event_other # the usual fall through pattern can now be understood as "event is var event_other : Variant"
pass # react to any other event using "event_other" local variable
As you can see, this approach helps justify the current match fall through pattern, whose syntax I find somewhat funny compared to the other match patterns.
match x:
var y: # Why is this the pattern for fall through? It can now be understood as "x is var y : Variant", which is always true and always binds "x" to "y"
One important objection that was mentioned was the danger of relying on default/uninitialized values for the cases where x is var y : Type
evaluates to false. This could potentially be worked around by allowing the user to supply a fallback value using syntax such as x is var y : Type ?? z
, although it is admittedly clunky looking.
Specific syntax aside, the core idea is to solve explicit narrowing by introducing a new expression. I think this approach has a few merits:
if
and match
. The fewer weird edge cases for keywords the better.I should have mentioned, this might be better in a separate proposal, even though it does relate to this one. It's quite a complicated topic that would take effect all over the language. I recommend copying it over if you want more specific discussion. (:
I will do my best at answering these points though.
We don't have to add extra functionality to if and match. The fewer weird edge cases for keywords the better.
The thing is, I don't think the proposed change to match
is a weird edge case. It's a natural extension, just one that requires an additional keyword for disambiguation. As I mentioned earlier, there are already 3 types of matching: ==
, dict_matches
, and array_matches
. By adding another one we allow the user to make even fuller use of the match
construct; appropriately since in type theory, types extend to specific values, making type checking arguably an extension of value checking.
As a standalone expression, it is easier to test. It doesn't need to be evaluated in a bunch of different contexts (except maybe for getting the local scoping of the new variable to work correctly)
This needs a separate discussion I think. I'm not convinced, because I don't think we have any expressions that are anything more than a series of operations that equates to an in-place value. Scoping of this would be difficult.
The new expression could potentially be used to solve other problems that come up in the future.
Perhaps, but I also see value in introducing specific syntax to for
, while
, if
, or whatever else needs it, just as with match
. It may be versatile, but it does not look nice to read, perhaps due to being so generic.
Edit: Actually I acknowledge that the match
cases look quite readable, despite having an additional :
(which I am not sure about either - it would make interpreting more complicated). However, I am not sure how this would extend to allowing var
s in more places, since you are saying that var
at the start implies a type check. I view this as an arbitrary language decision to be wary of.
I tend to agree that an expression that atomically both returns a boolean and declares a variable is the way to go.
an additional : (which I am not sure about either - it would make interpreting more complicated)
I think it would be better to use as
instead of :
for the expression, like:
if event is var event_mouse as InputEventMouse:
pass # event_mouse can be used here
Or better, copy C# syntax, which is clear and concise:
if event is InputEventMouse event_mouse:
pass # event_mouse can be used here
One important objection that was mentioned was the danger of relying on default/uninitialized values for the cases where
x is var y as Type
evaluates to false
I believe you don't have to rely on uninitialized values, because you'd statically know that on that branch the variable would not "exist", and you should not be able to even reference it. For example, it's important to allow this pattern:
if not x is int i:
# print(i) # would be a static compiler error, as i is undefined in this branch
return null # useful when I KNOW Variant x is int and don't want more unnecessary nesting
my_int_fn(i)
That's how it works in C#, btw - you can even add compound logic with and
and or
on the if. I think (correct me if I'm wrong) you can always statically know which one of the 2 if/else branches will never have the new variable. For the other branch, I believe you can always know if will be initialized or if it might be initialized - and if it's not GUARANTEED, you should not allow referencing it at all either. This would force a better and clearer refactoring of the code.
For example:
if a and x is int i:
print(2*i) # i is ALWAYS initialized here
else:
return # i is NEVER initialized here, so not accessible
if a or not x is int i:
return # i is NEVER initialized here, so not accessible
# i is ALWAYS initialized here
if a or x is int i:
print(2*i) # ⛔ should give compiler error because i is NOT guaranteed to be initialized here
fn()
else:
return # i is NEVER initialized here, so not accessible
# a working refactoring of the example above
if a or x is int: # no variable declaration
if x is int i: # only introduce it separately
print(2*i) # i is ALWAYS initialized here
fn() # i is NOT accessible here, since it's completely out of scope
else:
return # i is NOT accessible here, since it's completely out of scope
# alternative valid refactoring
if a or x is int: # no variable declaration
print(2*i if x is int i else null) # type-safe since print expects Variant anyway
fn() # i is NOT accessible here, since it's completely out of scope
else:
return # i is NOT accessible here, since it's completely out of scope
After godotengine/godot#80247 (#632), I remembered our discussion in Godot Contributors Chat and and the following syntax began to look more consistent to me (I just remembered if we have other declarations in which we cannot specify the type):
match foo:
# Optional type specifier for match bind pattern.
var bar: String:
# ...Actions with bar...
Although in this case the type specifier is for filtering, it is different from for
, which does not skip elements that do not match the type.
But actually I'm not sure if we should use match to narrow the types. In my opinion the following current solutions are acceptable:
var foo: Node
if foo is Node2D:
var bar: Node2D = foo
# Actions with `bar`...
var foo: Node
var bar := foo as Node2D
# If `foo` is not `Node2D`, then `bar` equals `null`.
if bar:
# Actions with `bar`...
As for nullable types, in my opinion they are not much different from other types (Type?
is just a supertype of Type
). The only problem is that currently the as
operator works inconsistently for Variant types, causing a runtime error instead of returning the default value of the given type. If we change this, then this will work:
var foo: Variant
var bar := foo as String?
# If `foo` is not `String?`, then `bar` equals `null` (default of `String?`).
if bar:
# Actions with `bar`...
var foo: Variant
var bar := foo as String
# If `foo` is not `String`, then `bar` equals `""` (default of `String`).
if bar:
# Actions with `bar`...
The disadvantage of these options is that you cannot specify a different default value in case of an error, or check if there was an error (in the case of as
). But you can always use is
:
var foo: Variant # Or `String?`. Doesn't matter, any `String` supertype.
if foo is String:
var bar: String = foo
# Actions with `bar`...
else:
# Error processing...
var foo: Variant # Or `String?`.
var bar: String = "default"
if foo is String:
bar = foo
# Actions with `bar`...
But actually I'm not sure if we should use match to narrow the types.
I don't like the idea of splitting type narrowing across two lines. I have several reasons for this, but perhaps the biggest one is that inserting a line in between would invalidate the typing.
For example:
if foo is Node2D:
var bar : Node2D = foo
ensures that bar
is a Node2D
, but
if foo is Node2D:
if randf() > 0.5 : foo = 0
var bar : Node2D = foo
doesn't work half the time. In principle, I suppose one could statically analyze the intervening code to determine whether all paths result in a valid bar
assignment, but seems overly complicated for what we are trying to achieve.
I think the solution for type narrowing should play nicely with match pattern guards if those are ever implemented. In fact, type narrowing can be seen as a special case of pattern guards. For example, if GDScript had pattern guards such as
match x:
var y if x > 0:
# code treating y as positive
then a natural way to do type narrowing would be
match x:
var y : int if typeof(x) == TYPE_INT and x > 0:
# code treating y as a positive int
Arguably, the typeof(x) == TYPE_INT
could be inferred from the declaration var y : int
, but I included it here to be explicit and show the relationship between type narrowing and pattern guards.
Describe the project you are working on
GDScript
Describe the problem or limitation you are having in your project
In the current implementation of GDScript, there is a limitation when it comes to type narrowing. Developers are unable to treat a variable as a more specific type within certain code blocks, such as if statements or match expressions. This limitation can lead to verbose and less readable code.
The safety of a type check is highly contextual. In segment A,
foo += 1
would seem to be what it is: an entirely safe operation. However, something as simple as adding a function call in B might violate that assumption. Perhaps it makesfoo
a string, meaning that the operation will no longer succeed. C transparently violates the assumption, while D is more nebulous; during the 1 second wait period, anything could modifyfoo
to be something else. The point being that it is not sensible to try to infer type safety based on an earlier check. Adding special cases such as assuming type safety directly after the check (E) is a workaround, but one that significantly increases the complexity of the analyser and is arbitrary.See "If this enhancement will not be used often ..." for workaround details.
A
B
C
D
E
Describe the feature / enhancement and how it helps to overcome the problem or limitation
Discussed on Rocket.Chat with several contributors. Hopefully the discussion will be enhanced here.
We can extend the already exceptional
match
syntax to complete type checking.Case syntax should be introduced where:
as
would be successful (i.e. not returnnull
for nullable types), the case block is executed."A", "B", "C":
case syntax already allowed. We could use commas in the same way, or pipes (|
) to conform with common type theory syntax. This might amount to pretending we have a feature which we don't (yet), however, and couldn't be combined with an explicit variable since it would not be type safe in current GDScript.Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
Option A
Option B
If used currently in GDScript with proposed syntax (option A):
If this enhancement will not be used often, can it be worked around with a few lines of script?
Due to the demonstrated type safety limitations with type checking, in order to enforce type safety, developers must manually cast variables a lot of the time. The resulting code is less maintainable and readable, due to being more verbose and specifying types manually; essentially a violation of DRY (don't repeat yourself).
Is there a reason why this should be core and not an add-on in the asset library?
This is core GDScript.