Open zspitz opened 6 years ago
One other point:
@zspitz I've written some examples, and a proposed spec in line with these goals here.
Thanks for thinking about this feature in detail and going as far as writing a grammar!
I think the goals are reasonable though I would add one more: keep it really really really simple.
The simplicity goal matters a lot because every feature is a product: it needs a spec, it has to be implemented in the compiler, implemented in the IDE, tested in all scenarios, documented (with samples) and then supported forever. This is a lot of work!
So to keep it simple, we should find the minimum viable functionality for this feature, hope that gets implemented and then expand from there in the future (otherwise none of it will ever get done in our lifetime).
From my own experience with the feature in C#, the type checking pattern is really all we need (and the C# docs seem to focus on mostly that). And we could probably do it without adding a new keyword and sticking to Is, so borrowing from some of the examples you gave:
'test the type without introducing a variable (not strictly necessary but...)
If obj Is String Then Console.WriteLine("obj is a string")
'test the type and introduce a variable.
If obj Is s As String Then Console.WriteLine(s.Length)
'can also assign the boolean result to a local variable
'(useful for extending scope of variable "s" beyond block that checks test1/test2).
Dim test1 = obj Is String
Dim test2 = obj Is s As String
'we should probably use AndAlso instead of When for simplicity.
If obj Is s As String AndAlso s.Length > 5 Then Console.WriteLine(s.Length)
'can do the above with Select too:
Select Case obj
Case Is String
Console.WriteLine("obj is a string")
Case Is s As String
Console.WriteLine(s.Length)
Case Is s As String AndAlso s.Length > 5
Console.WriteLine(s.Length)
End Select
'...and in loops that expect a boolean condition:
Do While obj Is s As String
Process(s)
obj = GetNextObject()
Loop
Do Until obj Is s As String
ProcessObjectsThatAreNotStrings(obj)
Loop
In short:
If for example the above functionality is all we got in order to keep things as simple as possible, does anyone feel something critical is missing? Remember: we can get really fancy with this, but if we do, it is unlikely to ever get implemented because it will be too much work!
CC: @KathleenDollard some food for thought here for the LDM as you consider this feature.
Generally I love it and would use this functionality constantly; it would dramatically simplify code. Questions: Can "If obj Is s As String Then Console.WriteLine(s.Length)" have an Else case? If not, then how would "sets any introduced variable to Nothing" ever be tested or be useful? The variable would not be in scope. If there is an Else, what type is "s"? Certainly not String.
In the Do Until, once you fall out of the loop "s" is not in scope, so the assignment seems useless; I am not sure the assignment feature is useful or could be tested.
@paul1956 Can "If obj Is s As String Then Console.WriteLine(s.Length)" have an Else case? If not, then how would "sets any introduced variable to Nothing" ever be tested or be useful? The variable would not be in scope. If there is an Else, what type is "s"? Certainly not String.
I imagine it should work like this:
If obj Is s As String Then
'can use variable "s" here and it will be a non-null string.
Else
'can't use variable "s" here because it is not in scope.
End If
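For comparison, here is a minimal sketch of how the same check can be written in today's VB.NET with TryCast. DescribeLength is a hypothetical helper name used only for illustration; the sketch shows current language behavior, not the proposed syntax.

```vbnet
Imports System

Module TypeCheckSketch
    ' Hypothetical helper: describes obj using only features
    ' that already exist in VB.NET.
    Function DescribeLength(obj As Object) As String
        ' TryCast yields Nothing when obj is not a String (or is Nothing).
        Dim s = TryCast(obj, String)
        If s IsNot Nothing Then
            ' s is a non-null String here.
            Return $"string of length {s.Length}"
        Else
            ' Unlike in the proposal, s is still in scope here, but it is Nothing.
            Return "not a string"
        End If
    End Function

    Sub Main()
        Console.WriteLine(DescribeLength("hello"))
        Console.WriteLine(DescribeLength(42))
    End Sub
End Module
```

The difference from the proposal is scoping: with TryCast the variable has to be declared in the enclosing block, so it remains visible (as Nothing) in the Else branch.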
@paul1956 In the Do Until, once you fall out of the loop "s" is not in scope so the assignment seems useless,
That's a good point! It would be allowed for completeness much like assigning a variable to itself is allowed and equally useless. Though there could be some utility in the version that doesn't introduce a variable:
Do Until obj Is String
ProcessNonStringObject(obj)
obj = GetNextObject()
Loop
(An updated version is here.)
@ericmutta In thinking about what the minimum viable functionality would look like, I've reached the following conclusions (I apologize for the length).
TL;DR
1. Extend Is and Case for pattern matching -- <expression> Is <pattern> and Case <pattern>; instead of a dedicated keyword that introduces a pattern.
2. The same syntax should work with both <expression> Is <pattern> and Case <pattern>
2a. Is should be extended to allow all current Case syntax -- Case <expression> To <expression>, Case <expression>, <expression>; and should use value equality where available
2b. Case should be extended for all current Is syntax -- Case <expression> should support reference equality and comparison against Nothing (#119)
3. Since Case and Is already support simple expressions, anything until the end of the condition / Case clause must be considered part of the pattern, to prevent ambiguity.
3a. The only way to allow for additional logic AFAICT is to wrap it in a single-per-pattern-expression When clause.
3b. Trying to reuse AndAlso for this purpose will only increase complexity.
4. Case should continue to not require Is.
(Possible grammar at the end.)
If pattern matching is just another type-checking syntax (admittedly also allowing introduced variables), then I don't think the feature can possibly justify itself. (In fact, the compiler could do the same thing without any additional syntax.)
But pattern matching is so much more than type-checking -- it's saying:
From my own experience with the feature in C#, the type checking pattern is really all we need (and the C# docs seem to focus on mostly that).
Pattern matching in C# is still in its infancy; see the F# documentation for a better idea of the range of potential patterns.
Having said that, you are quite correct -- it's going to be harder to add all the possible patterns right at the beginning. However, any first steps in pattern matching must specify at least two things:
It also should allow for the goals outlined above, even if these goals aren't implemented immediately; and should also be compatible and consistent with existing syntax.
AFAICT there are two possible ways to insert pattern matching into the language:
a dedicated keyword (Matches)
extending Case and Is
I prefer extending Case/Is, because:
Case already has a (albeit very limited) form of pattern matching
There is already an inverse of Is for use in the conditions of If statements and the like: IsNot. For a dedicated keyword, it would be necessary to choose between no inverse -- If Not o Matches String Then -- or something equally awkward.
..although a dedicated keyword might be simpler to implement, because we wouldn't have to deal with the historical usages of Is and Case.
If we extend Case/Is, we shouldn't require Is for patterns in Case clauses. Forcing Is to be required would result in three different rules for whether Case needs to be followed by Is:
Case <expression> -- the Is cannot be used, per the spec
Case Is <operator> <operand> -- the Is is optional, also per the spec
Case Is <pattern> -- the Is is required
which I think would be very confusing.
Also, assuming we extend Case/Is, since VB.NET already supports both Case <expression> and Is <expression>, any other expressions until the end of the condition / Case clause must be considered part of the pattern. Otherwise the following would be ambiguous:
Dim o As Object
Dim foo As Boolean
Dim bar As Boolean
If o Is foo AndAlso bar Then
Should foo AndAlso bar be evaluated first? Or should o Is foo be evaluated first?
C# doesn't have this problem, because C# doesn't allow case with an expression, so the following:
object o = null;
bool bar = true;
if (o is bool b && bar)
is unambiguous -- first perform the pattern match, then the logical AND.
(NB. OTOH, this does simplify things a little, because we don't need an explicit literal pattern.)
If pattern expressions must be greedy (i.e. go to the end of the condition / Case clause), then extra logic at the end of the pattern cannot use a simple boolean expression; we need a special clause for this purpose. C# introduces this clause with the when keyword.
Specifically in VB.NET, we can't simply reuse an existing keyword such as AndAlso to introduce this clause, because AndAlso might be part of the expression in the pattern.
We could make up all sorts of complicated rules to disambiguate, but the simplest would be to introduce a new keyword -- When.
(NB. A similar usage of When already exists in VB.NET, when applying a condition to a Catch exception.)
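To illustrate the existing usage of When mentioned above, here is a small sketch of an exception filter on Catch in current VB.NET. Classify and the "path" parameter name are hypothetical and chosen only for the example.

```vbnet
Imports System

Module CatchWhenSketch
    ' Hypothetical helper: runs an action and classifies the failure.
    Function Classify(action As Action) As String
        Try
            action()
            Return "ok"
        Catch ex As ArgumentException When ex.ParamName = "path"
            ' Taken only when the type matches AND the When condition is True.
            Return "bad path argument"
        Catch ex As ArgumentException
            ' Same type, but the filter above did not hold.
            Return "other bad argument"
        End Try
    End Function

    Sub Main()
        Console.WriteLine(Classify(Sub() Throw New ArgumentException("empty", "path")))
        Console.WriteLine(Classify(Sub() Throw New ArgumentException("empty", "name")))
    End Sub
End Module
```

The Catch clause already reads as "match this type, introduce a variable, then apply an extra condition", which is the shape a When clause on a pattern would have.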
Taking into account the need to define the following from the start:
Case and Is
I think the following is the "minimum viable functionality":
BooleanOrPatternExpression
: BooleanExpression
| Expression 'Is' PatternExpression
;
// If...Then..ElseIf blocks
BlockIfStatement
: 'If' BooleanOrPatternExpression 'Then'? StatementTerminator
Block?
ElseIfStatement*
ElseStatement?
'End' 'If' StatementTerminator
;
ElseIfStatement
: ElseIf BooleanOrPatternExpression 'Then'? StatementTerminator
Block?
;
LineIfThenStatement
: 'If' BooleanOrPatternExpression 'Then' Statements ( 'Else' Statements )? StatementTerminator
;
// Loops
WhileStatement
: 'While' BooleanOrPatternExpression StatementTerminator
Block?
'End' 'While' StatementTerminator
;
DoTopLoopStatement
: 'Do' ( WhileOrUntil BooleanOrPatternExpression )? StatementTerminator
Block?
'Loop' StatementTerminator
;
// cannot introduce variables in to child scope here
DoBottomLoopStatement
: 'Do' StatementTerminator
Block?
'Loop' WhileOrUntil BooleanOrPatternExpression StatementTerminator
;
ConditionalExpression
: 'If' OpenParenthesis BooleanOrPatternExpression Comma Expression Comma Expression CloseParenthesis
| 'If' OpenParenthesis Expression Comma Expression CloseParenthesis
;
// Within a Case clause
CaseStatement
: 'Case' PatternExpression StatementTerminator
Block?
;
PatternExpression
: Pattern ('When' BooleanExpression)?
;
Pattern
// patterns with subpatterns
: Pattern ',' Pattern // OR pattern (already supported in Case)
// patterns without subpatterns
| 'Of' TypeName // Type check pattern -- matches when subject is of TypeName
| Identifier 'As' TypeName // Variable pattern -- introduces a new variable in child scope
| 'Is'? ComparisonOperator Expression // Comparison pattern
| 'Like' StringExpression // Like pattern
| Expression 'To' Expression // Range pattern
| Expression // Equality pattern -- value/reference equality test against Expression
;
Ping @bandleader
If the powers in charge can agree on a grammar and ultimate feature set, it would be nice to get "Is TypeName" and "Is Identifier As TypeName" out first. Of all the things being proposed, this is the one thing needed yesterday, and it would simplify lots of VB code.
@zspitz Otherwise the following would be ambiguous:
Dim o As Object
Dim foo As Boolean
Dim bar As Boolean
If o Is foo AndAlso bar Then
It isn't ambiguous because it doesn't compile at all! Compiler says Is operator does not accept operands of type Boolean. Operands must be reference or nullable types.
Ultimately the choice between AndAlso and When is likely to be a matter of preference, mainly because both keywords already exist in the language so nothing new is being added, regardless of which you choose. I personally prefer 'AndAlso' mainly because it makes clear the short-circuiting nature of the expression (i.e. the part to the right of AndAlso will not run if the type-check and conversion on the left did not succeed).
@zspitz I've reached the following conclusions (I apologize for the length).
I believe that when it comes down to implementing this, the team will appreciate the level of detail in your comments, it is clear you have given this a lot of thought! :+1: :+1:
@ericmutta
Thanks for the kind words; I hope you're right, and this will enable the team to move forward on pattern matching that much faster.
It isn't ambiguous because it doesn't compile at all!
If o Is foo currently doesn't compile, as you've noted. But if we extend Is to follow everything Case does today, then If o Is foo Then would compile just fine.
But even then, trying to push more boolean expressions onto the pattern would be ambiguous.
o Is foo AndAlso bar -- It isn't ambiguous because it doesn't compile at all! Compiler says Is operator does not accept operands of type Boolean. Operands must be reference or nullable types.
@ericmutta In addition to what @zspitz noted above -- it's nevertheless ambiguous in parsing. The fact that bar is Boolean is not known until the binding stage of the compiler. (Also, compilers should anyway never parse differently based on semantics.)
This distinction (between lexical and semantic analysis -- done by the parser and binder respectively) is often overlooked in these discussions. More about that later.
1) Another issue with extending Is to work with patterns and supporting an expression as a pattern: Is currently checks for reference equality, whereas expressions would presumably check for equality including IEquatable equality/.Equals (and it would have to because that's what Case does as well). So obj Is otherObjWhichEquatesToObj would suddenly have to return True. Aside from breaking existing code, there's also the fact that VB does need a reference equality operator.
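The difference described in point 1 can be seen in current VB.NET with two distinct String objects that have the same contents. This is only a sketch of today's behavior; MatchesViaCase is a hypothetical helper name.

```vbnet
Imports System

Module EqualitySketch
    ' Hypothetical helper: reports whether Select Case treats
    ' subject and candidate as a match.
    Function MatchesViaCase(subject As String, candidate As String) As Boolean
        Select Case subject
            Case candidate
                Return True
            Case Else
                Return False
        End Select
    End Function

    Sub Main()
        ' Built from Char arrays so that a and b are separate objects.
        Dim a As New String("abc".ToCharArray())
        Dim b As New String("abc".ToCharArray())

        Console.WriteLine(a Is b)                ' reference equality: False
        Console.WriteLine(a = b)                 ' value equality: True
        Console.WriteLine(MatchesViaCase(a, b))  ' Case compares by value: True
    End Sub
End Module
```

If Is were extended to behave like Case, the first line would have to start printing True, which is the breaking change being pointed out.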
2) Problem: Case x currently does a value check. Case x As String is being proposed to do a typecheck-and-assign. This is very non-intuitive.
(For comparison, the following two lines do the same thing both lexically and in human understanding, just with specific types vs. type inference. This is critical for intuition and should be a hard requirement for any VB syntax which can take an As T or skip it.)
For Each x In collection
For Each x As String In collection
Same for:
Dim x = expr
Dim x As String = expr
I have some more thoughts against re-using Select Case and Is as-is for pattern matching, or at least for this pattern (typecheck-and-assign), even while realizing the similarity between them and the desire to mesh it all together. I'll try to post later.
@zspitz Most importantly -- thank you for your amazing efforts in moving this forward. Here's hoping MS will take notice and reciprocate!
Another issue with extending Is to work with patterns and supporting an expression as a pattern: Is currently checks for reference equality, whereas expressions would presumably check for equality including IEquatable equality/.Equals (and it would have to because that's what Case does as well). So obj Is otherObjWhichEquatesToObj would suddenly have to return True. Aside from breaking existing code, there's also the fact that VB does need a reference equality operator.
Agreed. I think we'd need a dedicated keyword for boolean contexts: If <expression> Matches <pattern> Then, Do While <expression> Matches <pattern> etc. (I'll update my original post accordingly.)
But I still think we should extend Select Case to patterns in general, without an additional keyword. Otherwise users will have to decide whether to use Case Matches <pattern> or Case <expression>, each with its own limitations.
In general, I like letting the discussion flow without interrupting it.
This has been a great discussion. I appreciate everyone's work, and wanted to be sure you all didn't think you were talking into a void. It's also the start of Christmas holidays, which means a lot of folks are out until the first of the year.
@KathleenDollard Appreciated, but I dunno; I would much prefer if you would indeed chime in: participation is not an interruption! It would be wonderful to see participation from other LDT members as well, even if it's one person who knows Roslyn well, and even better if he/she can ask other busy team members for their opinion.
@KathleenDollard To add to what bandleader said, I myself have almost no knowledge of how Roslyn works, or even compilers in general; both of which would inevitably preclude some design choices, while enabling others. I am only writing from my day-to-day usage of VB.NET and C#, and a little dabbling in F#. I think that some input from someone with Roslyn experience would be helpful in not barking up the wrong tree, or to confirm that a given design choice is relevant.
(This is a rewrite of the initial required functionality for pattern matching above. Thanks for everyone's help in clarifying this; in particular, the discussions I've had with @bandleader have been extremely illuminating.)
Is cannot be extended, as it's currently used for reference equality; we'll require a dedicated keyword for boolean contexts -- e.g. If <expression> Matches <pattern> Then, Do While <expression> DoesntMatch <pattern>. However, we should extend Case to use Case <pattern> without an additional keyword.
<expression> should be a valid pattern, using value equality where available, and reference equality if not. (This would have the same effect on Case as #119).
Case today should also be a pattern:
<pattern>, <pattern> -- OR pattern
<expression> To <expression>
Like <string expression>
[Is] <comparison operator> <expression>
For newcomers to pattern matching who may not be familiar with the full range of potential available patterns, the most obvious patterns are those that relate to type-checking and variable-introduction. The initial release of pattern matching should therefore include these three patterns or pattern variants:
Figuring out the syntax for these is a separate issue (#367). But whatever the syntax, it has to read well in nested patterns as well, even if currently the only nested pattern is the OR pattern.
Since Case already supports expressions combined with operators, we must consider anything until the end of the condition / Case clause as part of the pattern.
When clause.
AndAlso because it is an operator used currently in expressions and wouldn't clearly delineate the end of the pattern itself. When parallels C#'s when and cannot be an operator in an expression (it's currently valid only in Catch clauses).
Since Case supports arbitrary expressions, any pattern must almost always be invalid as an expression, to allow reliable disambiguation. This is easily done by using some keyword to introduce the pattern, such as Case Of <typename> for a typecheck pattern, or If o Matches Dim x As Integer Then as a variable+typecheck pattern.
Case should continue to not require Is.
Matches could be paralleled by DoesntMatch.
Perhaps Case should not allow non-matches? There is precedent for this -- Case Is can currently be used, but not Case IsNot.
Is because comparing two objects which have a default property would result in both being converted to their respective values via the default property, and value-equal comparing the results; it's then impossible to compare reference equality for the two objects. How would this issue be handled within pattern matching on the expression pattern?
In DoBottomLoopStatement, should we allow variables introduced with Loop While <expression> Matches <pattern> to bleed back into the body of the Do? Seems very counterintuitive.
And the current state of the grammar:
BooleanOrPatternExpression
: BooleanExpression
| Expression 'Matches' PatternExpression
;
// If...Then..ElseIf blocks
BlockIfStatement
: 'If' BooleanOrPatternExpression 'Then'? StatementTerminator
Block?
ElseIfStatement*
ElseStatement?
'End' 'If' StatementTerminator
;
ElseIfStatement
: ElseIf BooleanOrPatternExpression 'Then'? StatementTerminator
Block?
;
LineIfThenStatement
: 'If' BooleanOrPatternExpression 'Then' Statements ( 'Else' Statements )? StatementTerminator
;
// Loops
WhileStatement
: 'While' BooleanOrPatternExpression StatementTerminator
Block?
'End' 'While' StatementTerminator
;
// introducing variables with Until could only be used
// by the When clause, not within the block
DoTopLoopStatement
: 'Do' ( WhileOrUntil BooleanOrPatternExpression )? StatementTerminator
Block?
'Loop' StatementTerminator
;
// introducing variables with either While or Until could only be used
// by the When clause, not within the block
DoBottomLoopStatement
: 'Do' StatementTerminator
Block?
'Loop' WhileOrUntil BooleanOrPatternExpression StatementTerminator
;
ConditionalExpression
: 'If' OpenParenthesis BooleanOrPatternExpression Comma Expression Comma Expression CloseParenthesis
| 'If' OpenParenthesis Expression Comma Expression CloseParenthesis
;
// Within a Case clause
CaseStatement
: 'Case' PatternExpression StatementTerminator
Block?
;
PatternExpression
: Pattern ('When' BooleanExpression)?
;
Pattern
// patterns with subpatterns
: Pattern ',' Pattern // OR pattern (already supported in Case)
// patterns without subpatterns
| 'As' TypeName // Type check pattern -- matches when subject is of TypeName
| 'Dim' Identifier ('As' TypeName)? // Variable pattern -- introduces a new variable in child scope; as TypeName or Object
| 'Is'? ComparisonOperator Expression // Comparison pattern
| 'Like' StringExpression // Like pattern
| Expression 'To' Expression // Range pattern
| Expression // Expression pattern -- value/reference equality test against Expression
;
Pinging @KathleenDollard @ericmutta @paul1956
Notes from our meeting on this Wednesday
You all are awesome.
@kathleendollard thanks for the update from the LDM!
I think this kind of iteration and feedback loop between the community and the LDM is really awesome and should become the way to do things going forward, that is:
1) LDM shares what they are considering (this is important and helps the community focus on things that have the highest probability of happening before the universe dies).
2) community rallies around that and talks about it to flesh it out.
3) then LDM comes back with feedback and own thoughts.
4) rinse and repeat until something epic happens.
You are all awesome (especially having language design meetings so close to X-Mas!)
@bandleader Another issue with extending Is to work with patterns and supporting an expression as a pattern...
I can see now that using Is could be more trouble than it's worth because Is already has several uses. What about Like though? The only thing this operator has done since day one is pattern matching, which is exactly what we are talking about here!
If obj Like String Then Console.WriteLine("obj is a string")
If obj Like s As String Then Console.WriteLine(s.Length)
Dim test1 = obj Like String
Dim test2 = obj Like s As String
If obj Like s As String When s.Length > 5 Then Console.WriteLine(s.Length)
Select Case obj
Case Like String
Console.WriteLine("obj is a string")
Case Like s As String
Console.WriteLine(s.Length)
Case Like s As String When s.Length > 5
Console.WriteLine(s.Length)
End Select
Do While obj Like s As String
Process(s)
obj = GetNextObject()
Loop
Do Until obj Like String
ProcessObjectsThatAreNotStrings(obj)
Loop
...something to consider before introducing an entirely new keyword and creating a scenario where we have two keywords doing pattern matching :+1: It also seems to negate naturally and do ranges/tuples quite nicely:
If num Like 1 To 10 Then Console.WriteLine("number between 1 to 10")
If num Not Like 1 To 10 Then Console.WriteLine("number NOT between 1 to 10")
If MyTuple Like (String, Integer) Then Console.WriteLine("tuple of string and integer")
@ericmutta What about Like though?
I can't believe we didn't suggest this earlier! By extending its usage to include pattern matching, Like can be the perfect keyword short of introducing a new one.
Thanks to the members of the LDT for discussing this, and all your hard work on VB.NET; and for taking into account the community's contribution.
(I'm responding here to the meeting notes; @KathleenDollard if there's a better place to put it please let me know.)
(I'm addressing this first, because it affects some subsequent points.)
It's not clear how variable introduction without typechecking would work, or how type checking without assignment differs from the available TypeOf x Is.
Without nested patterns, there is indeed no difference. But with nested patterns -- patterns composed of other patterns -- where the pattern itself doesn't sufficiently enforce the type of the item in question, we may want to enforce a specific shape of parts of the item without having to explicitly name those parts. For example, with the tuple pattern:
Dim o As Object
Select Case o
Case (Integer, Integer)
Console.WriteLine("Pair of numbers")
Case (String, String)
Console.WriteLine("Pair of strings")
End Select
If every type check also requires variable introduction, we have the needlessly cluttered:
Dim o As Object
Select Case o
Case (Integer Into x, Integer Into y)
Console.WriteLine("Pair of numbers")
Case (String Into s1, String Into s2)
Console.WriteLine("Pair of strings")
End Select
It's not clear how variable introduction without typechecking would work
With nested patterns, the converse is also true -- I may want to extract part of the root matched value, without assigning a new name to the root. Using the array literal pattern (#141):
Select Case o
Case {Into firstArg, String, String} When firstArg = "help" Or firstArg = "version"
Console.WriteLine($"First argument -- {firstArg}")
Case Else
Console.WriteLine("Invalid first argument")
End Select
Not all of us are happy with the reading of the If syntax. The human English wording would be more like "If o matches string, put it in x as a string."
@bandleader has also pointed out that Case x As String is rather counter-intuitive, because everywhere else in the language, omitting the As String doesn't change the basic meaning of the statement; both the following statements declare a variable, albeit of different types:
Dim x
Dim x As String
The same applies for method parameters:
Sub Foo(bar As String)
Sub Foo(bar) ' defaults to parameter of type Object
and in all other places where a variable is introduced into a child scope -- For, For Each, Using, and Catch.
However, the following two Cases mean very different things:
Case x ' is the case value equal to the already-existing expression `x`?
Case x As String ' `x` doesn't exist; if the case value can be assigned to a String, introduce a new `x`
I think Into is an excellent choice, because as noted the variable introduction (Into x) follows from the typecheck (Case String). In addition, Into already has a similar usage when inline-declaring a variable for a grouped or aggregate LINQ query.
(NB. What happens if there is an identifier String in scope? Would it be better to disambiguate with some keyword: Case Of String Into x or Case As String Into x?)
This will lead to a discussion about whether it is more important to read like English or to look like a declaration here.
I don't think "looking like a declaration" has inherent value. The only reason I can see to prefer Case Dim x As String over Case x As String is in order to visually distinguish from Case x; Case String Into x does this equally well, if not better.
2. We are a little confused. The effect of #119 seems desirable, but not sure whether this is a pattern or evaluation (or what distinctions matter here).
This was only relevant if 1) using Is as the pattern-match operator in boolean contexts, 2) everything supported by Case is a pattern, 3) and the <expression> pattern would have the same behavior in both Is and Case. Since Is <expression> tests for reference equality, Case <expression> would also test for reference equality, and <expression> would be considered a special case of <pattern>.
Since 1) the LDT has decided on Matches for boolean contexts and, 2) some syntaxes matched by Case will not be patterns, this no longer appears relevant to pattern matching. (It's nice to have independent of pattern matching though.)
3. We thought about commas...
Agreed.
Certainly everything that works in the context of a Case today should continue to work in a Case. But hesitate on moving syntax from Case to other places patterns can be used.
With nested patterns, if an expression is considered a special case of pattern, it becomes possible to write something like this (e.g. using the tuple pattern):
Dim o As Object
Dim x = 5
If o Matches (x, x+1, x+2) Then
4. The linked "full range of potential patterns" is for F# and several of these are not available in other .NET languages.
I didn't mean to imply that all these patterns should be in VB.NET; only that while typecheck+variable is the big draw for those who have never used it, pattern matching is a far more generalized idea than just typecheck+variable. #367 contains a list of patterns that might be of value specifically in VB.NET.
6. Can we get clarity on this. Is this basically saying there can't be ambiguity and we can't break existing code?
A rather specific ambiguity. For patterns which mean a literal expression in other contexts (e.g. the tuple pattern), since Case supports any expression, it is necessary to distinguish between Case <expression> and Case <pattern>. (C# doesn't have this problem, because initially only constants were supported by switch.)
(This may not be an issue when the type of the resulting literal expression is a value type. For example, even though this is ambiguous:
Dim t = (1, 2)
If t Matches (1,2) Then
between:
but it doesn't matter; since a ValueTuple is a value type, multiple instances of ValueTuple are the same as long as their members are the same.)
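As a small sketch of that last point in current VB.NET: two separately constructed ValueTuple instances compare equal through Equals when their members are equal. SameTuple is a hypothetical helper name; the sketch says nothing about how a Matches operator would be implemented.

```vbnet
Imports System

Module TupleEqualitySketch
    ' Hypothetical helper: compares two tuples member by member
    ' through ValueTuple's own Equals implementation.
    Function SameTuple(x As (Integer, Integer), y As (Integer, Integer)) As Boolean
        Return x.Equals(y)
    End Function

    Sub Main()
        Dim t = (1, 2)
        Console.WriteLine(SameTuple(t, (1, 2)))  ' same members
        Console.WriteLine(SameTuple(t, (2, 1)))  ' different members
    End Sub
End Module
```

Because equality is decided purely by the members, reading (1,2) as a tuple expression or as a tuple pattern of two constants gives the same answer for a value-typed tuple.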
7. Do you mean Case should not require Is where it does not require it today?
This is about using Case Is <pattern> to distinguish between matching against a pattern, and Case <expression> to check value-equality on an expression. The Is in Case would then be required (for patterns), optional (for comparison operators), or disallowed (for simple expressions), based on context. Really confusing.
8 iii) Need clarity on what this is saying
I made a mistaken assumption here; it's irrelevant.
// introducing variables with either While or Until could only be used // by the When clause, not within the block // LDM thinks probably within the block as well
This was a typo -- when introduced with a While, variables should be usable within the block, but when introduced with an Until, variables should not be available within the block.
@KathleenDollard
Emphasizing one additional point:
For background: a pattern is not an expression, but a thing that when matched results in an expression, in this context a Boolean expression.
A pattern may not be an expression, but every expression could be considered a pattern that matches on value-equality (o. With recursive patterns, this would enable using expressions as sub-patterns:
Dim x = 5
Select Case o
Case (x, x+1)
End Select
(Post-LDM spec, incorporating Into for data extraction)
Pattern matching has two goals:
BooleanExpression or Case block)
Into identifier)
The following is a possible updated grammar, using the LDM-suggested Into for data extraction. In addition, there's a description of the scope rules for Into-introduced variables, as well as when they are considered initialized or not. Some additional points at the end.
// Redefine CaseStatement as using PatternClauses instead of CaseClauses
CaseStatement
: 'Case' PatternClauses StatementTerminator
Block?
;
// Redefine BooleanExpression as possibly a match against a pattern
// All the usages of BooleanExpression -- If Then, Do While etc. -- remain the same
BooleanExpression
: Expression 'Matches' PatternClause
| Expression
;
// CaseClauses and CaseClause are no longer needed
PatternClauses
: PatternClause ( Comma PatternClause )*
;
PatternClause
: PatternOrInto ('When' BooleanExpression)?
;
// Into introduces a new identifier holding the value matched by the rest of the pattern clause (Pattern + When)
PatternOrInto
: Pattern ('Into' Identifier)?
| 'Into' Identifier
;
MemberPattern
: '.' Identifier Equals Expression
| '.' Identifier 'Matches' Pattern
;
Pattern
// general nested patterns
: 'AnyOf(' PatternClauses ')' // OR pattern
| 'AllOf(' PatternClauses ')' // AND pattern
| 'NoneOf(' PatternClauses ')' // Multiple-pattern negation
| 'Not(' PatternClause ')' // Single-pattern negation
| '{' PatternClauses '}' // array pattern
| '(' PatternClauses ')' // tuple pattern
| 'With {' MemberPattern ( Comma MemberPattern )* '}' // With pattern
// non-nested patterns
| TypeName // Type check pattern
| '*' // Discard pattern
| 'Is'? ComparisonOperator Expression // Comparison pattern
| 'Like' StringExpression // Like pattern
| Expression 'To' Expression // Range pattern
| Expression // Expression pattern -- value/reference equality test against Expression
;
Note that @AnthonyDGreen discusses a syntax for the rest of an array pattern; it could be applied to other similar patterns such as the tuple pattern.
Scope of Into-introduced variables
The Into-introduced variable should be in scope for the When clause.
For a pattern defined within a CaseStatement, the scope of the Into-introduced variable should (at least) be the body of the Case.
Select o
Case String Into s
Console.WriteLine(s)
End Select
If the pattern is defined in a BooleanExpression, and the BooleanExpression is part of the test of a block (Do While ..., If ... Then etc.), the variable should certainly be in scope for the child block immediately following the test:
If o Matches String Into s Then
Console.WriteLine(s)
End If
Do While o Matches String Into s
Console.WriteLine(s)
Loop
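For comparison, the scoping described above can be approximated in current VB with TryCast. This is only a sketch of one possible lowering, not something the proposal specifies; the module and function names are made up for illustration. Note that s has to be declared in the enclosing scope here, which is what an Into-introduced variable would avoid.

```vb
Option Strict On
Imports System

Module IntoScopeSketch
    ' Hypothetical equivalent of:
    '   If o Matches String Into s Then ... End If
    Function DescribeIfString(o As Object) As String
        ' TryCast returns Nothing when o is not a String.
        Dim s As String = TryCast(o, String)
        If s IsNot Nothing Then
            ' Only inside this block is s known to hold the matched value.
            Return "String of length " & s.Length.ToString()
        End If
        Return "no match"
    End Function

    Sub Main()
        Console.WriteLine(DescribeIfString("hello"))
        Console.WriteLine(DescribeIfString(42))
    End Sub
End Module
```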
(On whether the Else block should also be in scope, see open question #2 at the end.)
Initialization of Into-introduced variables
The variable should be initialized when the match is successful:
Do While o Matches String Into s
Console.WriteLine(s)
Loop
but not when the match is not successful:
Do Until o Matches String Into s
' Should report here "A variable has been used before it has been assigned a value."
Console.WriteLine(s)
Loop
or when it might not have been successful:
Do
' Should report here "A variable has been used before it has been assigned a value."
' because the match hasn't succeeded until after the first iteration
Console.WriteLine(s)
Loop While o Matches String Into s
As follows:

| Construct | Guaranteed initialization |
|---|---|
| If .. Then<br>If .. Then .. Else<br>If .. Then .. End If<br>If .. Then .. Else .. End If<br>If(..) operator | Within Then part / block |
| Do While .. Loop<br>While .. End While | Within the loop |
| Do Until .. Loop<br>Do .. Loop Until | Never |
| Do .. Loop While | No guarantee, because of the first iteration |
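The "within the loop" row can be illustrated in current VB. The sketch below is a hypothetical lowering of a Do While ... Matches ... Into loop, assuming the items come from an array; TakeLeadingStrings and the module name are illustrative only. The match is tested before each iteration, so the body never sees an unassigned s.

```vb
Option Strict On
Imports System
Imports System.Collections.Generic

Module IntoLoopSketch
    ' Hypothetical equivalent of:
    '   Do While o Matches String Into s
    '       ... use s ...
    '       o = <next item>
    '   Loop
    Function TakeLeadingStrings(items As Object()) As List(Of String)
        Dim result As New List(Of String)()
        Dim index As Integer = 0
        Do While index < items.Length
            Dim s As String = TryCast(items(index), String)
            ' A failed match ends the loop before the body can touch s.
            If s Is Nothing Then Exit Do
            result.Add(s)
            index += 1
        Loop
        Return result
    End Function

    Sub Main()
        Dim taken As List(Of String) = TakeLeadingStrings(New Object() {"a", "b", 3, "c"})
        Console.WriteLine(String.Join(",", taken))
    End Sub
End Module
```

With Do ... Loop While, the same test would run only after the body, which is why no guarantee is possible on the first iteration.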
... BooleanExpression like this?

... (Dim rnd As Random Or Nothing), then I think it should be disallowed. If the syntax resembles nullable value types (Dim rnd As Random?), then perhaps it should behave as some kind of pseudo-type, and the pattern o Matches Random? should desugar to o Matches AnyOf(Random, Nothing). Any Into-introduced variables would also have to be similarly desugared: o Matches Random? Into r -> o Matches AnyOf(Random Into r, Nothing Into r); and the compiler would treat it as an exception to (3) below. TBH it feels like more trouble than it's worth.

... AnyOf / AllOf suggestion.

Into should be disallowed within the parts of the OR pattern, because there's no way to use an identifier which doesn't come from all the OR's sub-patterns.
o Matches AnyOf(String Into s, Integer Into i) ' Either s or i is uninitialized
Unless all parts of the OR define the identifier with the same type:
o Matches AnyOf(1 To 10 Into i, (Integer Into i, String))
' On either side of the OR, i refers to an Integer, and is initialized
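To see why this case is safe, here is a rough hand-written equivalent in current VB. It rests on several assumptions the proposal does not settle: that the range pattern only matches a boxed Integer, that the tuple is a ValueTuple(Of Integer, String), and that the String sub-pattern rejects Nothing. TryMatch is a hypothetical name. Every path that returns True assigns i first.

```vb
Option Strict On
Imports System

Module AnyOfSketch
    ' Hypothetical equivalent of:
    '   o Matches AnyOf(1 To 10 Into i, (Integer Into i, String))
    Function TryMatch(o As Object, ByRef i As Integer) As Boolean
        ' First alternative: an Integer between 1 and 10.
        If TypeOf o Is Integer Then
            Dim n As Integer = DirectCast(o, Integer)
            If n >= 1 AndAlso n <= 10 Then
                i = n
                Return True
            End If
        End If
        ' Second alternative: a tuple of (Integer, String).
        If TypeOf o Is ValueTuple(Of Integer, String) Then
            Dim t As ValueTuple(Of Integer, String) = DirectCast(o, ValueTuple(Of Integer, String))
            If t.Item2 IsNot Nothing Then
                i = t.Item1
                Return True
            End If
        End If
        Return False
    End Function

    Sub Main()
        Dim i As Integer
        Console.WriteLine(TryMatch(5, i))
        Console.WriteLine(TryMatch(New ValueTuple(Of Integer, String)(42, "x"), i))
        Console.WriteLine(TryMatch("neither", i))
    End Sub
End Module
```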
The selling point for When is that it offers room to put additional logic to customize the pattern:
Select o
Case String Into s When DateTime.Now.Hour > 6
End Select
It's only natural that the Into-introduced variable be in scope and initialized within the When clause:
Select o
Case String Into s When s.Length > 0
End Select
But for a BooleanExpression containing a pattern:
If o Matches String Into s When s.Length > 0 Then
End If
allowing When may not be necessary, because we could extend the scope of the Into-introduced variable to the end of the BooleanExpression, and then use AndAlso:
If o Matches String Into s AndAlso s.Length > 0 Then
End If
FWIW, this is what C# does, instead of allowing when as part of the pattern as described in the above spec. The following will not compile:
if (o is string s when s.Length > 0) { }
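The AndAlso form relies on short-circuiting: the right-hand side runs only after the match has succeeded, so the variable is always assigned by the time it is read. A minimal sketch of the same idea in current VB, with a hypothetical function name:

```vb
Option Strict On
Imports System

Module AndAlsoSketch
    ' Hypothetical equivalent of:
    '   If o Matches String Into s AndAlso s.Length > 0 Then
    Function IsNonEmptyString(o As Object) As Boolean
        Dim s As String = TryCast(o, String)
        ' AndAlso short-circuits, so s.Length is only evaluated
        ' when the type test has already succeeded.
        Return s IsNot Nothing AndAlso s.Length > 0
    End Function

    Sub Main()
        Console.WriteLine(IsNonEmptyString("abc"))
        Console.WriteLine(IsNonEmptyString(""))
        Console.WriteLine(IsNonEmptyString(42))
    End Sub
End Module
```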
In short, the two choices AFAICT are:
- allow When within a BooleanExpression, or
- extend the scope of Into-introduced variables until the end of the parent BooleanExpression, allowing AndAlso clauses to make use of the variable

Should the Else block also be in scope for an Into-introduced variable from the test of the starting If block? And, if the If has a NOT pattern, should the variable also be initialized?
If Not(o Matches String Into s AndAlso s.Length > 0) Then
' s is in scope, but uninitialized
Else
Console.WriteLine(s)
End If
It might be simpler just to say that the logic should be flipped around in this case. But VB has a history of not requiring the logic to be adjusted: Do While Not can be expressed in terms of Do Until, as can Do ... Loop While Not in terms of Do ... Loop Until.
Similarly, if a NOT pattern is used in one of the Case blocks, should the variable be in scope and initialized in the subsequent Cases?
Select o
Case Not(String Into s)
Case Else
Console.WriteLine(s)
End Select
What about Do Until Not(String Into s)?
Ping @ericmutta @franzalex @paul1956 @bandleader @AnthonyDGreen @AdamSpeight2008 @gilfusion on the implications of Into-introduced variables in pattern matching, including scope and initialization.
@Happypig375 Yes, but not Not( ... ).
@Happypig375 It's really a broader question -- which takes precedence, an expression or a pattern? I guess in order to avoid breaking changes, an expression should take precedence.
At least for Not, there is another possible resolution: allow NoneOf to apply to a single pattern as well, with the same effect.
Alternatively, even though Not(0) is a breaking change, it's a very small one.
@zspitz this is a very lively and thorough discussion which would be awesome if any of it had a chance of being implemented, especially given the recent announcement that there are no plans to evolve the language anymore.
Until we as a community figure out the language evolution problem, I am afraid the vblang repo won't have much use and any discussions will be purely theoretical!
@ericmutta Agreed. But I hope things will be resolved eventually for the better, and at that point this discussion will become very useful. Also, I need to get the details of pattern matching out of my system, so I can continue working on something else.
@zspitz But I hope things will be resolved eventually for the better...
I salute your optimism :muscle:
I've been thinking about the way TypeScript is implemented (it compiles down to JavaScript) and perhaps we as a community could develop a VB pre-processor that compiles *.vbx files. These files would contain a superset of VB that is translated into the current VB language which is then compiled by Roslyn.
Much of the stuff we would like to see in VB (including this pattern matching business) boils down to syntactic sugar that could be implemented "the TypeScript way" (i.e. by a pre-processor extension). If others like this concept perhaps we can create a separate issue and discuss further :+1:
@ericmutta That will also need editor support and tooling support.
@tverweij indeed! I am glad you read the comment above, I was trying to tag you yesterday but it didn't work when commenting from my phone.
I remember you have started working with a company that has experience with programming languages and tools development. Perhaps this is something they could consider?
Rather than an entirely new IDE and change in name and a $500 price tag, if they could do it "the TypeScript way" (i.e. as a Visual Studio extension that pre-processes VBX into VB) and charge say $99, I reckon most people would just buy the extension without thinking too hard.
The extension shouldn't introduce complex changes (e.g. the type system should be left alone), it should leverage Roslyn as much as possible, and the focus should be on the syntax sugar stuff like the pattern matching being discussed here.
Please talk to RemObjects and let us know whether this makes business sense for them :+1: :pray:
@ericmutta: I read everything here. And see what we can adopt and what not.
The extensible idea from @VBAndCs and now from you is a nice idea, but almost not implementable. At least not for us in the existing toolchain where Mercury is being added.
But three things: First: we are going to extend the language - much. What is added and what not is not decided yet. But this one has a really big chance, although it will be implemented the same as the C# implementation. Second: We have AOP for all languages for custom code generation. So this will work for Mercury too. Third: With Mercury, forget about Roslyn. This is not a Microsoft language, so there is no Roslyn compiler.
But first we have to come to the point where we are on par with Visual Basic. We are working hard on that now.
@ericmutta: That idea would also destroy the reason of the Mercury project. We started well before Microsoft declared VB dead/zombie. The goal of the project is 2-fold:
Your idea would solve 1. but not 2. as the compiler would still be the limited Roslyn VB compiler. So adding things like unsafe, inline, lazy and reference returns would still be impossible.
@ericmutta - last part, I promise :-)
About 2. - We are already compiling for macOS (Cocoa), Linux (native), Windows (native), .NET, .NET Core and WebAssembly with the partially implemented version.
Much of the stuff we would like to see in VB (including this pattern matching business) boils down to syntactic sugar that could be implemented "the TypeScript way" (i.e by a pre-processor extension).
@ericmutta A pre-processor like Typescript needs a lexer, parser, [binder,] lowerers, and a source emitter. In other words, much of the components of a compiler like Roslyn, with the most significant difference being a source emitter instead of an IL emitter. Thus, such a pre-processor should likely re-use Roslyn for the lexing, parsing, binding, etc. Once doing that, why not build your changes into Roslyn -- isn't that less difficult?
Of course the difficulty would be getting MS to accept those changes, when it wants to minimize the resources it spends on VB. I'm not sure they could be convinced to let the community maintain VB (including docs, etc.) just like F#.
Another option would be simply forking Roslyn and configuring VS to use that forked version. @AnthonyDGreen once described how to do that here.
Another option would be simply forking Roslyn and configuring VS to use that forked version. @AnthonyDGreen once described how to do that here.
Or build a language server that could choose between the original and the fork, and use VS Code or some other editor.
@tverweij ...last part, I promise :-)
Many thanks for the clarification :-) I see the goals for Mercury are much broader than what I was thinking about and it would be interesting to see how that works out (competing against popular, high quality free tools from a trillion dollar corporation is not going to be easy but somebody has to try :-))
@bandleader A pre-processor like Typescript needs a lexer, parser, [binder,] lowerers, and a source emitter.
That's the trouble with this whole situation: language tools need a lot of work and it's highly unlikely to be done (effectively) on a part-time basis by people who have day jobs, mortgages and no experience in the field of compilers. The barriers to entry are high even before worrying about what Microsoft will or will not accept.
Unless @AnthonyDGreen's articles (thanks @aarondglover for sharing) miraculously reach the powers that be and convince them to keep evolving VB, I think the course taken by @tverweij with project Mercury is the only potential alternative for now. So hustle hard @tverweij you may very well save the day :muscle:
What features / attributes should a VB.NET pattern matching syntax have?
I propose the following:
- ... If ... Then, Do While ... blocks and the like.
- ... Case block or the block following the match expression
- ... When clauses for additional conditions

I've written some examples, and a proposed spec in line with these goals here. But the purpose of this issue is to discuss whether these goals are valid and relevant.