Open elegios opened 3 years ago
We discussed this a bit during the meeting yesterday and noted that an alternative could be to put guards inside patterns instead, and I said I'd consider how that might look. Here are my conclusions/suggestions based on that.
We could introduce a new `Pat`: `<Exp> with <Pat>`. This would be a pattern that matches any value, but additionally evaluates the expression and tries to match it against the given pattern, which could either fail or succeed. For example:
```
match tm with TmApp app & (isValue app.f with true) then
  ...
```
By the semantics of `&` we're first checking if `tm` matches `TmApp app`, then if it matches `isValue app.f with true`. The latter pattern never looks at `tm`; instead it evaluates `isValue app.f` (note that `app` is bound by an earlier pattern) and checks if the resulting value matches `true`.
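The semantics described above can be sketched as a tiny pattern matcher. This is only an illustration in Python, not MCore: the class names (`PVar`, `PLit`, `PCon`, `PAnd`, `PWith`), the tuple encoding of constructor values, and the `is_value` helper are all assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Env = dict  # names bound so far -> values


@dataclass
class PVar:   # binds the scrutinee to a name
    name: str


@dataclass
class PLit:   # matches one specific constant, e.g. true
    value: Any


@dataclass
class PCon:   # constructor pattern, e.g. TmApp app
    tag: str
    arg: Any  # sub-pattern for the payload


@dataclass
class PAnd:   # p1 & p2: both must match the same value, left first
    left: Any
    right: Any


@dataclass
class PWith:  # <Exp> with <Pat>: ignores the scrutinee entirely
    expr: Callable[[Env], Any]  # expression, modelled as a function of the environment
    pat: Any


def match_pat(pat, value, env: Env) -> Optional[Env]:
    """Return the extended environment on success, None on failure."""
    if isinstance(pat, PVar):
        return {**env, pat.name: value}
    if isinstance(pat, PLit):
        return env if value == pat.value else None
    if isinstance(pat, PCon):
        # Constructor values are encoded as (tag, payload) tuples in this sketch.
        if isinstance(value, tuple) and len(value) == 2 and value[0] == pat.tag:
            return match_pat(pat.arg, value[1], env)
        return None
    if isinstance(pat, PAnd):
        env1 = match_pat(pat.left, value, env)
        return None if env1 is None else match_pat(pat.right, value, env1)
    if isinstance(pat, PWith):
        # Evaluate the expression with the names bound so far, then match the result.
        return match_pat(pat.pat, pat.expr(env), env)
    raise TypeError(f"unknown pattern: {pat!r}")


def is_value(tm):  # hypothetical stand-in for isValue
    return tm[0] == "TmLam"


# TmApp app & (isValue app.f with true)
pat = PAnd(PCon("TmApp", PVar("app")),
           PWith(lambda env: is_value(env["app"]["f"]), PLit(True)))

tm_ok = ("TmApp", {"f": ("TmLam", "x"), "a": ("TmVar", "y")})
tm_bad = ("TmApp", {"f": ("TmVar", "g"), "a": ("TmVar", "y")})
```

Here `match_pat(pat, tm_ok, {})` binds `app`, while `match_pat(pat, tm_bad, {})` fails. Because `PAnd` threads the environment from left to right, the `with` pattern can refer to `app`, and it is never evaluated if the left pattern fails.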
Given this change we could then also change `match`; we could give it only a single pattern, and always match it against some dummy value, e.g., unit. The syntax would then be `match <Pat> then <Exp> else <Exp>`, which feels some sort of clean.
First, mlang semantic functions would not support these patterns; they don't fit in the analysis at present (there might be a useful middle ground here, but I haven't thought about it yet). This feels fine to me.
Second, this does not work with our parser: the syntax is not LR(1). The issue is that LR(1) needs to know what it has just parsed when it sees the next token after it, but, e.g., `Some (a, b)` is syntactically valid as both an `Exp` and a `Pat`, and we don't know which it should be until we see `with` (it should be an `Exp`) or `then` (it should be a `Pat`).
We could switch the order of `Pat` and `Exp`, making it `<Pat> with <Exp>` (or `<Pat> against <Exp>`, or whatever) instead. I think this is bad because of execution order and scoping: conceptually the expression is evaluated before we check it against the pattern (but the expression would be written after the pattern), and the pattern can bind names that are not in scope in the expression (since that would be a cyclic dependency), despite the expression being "after" the pattern. As such, I don't really like this option.
This is the conservative change (that breaks all our code, depending on whether we change `match` or not). We could add a keyword before a `with` pattern, e.g., `val <Exp> with <Pat>`. This fixes the LR(1) issue, and might also be a nice signal that this is a slightly unusual pattern.
Of course, if we make the change that `match` only takes a pattern, that means that we have to rewrite all our matches:
```diff
- match foo with bar then
+ match val foo with bar then
```
There is at least one tool that could do this rewrite automatically that seems to handle it well, so making the change should be relatively simple.
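To give a feel for how mechanical this rewrite is, here is a rough Python sketch. It is not the tool mentioned above; it is a plain regex substitution, and the function name `add_val` is made up. It assumes `match` only ever appears as a keyword, and it does not skip comments or string literals.

```python
import re

# A `match` keyword that is not already followed by `val`.
_MATCH = re.compile(r"\bmatch\b(?!\s+val\b)")


def add_val(src: str) -> str:
    """Insert `val` after every `match` keyword that lacks it."""
    return _MATCH.sub("match val", src)
```

Running `add_val` twice gives the same result as running it once, since matches that already start with `val` are left alone.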
There are two other variations here:
- Leave `match` unchanged. This means that there are two constructs in the language for matching a pattern against an expression, though one can only be used inside the other.
- Require the `Pat` in a match to be a `with` pattern and give it syntactic sugar, so `match foo with bar then` would actually mean `match val foo with bar then`.
`match foo with bar | baz with blub then` is a syntax error, but `match foo with bar | val baz with blub then` is valid, and might produce a different AST compared to `match val foo with bar | val baz with blub then`, depending on the precedence between `|` and `val ... with`. This will of course go away once `|` and `&` become prefix, which is a change we've already decided on.
This is the radical change that, surprisingly, I believe is backwards compatible. We could remove the `Pat` syntactical sort, and instead put all its productions in `Exp` (including the new `with`).
This is slightly less weird than it may originally seem: patterns are designed to look like expressions, they're just limited to a subset of them. This way we'd parse an arbitrary expression in a pattern position, then check that it's just terms we can handle (mostly constants and composite literals) and/or transform to the normal `Pat` AST type. I believe GHC uses this trick in its parser, so we'd not be the first. It's also possible that we could do something nice with our `syn`s, have `Pat` literally be an `Exp` that is limited to certain constructors, though I don't think our current thoughts can quite express this at present.
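The "parse as an expression, then check and convert" step could look roughly like the following. This is a Python sketch with a made-up, much smaller AST than ours; all class names and the `NotAPattern` error are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, List


# Expression AST (hypothetical subset).
@dataclass
class EVar:
    name: str


@dataclass
class ELit:
    value: Any


@dataclass
class ECon:    # constructor application, e.g. Some (a, b)
    tag: str
    arg: Any


@dataclass
class ETuple:
    items: List[Any]


@dataclass
class EApp:    # function application: valid expression, not a pattern
    fn: Any
    arg: Any


# Pattern AST (hypothetical subset).
@dataclass
class PVar:
    name: str


@dataclass
class PLit:
    value: Any


@dataclass
class PCon:
    tag: str
    arg: Any


@dataclass
class PTuple:
    items: List[Any]


class NotAPattern(Exception):
    """Raised when an expression in pattern position is not a valid pattern."""


def exp_to_pat(e):
    """Convert an already-parsed expression into a pattern, or reject it."""
    if isinstance(e, EVar):
        return PVar(e.name)
    if isinstance(e, ELit):
        return PLit(e.value)
    if isinstance(e, ECon):
        return PCon(e.tag, exp_to_pat(e.arg))
    if isinstance(e, ETuple):
        return PTuple([exp_to_pat(i) for i in e.items])
    raise NotAPattern(f"{type(e).__name__} is not allowed in a pattern")
```

With this shape, `Some (a, b)` is parsed once as an `Exp`, and only converted with `exp_to_pat` once the parser knows it sits in pattern position. The error is raised after parsing, so it can name the offending construct instead of reporting an unexpected token.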
In this world `match` would look as follows: `match <Exp> then <Exp> else <Exp>`.
For pattern things (`|`, `&`, `with`) appearing in expression-position we could do the same (error on them/remove them). We could also be slightly less radical, and only have this merged syntactical sort in a `match`: `match <MergedExpPat> then <Exp> else <Exp>`, but I kinda feel like we could get nicer error messages if we just had one `Exp`; a message saying that "`|` is only allowed in patterns" might be clearer than "Unexpected token `|`".
During the meeting yesterday we discussed a likely extension of our current `match` expression: the ability to perform several pattern-matches in sequence, using values bound in earlier patterns, then only evaluate the `then` branch if all matches succeed. This generalizes guards (it's essentially pattern guards in Haskell) and is quite simple to compile into. *Lower Your Guards* uses an equivalent form for their exhaustiveness and redundancy analyses so it should also be reasonable to do analysis on this form, though this is less obvious.
Example (using `and` to separate each match):
Without this feature guards are harder to encode since the failure case needs to be duplicated (or use some more fancy encoding):
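As a rough illustration of both points (this is a Python sketch, not the original example from this post), sequenced matches can be modelled as a list of expression/pattern pairs that thread an environment, with a single shared failure case. The names `match_seq`, `some`, `lit`, `lookup`, and `TABLE` are all made up for the sketch. The nested version below it shows the duplicated failure case.

```python
TABLE = {"x": 3, "y": -1}


def lookup(key):
    """Return ("Some", v) if the key is present, else ("None", None)."""
    return ("Some", TABLE[key]) if key in TABLE else ("None", None)


def some(name):
    """Pattern `Some name`: matches ("Some", v) and binds v to name."""
    def pat(value, env):
        return {**env, name: value[1]} if value[0] == "Some" else None
    return pat


def lit(expected):
    """Pattern that matches exactly one constant."""
    def pat(value, env):
        return env if value == expected else None
    return pat


def match_seq(steps, then_branch, else_branch):
    """Run each (expr, pat) step in order; later steps see earlier bindings."""
    env = {}
    for expr, pat in steps:
        env = pat(expr(env), env)
        if env is None:
            return else_branch()  # the one shared failure case
    return then_branch(env)


def classify(key):
    # Roughly: match lookup key with Some n <sep> n > 0 with true then ... else ...
    return match_seq(
        [(lambda env: lookup(key), some("n")),
         (lambda env: env["n"] > 0, lit(True))],
        then_branch=lambda env: f"positive {env['n']}",
        else_branch=lambda: "other",
    )


def classify_nested(key):
    # The same logic with nested matches: the failure case appears twice.
    r = lookup(key)
    if r[0] == "Some":
        n = r[1]
        if n > 0:
            return f"positive {n}"
        else:
            return "other"  # failure case, first copy
    else:
        return "other"      # failure case, second copy
```

Both functions agree on every key; the difference is only that `classify` states the failure branch once.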
However, we need to decide on a syntax for this feature. Suggestions thus far include:
- `and`. This makes `and` a keyword, which necessitates the change of a standard library function (or an intrinsic, I forget) name. It also lightly implies the possibility of using `or` in the same place, which is not as obviously a good idea, and would probably require adding parentheses or some other grouping construct.
- `next` (`match e1 with p1 next e2 with p2 then ... else ...`). This makes the sequencing obvious, but it forms a kind of strange sentence.
- `then` (`match e1 with p1 then e2 with p2 then ... else ...`). This means that we don't know if we're reading an expression to be matched against or the `then` branch of the `match` until we see `with` or `else`, respectively, which might have implications on readability/parsing.
- `,` (`match e1 with p1, e2 with p2 then ... else ...`). This is very light syntax, for better or for worse, e.g., `p1` is closer visually to `e2` than `e1`, even though it's destructuring the latter. On the other hand, values bound in `p1` are probably used in `e2`, so they still have something of a connection.
- `;` (`match e1 with p1; e2 with p2 then ... else ...`). See the previous point.

For the alternatives with a full word it's potentially easier to split them over multiple lines:
vs
or
Allowing a trailing separator could be nice for the non-word alternatives: