Closed alehander92 closed 3 years ago
I have nothing in particular to add, except for the fact that pattern matching is often done on algebraic data types. Nim today has almost algebraic data types, via variant objects, but:
If pattern matching lands in stdlib, it would be nice if at least the second of these issues is solved. If I remember correctly there is another RFC for that.
Other than this, I would be happy with just putting gara in the stdlib, as it seems the most complete option today
Ideally, pattern matching should work well with AST manipulation, since recognizing structural patterns in the AST tends to be 50% of a macro's job.
That being said, I've only ever needed the kind of pattern matching provided by something like py-good. I can't recall ever having needed it anywhere else.
A lot of popular programming languages (almost all with exception of Go/C that position themselves as "simple" languages with very few functional features) either already have support for pattern matching or existing proposals.
I'm not a big functional programmer myself, but i think Nim is already a really good functional programming language - closures, sum types, optional types etc. Support for custom operators and UFCS makes transition even more seamless. Pattern matching is an important addition to the list if features that would allow nim to position itself as a functional programming language.
caseStmtMacro
I think pattern matching in stdlib should be implemented using caseStmtMacros
that are currently experimental, but quite usable (I tried implementing pattern matching and there was no immediate issues). The reason - this way pattern matching would be just an extension of case
statement and won't introduce any new concepts/keywords to the language. Can be added as another feature in sugar
module.
Second - I think it is important to keep list of supported features relatively small - e.g. no predicate validation (match field only if it satisfies some kind of predicate), unification etc. The reason - due to potential number of extensions to concept of pattern matching it is very easy to end up with feature creep where macro creates whole new semantics for case statement. Although some additions to pattern matching would certainly be useful. For example - allowing abitrary predicate extensions seems like a little too much for me, but checking if value is in set kind: in {enEE, enEE1}
will certainly be useful in a lot of situations.
It is also important to not require any modifications to existing types - reuse existing len
and `[]`
overloads as well as kind
fields. This makes certain assumptions about what functions are implemented for object, but I would argue this is a good thing. E.g. further promoting idea that all int-indexable (arrays, sequences etc.) shoud implement `[]`
and len
, all case objects should have kind
etc. as it would only lead to more consistent API across all code.
Pattern matching on key-value tables (json for example) might be really practical in a lot of scenarios. Possible syntax:
assertEq "000", case %{"hello" : %"world"}:
of {"999": _}: "nice"
of {"hello": _}: "000"
else: "discard"
NimNode
One of the main use cases for pattern matching in nim is macros. NimNode
probably the most used case object and large portion of macro is just re-parsing of input structures. Pattern matching should account for this.
In addition to making nim node pattern matching easier it will also make macros more declarative (for simple macros): it would be easier to understand what kind of input is expected and what will be generated from it.
I'm currently working on my own pattern matching library based on case statement macros. You can see example tests here - https://github.com/haxscramper/hmisc/blob/master/tests/tMatching.nim
I originally wanted to PR it as a stdlib module but then decided to first ask for feedback on general idea/syntax. Right now it is a little over-engineered and has dependencies on other part of hmisc
. If it can be accepted as stdlib module I will make it self-contained and PR it into stdlib.
case [1,2,3,4]:
of [_]: fail()
of [_, 3, _]: fail()
of [_, 2, 3, _]:
discard
Support for regular elif
branches in case
stmts
assertEq "default fallback", case (a: 22, b: 90):
of (_, b: 91): "900999"
elif "some other" == "check": "rly?"
elif true: "default fallback"
else: raiseAssert("#[ not possible ! ]#")
There are some things that I still have on todo list such as
*<patt>
) for lists*(<patt>)
in non-trailing format - [$a, *_, $b]
to match first and last element in sequence.[]
pattern requires len
to be equal - in a lot of situations it is necessary to just check first couple of items. Can be implemented using *_
with special treatment for this pattern (e.g. just don't check anything) but I would like to have something closer to [ , ] ...
with trailing ellipsis. But that is just preference mostly.nimtrs
where I have distinction between $scalarVariable
that can only be assigned once and @listVariable
which will store all appended elements.NimNode
is a most common use case, so it might be a good idea to adapt some solutions from rust.More programming languages using Pattern Matching:
@haxscramper cool! but please take a look at impl-s of patty/ast-pattern-matching/gara . They do some of those, so this might be cool. Also please keep in mind the goal of zero overhead : expanded code should be usually similar to hand-written condition/extraction(not always happening in those libs tho)
Awesome!
@haxscramper thanks and Also: can you write down a bit more detailed test spec/proposal here or in your own rfc : that was one of the goals of the rfc in my mind, to discuss a more official dsl
@alehander92
After some thinking I came to conclusion it is not necessary to specifically annotate captured variables with prefix since it really differs from any other nim dsl in stdlib. I started with $capturedVar
but now I think it is not really necessary. Type of matched variable can be inferred based on pattern context - e.g. variable is assigned (=
) if used in regular pattern and appended (.add
) if it is inside of repetition *(<pattern>)
although I've done this in nimtrs and current version of matching I think it is not that good of a solution in general.
Capture and submatch can be done using infix is
, equality comparison (any other prefix?) will be rewritten in infix notation, e.g. fld: == SomeVal
to match field against value will be rewritten into ... .fld == SomeVal
. Capture and match can be written as fld: captured == SomeVal
for infix operator checks (in
, ==
) or as fld: capt is SomePattern( ... )
Although I think this is a really slippery slope because there will be a lot of different behaviors of the match depending on written expression (e.g. == Val
vs var == Val
vs var is Patt
) so it is necessary to be really careful here. As I already mentioned it is really easy to just sit for a while and come up with dozens of possible nice things that can be added (regex pattern extraction, custom unpackers, predicatess, regex-like pattern matching for sequences (like in nimtrs) and so on).
One thing I think that is really necessary is to make it look closer to pattern matching in functional language and not enum case-of v2.0. Default nim convention for enum naming is to have abbrValueName
and ObjectName
. I think it should be possible to omit the abbr
prefix. Eg. I want to write IfStmt
instead of nnkIfStmt
etc.
Create value to compare against
macro e(body: untyped): untyped =
case body[0]:
of ForStmt([$ident, _, $expr]):
quote do:
9
of ForStmt([$ident, Infix([%ident(".."), $rbegin, $rend]),
$body]):
quote do:
`rbegin` + `rend`
else:
quote do:
90
let a = e:
for i in 10 .. 12:
echo i
assertEq a, 22
I would prefer to write `expr`
instead of %expr
but right now it seems it is not possible to add arbitrary expressions inside of quoted part. Syntax is just taken from nimtrs
. Since it is not possible to differentiate newValue()
from KindOfCaseObject()
I decided to just prefix it with %
for now.
Iflet macro
macro ifLet2(head: untyped, body: untyped): untyped =
case head[0]:
of Asgn([$lhs is Ident(), $rhs]):
quote do:
let expr = `rhs`
if expr.isSome():
let `lhs` = expr.get()
`body`
else:
head[0].assertNodeKind({nnkAsgn})
head[0][0].assertNodeKind({nnkIdent})
head[0].raiseCodeError("Expected assgn expression")
ifLet2 (nice = some(69)):
echo nice
ifLet2 (`nice` = some(69)):
echo nice
It uses some of my own node kind assertions, but this can be removed when design is finalized. Although due to complexity of pattern matching I think having something like this to show errors would be nice. But that is a topic for another RFC/PR.
Unexpected node kind. Expected one of {nnkIdent} but found nnkAccQuoted
112 echo nice
113
116:12 ifLet2 (`nice` = some(69)):
^~~~~~
|
nnkAccQuoted
if/else
or using nested pattern matches. Some things are used often enough (check if in set or equal to some specific value) so they should be supported out of the box.make it easily swapable for other pattern matching lib
If case statement macro will be moved to 'stable' part of the language (e.g. no longer hidden behind experimental switch) this can be done quite easily I think.
Patty
I don't think matching non-tuple types by position only is a good idea as it makes matches really fragile in case new fields are added (which is much more likely to happen than with regular anonymous tuple).
ast-pattern-matching
Supports only nim nodes
@alehander92 just addition to whatever I wrote above to RFC comment for pattern matching - I think it is better to just take as much potential ideas for syntax and features and then sort through them to determine if this is really necessary. But what I'm really certain about several things
case
The only thing I really care about is supporting if conditions with the pattern like so:
case n.kind
of foo and n.len == 4:
actionA
else:
actionB
This might require a language extension as a macro can only support it well via code duplication.
@Araq, This is already supported thanks to UFCS - I just generate access path to the field in object, without differentiating between fields and function calls
macro e(body: untyped): untyped =
expandMacros:
case body:
of Bracket(len: in {3 .. 6}):
newLit(expr.toStrLit().strVal() & " matched")
else:
newLit("not matched")
echo e([2,3,4])
echo e([3, 4])
Generated code
block:
let expr = body
if kind(expr) == nnkBracket and contains({3..6}, len(expr)): newLit(
strVal(toStrLit(expr)) & " matched")
else:
newLit("not matched")
@haxscramper thanks!
Pattern matching should be an extensio of case
sounds good
Unification is overkill in most cases
iirc it seemed not hard to implement, but this is not a top goal indeed
Special syntax for variable captures should be carefully considered - is it really necessary or will it only make code less readable
makes sense: I'll look at the bigger comment later
Omitting abbreviated enum name when matching kind is a huge plus for readability.
yep!
please keep in mind that probably nim nodes should work in a similar way to other variant types(even if nim nodes are defined as a more special type): we don't want to overspecialize for one case !
Also, exhaustiveness checking would be useful even if not for all possible situations :+1:
@alehander92
There is zero specific distinction for Nim node in a way I currently implement it - I have a little hack that just takes .kind
type implementation and determines correct prefix that should be added if necessary.
Exhaustiveness check is possible in theory but only for simpler patterns and it would make things more complicated as it would be necessary to either write custom checker or somehow lift kind
fields and do more sophisticated rewrite to take advantage of case
checks. Right now this is a simple rewrite to series of if branches with no specific checks.
iirc exhaustiveness checking isn't really that hard . Also we want to take advantage of generation of case if possible: pattern matching should be zero overhead. Of course those are not top priorities, so probably they don't need to be implemented now
basically for compound patterns you need to compose the simpler checks and if you already have those: this might not be too hard, but this is just old brainstorming, so I might be very wrong about it
tl;dr - maybe I'm missing something, but to me it seems like extremely difficult task.
Whether or not it is possible to check for exhaustiveness is a question that should be delayed until after we have finalized set of supported features. For example set checks in {3..5}
would require answering questions like "does this level of pattern tries all values or something is not matched yet". Things like ("hello", _)
, [2 | 3, _]
will also make it necessary to handle arrays, tuples etc. Due to how matching is done for function (e.g. example with len
) it will be even more complicated.
disclaimer: don't really look hard into this: maybe you're right and it's not so useful for Nim !
hm, I might not understand all the details .. but this means that it might be good to design the dsl in a way that it makes it a bit more prone to exhaustiveness checking.
but what's the problem with set / array checks? we would have just several data structures to check for, I guess
and yeah, as it's not only about the shape, the analysis can always be a bit more conservative, but a useful thing would be to just autogenerate a case which is not being handled as a warning: this is out of scope tho, so I am just discussing ideas
More todo
$a
for variable captures and replace it with just a
for variable, add non-trailling pattern matches.{2}
means len(set) == 1 and contains(set, 2)
, {2, *@others}
means len(set) >= 1 and contains(set, 2)
followed by for val in set: if val != 2: other.incl val
Replacing special prefixes with keywords?
any
instead of .._
any @var, until Patt()
after variadic instead of *@var, Patt()
any @var, incl Patt()
instead of **@var, Patt()
# With special keyworkds like `add`, `until`, `incl`
[any @leading, until @middle is "d", any @trailing]
[any, Patt(), any]
# With prefix annotations like `*`
[*@leading, @middle is "d", *@trailing]
[.._, Patt(), .._]
Variadic match would use any
, search-until is until
or incl
. For optional match use opt
. Capture wll be in form <kwd> @<var>
or <kwd> @<var> <op> <rhs>
(@middle is "d"
for example)
And just in general would you rather prefer DSL to introduce keywords or work with special symbols like ?@
, *@
, .._
?
We will still have opt @var ? default
and other things, but in general it would be <kwd> @<var>
or <kwd> @<var> <infix> <rhs>
And <fld>: <prefix> <rhs>
or <fld> @<var> <infix> <rhs>
for field/kv-pair captures
(in {1, 2, 3})
(== 123)
(fld: in {2, 3, 4})
(fld: is Patt())
(fld: @captured is Patt())
case (true, false):
of (true, true) | (false, false): 3
else: 2
Matching of fields is performed using (fld: <expr>)
for both objects and tuples.
To match case object you can either use Kind( ... )
to match only instances of single kind or (kind: in <set of values>)
to test against multiple possible values. It is not necessary to add enum prefix - e.g. to match nnkIfStmt
you can write IfStmt( ... )
.
Object matches can contains either fld: ...
matches or [elems]
for cases when you need to check items accessible via []
operator.
ForStmt([@ident, _, @expr])
..
as expression: ForStmt([@ident, Infix([== ident(".."), @rbegin, @rend]), @body])
and capture loop variable, @rbegin .. @rend
and @body
Matching optionas ((fld: Option[T])
)
fld: @opt ? "default"
Thanks to UFCS I can treat functions as regular fields, making it possible to have patterns like
macro e(body: untyped): untyped =
case body:
of Bracket([Bracket(len: in {1 .. 3})]):
newLit("Nested bracket !")
of Bracket(len: in {3 .. 6}):
newLit(expr.toStrLit().strVal() & " matched")
else:
newLit("not matched")
echo e([2,3,4])
echo e([[1, 3, 4]])
echo e([3, 4])
[2, 3, 4] matched
Nested bracket !
not matched
Sequence elements can be either matched exactly (for example [_]
to match sequence with exactly one element) or using variadic expressions.
Right now I have two ideas for expression syntax - first one relies on use of symbols like *
and the other one introduces more keywords. Example of uses
[any @elems]
or [*@elems]
[any @leading, until "d"]
or [*@leading, "d"]
[any @leading, with "d"]
or [**@leading, "d"]
[_, _, any]
or [_, _, .._]
[_, _, opt @trail]
or [_, _, ?@trail]
[_, _, opt @trail ? "default"]
or [_, _, "default" ?@trail]
[add @leading, until @middle is "sep", add @trailing]
IfStmt
: IfStmt([add ElseIf([@cond, _])])
I'm more in favor of keyword-heavy syntax because **@
, *@
, .._
is looks worse than any , until
, any, with
. Trailing ellipsis might be added as alternative to standalone any
2
and 3
is in set: {2, 3}
2
: {2, .._}
9
and capture all other elements: {9, @other}
"key"
: {"key": _}
{"key": @val}
case %{"hello" : %"world"}:
of {"999": _}: "nice"
of {"hello": _}: "000"
else: "discard"
of (true, true) | (false, false): 3
@haxscramper Just to clarify: Is this different from of (true, true), (false, false):
, and if so, would the comma version still work?
@hlaaftana No it is not different and comma version should work the same way it works in regular case.
So
of (true, false), (false, true):
of (true, false) | (false, true):
Should be identical
Another use case for pattern matching is input data validation. One of the main use cases for pattern matching is macro implementation.
Use isnot
as optional clause to execute when something does not match. It is not really useful for pattern matching with multiple branches (because not matched branch does not signal that input data is invalid), but should be really handy when doing basically anything related to external input validation (json, nim node etc.)
I withdraw my previous statement about support for custom predicates - it is not really that hard to implement, so there is no reason to avoid it, but it will be very useful for input validation.
@var.matchesCond()
- check if element matches condition and bind it to var
_.matchesCond()
- only check predicateI think pattern matching (data destructuring) should at least have all features of data synthesis (object construction, json input, array comprehension etc), so support for validation is a good thing to have.
{
# Match pattern, otherwise execute `doError`
"key": @a is ("2" | "3") isnot doError(),
# execute regular code
"key": @a is ("2" | "3") isnot (
echo "Expected `2` or `3`"
),
# Do nothing on fail
"key": @a is ("2" | "3"),
# Execute callback if match failed
"key": "2" | "3" isnot doError(),
# Check for match
"key": "2" | "3",
"key": _.isString() isnot (echo "Expected string for key"),
"key": @a.isInt() isnot (echo "expected integer")
}
(
fld: ForStmt([
# First element must be an `nnkIdent("..")`
== ident("..") isnot (
echo "Expected infix `..` on ", it.lineInfoObj()
),
# Second element should have kind `IntLit()` and will be bound
# to injected vaeriable `a`
@a is IntLit() isnot (
echo "Expected integer literal `..`", it.lineInfoObj()
),
@a is (
IntLit() or StrLit()
) isnot (
echo "Expected either integer literal or string but found " &
$it.lineInfoObj()
)
])
)
https://github.com/Originate/lodash-match-pattern
is Patt()
: check if matches patternin Set
: is in set== Val
: Is equal to valueadd @var
: only in sequence patterns. Add all elements to var
until input end reached or next pattern matchedadd, until Expr
: match all until frist Expr
add, incl Expr
: match all including all Expr
opt @a
: match optional element and bind it to a
opt @a ? "default"
: either match element to a
or set a
to "default"To me it looks good enough, so most likely I will be implementing this or something really simiar to it.
Introduce stdlib module matching
Make caseStmtMacros
a non-experimental feature
Overload for match
macro with following prototype:
macro match*(
n: tuple | object | ref object | seq | array | set
): untyped
New matches
macro with following prottype:
macro matches*(
n: tuple | object | ref object | seq | array | set,
expr: untyped,
makeDiagnostics: bool = false
): untyped
assertMatch
template
template assertMatch*(input: typed, pattern: untyped): untyped
MatchDiagnostics
objec
type
MatchDiagnostics = object
matchFails*: seq[tuple[path: Path, diagn: string]] ## Position of
## the the first match failure. If multiple alternatives are tried
## (`1 | 2`) may contain more than one path. `diagn` is a string
## describing match failure
matches
is a version of match
that works for single checks, evaluating either as true or false if makeDiangostics
is false. If makeDiagnostics
is true tuple[matchOk: bool, matchDiagn: MatchDiagnostics]
is returned.
assertMatch
is a wrapper on top of match with semantics similar to
template assertMatch(input: typed, pattern: untyped): untyped =
let (matchRes, diagnostics) = matches(input, pattern, makeDiagnostics = true)
if not matchRes:
raise newMatchError(diagnostics)
All variables injected by matches
are available in the scope, so you can write assertMatch(<user input>, [all @elems is JString()]
and just use elems
down the line, knowing this is a array of json strings.
Similar to flexible serialization RFC #247, nimtrs several kinds of collections are supported which are differentiated based on pattern match syntax.
[Patt1(), Patt2(), ..]
. Must have len(): int
and [int]: T
defined. 1(Patt1(), Patt2(), ..)
. 2{Key: Patt()}
. Must have [Key]: T
defined. Key
is not a pattern - search for whole collection won't be performed.{Val1, Val2, .._}
. Must have contains
defined. If variable is captured then Val1
must be comparable and collection should also implement items
and incl
.(field: Val)
. Case objects are matched using Kind(field: Val)
or, if you want to check agains multiple values for kind field (kind: in {set, of, values})
Where expr
is a result of evaluation for case head. If case head is an identifier it will be used as-is in pat subsitution. Otherwise let expr = <case-head>
will be injected in scope. expr
is not gensymed, making it possible to access it when necessary.
(fld: "3")
: Match field fld
against "0"
. Generated access is expr.fld == "3"
.["2"]
: Match first element of expression agains patt. Generate acess expr[0] == "2"
("2")
: Same as sequence{"key": "val"}
: First check "key" in expr
and then expr["key"] == "val"
. No exception on missing keys, just fail match.It is possible to have mixed assess for objects. Mixed object access via (gg: _, [], {})
creates the same code for checking. E.g ([_])
is the same as [_]
, ({"key": "val"})
is is identical to just {"key": "val"}
. You can also call functions and check their values (like (len: _(it < 10))
or (len: in {0 .. 10})
) to check for sequence length.
is
is considered final comparison and just pasted as-is into generated pattern match code. E.g. fld: in {2,3,4}
will generate expr.fld in {2,3,4}
(fld: is Patt())
- check if expr.fld
matches pattern Patt()
(fld: _.matchesPredicate())
- if call to matchesPredicate(expr.fld)
evaluates to true.Notation: <expr>
refers to any possible combination of checks. For example
fld: in {1,2,3}
- <expr>
is in {1,2,3}
[_]
- <expr>
is _
fld: is Patt()
- <expr>
is is Patt()
(fld: 12)
If rhs for key-value pair is integer, string or identifier then ==
comparison will be generated.(fld: == ident("33"))
if rhs is a prefix of ==
then ==
will be generated. Any for of prefix operator will be converted to expr.fld <op> <rhs>
. 3(fld: in {1, 3, 3})
or (fld: in Anything)
creates fld.expr in Anything
. Either in
or notin
can be used.Match can be bound to new varaible. All variable declarations happen via @varname
syntax.
(fld: @varname)
(fld: @varname <operator> Value)
first perform check using <operator>
and then add Value
to @varname
(fld: @hello is ("2" | "3"))
fld: @a.matchPredicate()
fld: @a(it mod 2 == 0)
. If expression has no type it is considered true
.Bind order: if check evaluates to true variable is bound immediately, making it possible to use in other checks. [@head, any @tail != head]
is a valid pattern. First match head
and then any number of @tail
elements. Can use any _(if it != head: tail.add it)
and declare tail
externally.
Variable is never rebound. After it is bound, then it will have the value of first binding.
Pattern | Ijected variables |
---|---|
[@a] |
var a: typeof(expr[0]) |
{"key": @val} |
var val: typeof(expr["key"]) |
[all @a] |
var a: seq[typeof(expr[0])] |
[opt @val] |
var a: Option[typeof(expr[0])] |
[opt @val or default] |
var a: typeof(expr[0]) |
(fld: @val) |
var val: typeof(expr.fld) |
Input sequence: [1,2,3,4,5,6,5,6]
Pattern | Result | Comment |
---|---|---|
[_] |
Fail | Input sequence size mismatch |
[.._] |
Ok | |
[@a] |
Fail | Input sequence size mismatch |
[@a, .._] |
Ok, a = 1 |
|
[any @a, .._] |
Error | 4 |
[any @a.(it < 10)] |
Ok, a = [1..6] |
Capture all elements that match |
[until @a == 6, .._] |
Ok | All until first ocurrence of 6 |
[all @a == 6, .._] |
Ok a = [] |
All leading 6 |
[any @a.(it > 100)] |
Fail | No elements > 100 |
[none @a.(it in {6 .. 10})] |
Fail | There is an element == 6 |
until
: non-greedy. Match everything until <expr>
until <expr>
: match all until frist element that matches Exprall
: greedy. Match everything that matches <expr>
all <expr>
: all elements should match Exprall @val is <expr>
: capture all elements in @val
if <expr>
is true for every one of them.opt
: Single element match
opt @a
: match optional element and bind it to aopt @a or "default"
: either match element to a or set a to "default"any
: greedy. Consume all sequence elements until the end and succed only if any element has matched.
any @val is "d"
: capture all element that match is "d"
none
: greedy. Consume all sequence elements until the end and succed only if any element has matched.Greedy patterns match until the end of a sequence and cannot be followed by anything else.
For sequence to match is must either be completely matched by all subpatterns or have trailing .._
in pattern. 5.
Sequence | Pattern | Match result |
---|---|---|
[1,2,3] |
[1,2] |
Fail |
[1, .._] |
Ok | |
[1,2,_] |
Ok |
More use examples
[all @elems]
[until @a is "d"]
[all @leading is "d"]
[_, _, .._]
. 6[_, _, opt @trail]
[_, _, opt @trail or "default"]
[until @leading is "sep", @middle is "sep", all @trailing]
IfStmt([all ElseIf([@cond, _]), .._])
Input tuple: (1, 2, "fa")
Pattern | Result | Comment |
---|---|---|
(_, _, _) |
Ok | Match all |
(@a, @a, _) |
Fail | 7 |
(@a is (1 | 2), @a, _) |
Error | 8 |
(1, 1 | 2, _) |
Ok |
Input AST
ForStmt
Ident "i"
Infix
Ident ".."
IntLit 1
IntLit 10
StmtList
Command
Ident "echo"
IntLit 12
ForStmt([== ident("i"), .._])
Only for loops with i
as variableForStmt([@a is Ident(), .._])
Capture for loop variableForStmt([@a.isTuple(), .._])
for loops in which first subnode satisfies predicate isTuple()
. Bind match to a
ForStmt([_, _, (len: in {1 .. 10})])
between one to ten statements in the for loop bodyInput json string
{"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{"value": "New", "onclick": "CreateNewDoc()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Close", "onclick": "CloseDoc()"}
]
}
}}
Get input ["menu"]["file"]
from node and
case inj:
of {"menu" : {"file": @file is JString()}}:
# ...
else:
raiseAssert("Expected [menu][file] as string, but found " & $inj)
Some(@x)
and None()
is a special case that will be rewritten into (isSome: true, get: @x)
and (isNone: true)
respectively. This is made to allow better integration with optional types. 9.
1 it might be possible to generate
input sequence if only items
is defined
2 you can, technically match arrays using the same syntax since generated access paths are the same
3 even though it opens possibilities for
various hacks it is advisable to use only comparison operators
like ==
, or =~
4 consecutiveGreedy
5 Trailing .._
does not generate exchaustive checks for sequence match, so you can
use it to do things like pattern match next tokens in lexer for
example. This is one of the main use cases that I have in min for
sequence matching - you only need to have len
and []
, so it should
be pretty easy to adapt anything basically
6 actually you can also use (len: _(it >= 2))
- it will
generate call to expr.len
which is perfectly valid. It is not
strictly necessary to only match sequences with [...]
and objects
with (fld: )
7 First match for a
has value 1
but second one should be equal to 2
8 Binding of variables to pattern that contains alternative is only supported if varible occurs once in the expression. Reason: there is no backtracking for match. First tuple element matches to 1
, but then fails on second element. To make correct match we would have to backtrack, rebind @a
to two and then try binding again.
9 if your
enum contains enSome
and enNone
you can match for elements using
enNone(fld: @val)
or (kind: enNone, fld: @val)
. Option is much
more common case so it takes priority and special-cased
@alehander92 initial implementation as described in the draft is mostly complete (sets and predicate calls (@a.matches
) are not finished & I have some other ideas that I need to test). I will be adding more unit test and searching for potential flaws in design/edge cases + improving documentation.
You can try current version by installing hmisc@#head
- right now implementation depends on some other modules from the library; I will of course clean this up before PRing it, but now I just used it to save time. Minimal standalone use example
In addition to syntax described above I would also like to introduce two operators
template `:=`*(lhs, rhs: untyped): untyped = assertMatch(rhs, lhs)
template `?=`*(lhs, rhs: untyped): untyped = matches(rhs, lhs)
Which would allow sequence and nested tuple unpacking, similar to #194, unpack and definesugar module. rust-like iflet
macro (even including some features from RFC) - all of that, basically for free (two-line implementation)
(@a, (@b, @c), @d) := (1, (2, 3), 4)
[all @head] := [1,2,3]
if (Some(@x) ?= some("hello")) and
(Some(@y) ?= some("world")):
assertEq x, "hello"
assertEq y, "world"
else:
discard
Implementation is completed (feature-wise) in nim-lang/fusion#33. Some additional notes in this comment, but everything else is complete (as far as I'm concerned)
This bug should be closed, pattern matching already exists and works.
I'm not sure if it's already too late to change stuff, but here are some opinions (<pattern>
means an arbitrary pattern):
Nested matching (and in general, the different kinds of patterns, like sequences, objects etc.) seems to have inconsistent syntax, for example to match for a nested pattern in objects, you need to use (EDIT: seems like this is not the case and you can just leave out the (field: is <pattern>)
, while in sequences, you can nest a pattern directly ([<pattern>]
).is
, so I don't see the point of having it)
Using operators (@a == 42
), predicates (@a.matchesPredicate()
) and trailing expressions with it
(@a(it mod 2 == 0)
) don't really fit in, it feels more like a tacked on, overloaded DSL. Instead, I think something like trailing guards (putting if <condition>
or and condition
after the pattern) and as-patterns (a pattern where the value is bound to a name) (EDIT: this already exists with is
) would be better.
Some
should allow matching for Some(<pattern>)
(but I assume something like that is planned). I also don't really like that Option
is special cased (it's in the stdlib, so it's not that bad, but still), with custom unpackers (see next point), that should be possible to solve.
An option to easily create new patterns would be nice (something like unpackers in gara or the unapply
method in Scala). This should also allow to implement matching on Option
s in a library, without special casing it, by creating custom patterns for Some
and None
.
Don't get me wrong, I love pattern matching and I appreciate the effort you make, but I think we can do better (especially since this is in fusion
now and will likely become the recommended pattern matching implementation for Nim).
It's not too late but further progress needs my upcoming "let/var inside expressions" RFC. IMHO.
I think this RFC can be closed, because there nothing else to add to it. "let expressions" should be discussed in a separate RFC (when it comes out), and particular details of pattern matching implementation should be discussed in fusion
.
Is it possible to make capturing matched values to variables not so ugly?
for example
const b = 2
case some_seq
of [a, (b)]: ... # b is constant and = 2
of [a, b]: ... # b = some_seq[1]
else: ...
instead of
const b = 2
case some_seq
of [@a, b]: ... # b is constant and = 2
of [@a, @b]: ... # b = some_seq[1]
else: ...
Current implementation won't be changed, but when/if we get let expressions this part of the syntax could be revised. Drop the requirements for @
prefix and instead require use ==myVarToCompare
when equality comparison is necessary.
A bit late to the party. It's strange that among other langauges the Elixir/Erlang was not mentioned. As it has one of the most powerfull and elegant pattern matching capabilities https://elixir-lang.org/getting-started/pattern-matching.html and https://elixir-lang.org/getting-started/case-cond-and-if.html
Prolog also wasn't mentioned, and it seems like elixir adopted a large portion of the list operations from its [head | tail]
syntax. Variable pinning via ^
is probably better than current @var
syntax. Patterns for function arguments is an interesting idea (and it can be implemented with fusion/matching
matching easily (someone already asked about this, so I have a working example)):
matchCall:
proc pattern(a: NimNode{Infix[Ident(strVal: == "+"), .._]}): NimNode =
echo "Matches plus"
result = newEmptyNode()
proc pattern(a: NimNode{Infix[Ident(strVal: == "-"), .._]}): NimNode =
echo "Matches minus"
result = newEmptyNode()
macro usesPattern(body: untyped): untyped =
return pattern(body)
usesPattern(12 + 2)
usesPattern(2 - 2)
Pattern matching for Nim
Pattern matching can be a good addition to the stdlib/fusion. It seems it becomes more mainstream:
This is an early version of a RFC: we need to decide what to do before having a more complete document.
We already have several pattern matching libraries :
A first question is, what do we want:
Motivation
TODO
Many more examples in gara, ast_pattern_matching and patty's docs: TODO when a general approach is decided on
Constraints
Research
It would be useful to research a bit more Rust, Haskell, OCaml, Python, Ruby, Scala, Erlang, Prolog, Elixir
We can also compare several possible designs like the Python folks did in their preparation for the feature.
Next steps
Future directions
Not sure
Disclaimer
I am the author of one of the libs gara, so I might be biased.
pinging @krux02 and @andreaferretti which probably have something to say on the topic