Open omentic opened 1 year ago
Is it really necessary to change the language syntax for this? Maybe it would be better to standardize macros that implement the same thing on the existing syntax?
The current variant objects are really quite unconvinient, and it would be nice to have syntactic sugar over them to create algebraic types.
🤔
That doesn't mean much beyond the fact that people avoid dependencies for various reasons.
So what does this RFC accomplish that e.g. fusion/matching does not?
You don't need to teach me about ML and Rust and Racket and Foozbaz, I know. You need to tell me why you found the existing solutions based on macros lacking.
fact that people avoid dependencies
We agree.
So what does this RFC accomplish that e.g. fusion/matching does not?
The union
type is new. Something similar exists as a macro in beef/fungus
, but the implementation differs from how it would be implemented in the compiler for performance and semantic reasons.
fusion/matching
also provides comprehensive structural pattern matching, there is no feature discrepancy between the two. This RFC is an attempt to integrate it into the language more: by means of reusing existing case syntax and pulling things that would require specific new syntax and not help with exhaustiveness into where
clauses (this is described in more detail in the first paragraph of the notes section above).
After work, I'll write up a detailed comparison between the two because I think it reveals some interesting design decisions. But a place to look for now, if interested, is the difference between this RFC's use of where
guards vs. fusion/matching
's use of until
, all
, some
, @
. Revisiting fusion also reminds me I forgot to describe how tables are matched. It's exactly as you would expect and consistent with tuples etc, I'll add this later too.
Also, a little note on other languages: despite Racket's match statement being the most powerful of the bunch and Python being the closest language to Nim syntactically, I took the least inspiration from them. I find Python's match to be grotesquely complex for relatively little benefit, and think where
clauses a la Swift to be a much nicer solution. This proposal is also purposely less powerful than Racket: Racket's ability to match on repetitions of arbitrary nested list expressions requires special syntax and is only particularly helpful to a list-oriented language. Nonetheless, it can be done with where
clauses if needed (this is a trend...).
Very nice!
ADT's are to Nim case
types what scalars are to vectors - you can express everything an int
can express using a seq[int]
but doing so is cumbersome in code and misses out on many of the assumptions you can make and therefore on the help that the language and compiler can bring.
It's true that the proposed type sort of is a special case of a case object with a specific form of enum, exactly one discriminator and no other members - but that special case is useful in many constructs which logically express "one and only one of these things" - the rest is followup on top of that concept.
Rather than asking "can macros solve this?", the better question is "can we create better macros with this feature present", and there the answer is undoubtedly yes, because it's such a fundamental construct that appears over and over in many code patterns.
Many things become more simple when you can these assumptions about a type: as mentioned already: resetting or overwriting during serialization, exhaustive case statements, etc.
Here's a simple example which relies on the knowledge that a type takes on exactly one shape: because the type is guaranteed to be exactly one of the enumerated options, we can reason about that option without specifying the full type but still express a meaningful relationship to it. In Result[T, E]
this is often useful because you want to be able to express a type that matches any Result[T, X]
where X is known but T
is not to express the idea that you are working with "the error side of type X of any Result" without knowing the exact type of the result (when returning an error you don't care about the "value" side - this is supplied by the context in which the error is being returned, not the error return itself) - such a type can safely be converted to a "full" Result
because it is known per language semantics that Result itself takes one and only one of the given options.
I generally like the idea and I was one of the complainers about the PM situation. But as I said in Discord, if this ever gets accepted, who is going to take ownership and implement it. Especially PM? I really hope that the roadmap is unaltered and IC still comes right after 2.0.
So what does this RFC accomplish that e.g. fusion/matching does not?
At the very least, it will be a workaround for this :P https://github.com/nim-lang/Nim/issues/20435
How would the proposed syntax distinguish between a single field that is a tuple and multiple fields? Using tuples for the latter is not nice imo. I'd prefer something closer to the syntax for constructing/matching the variants (which is also how Rust and Swift do it):
type Shape = union
Point()
Line(int)
Square(side: int)
Rectangle(x, y: int)
Triangle(TriangleKind)
In my unwritten proposal I used this syntax:
type
Tree = ref case kind # also possible without the `ref`
of StringLeaf:
str: string
of IntLeaf:
x: int
of Parent:
kids: seq[Tree]
No new keyword required and the syntax is optimized for pasting into a case
statement.
@konsumlamm Currently:
type Shape = union
Rectangle: tuple[x, y: int]
type Shape = union
Rectangle: tuple[tuple[x, y: int]]
I picked this syntax to be as close to Nim's existing semantics as possible while still providing the full power of ADTs. I like your syntax more aesthetically, but IMO it doesn't fit in with tuple / object types as well. Also note that the inner type of Rectangle in both of our examples is tuple[x, y: int]
: even if it's not explicitly specified in yours.
@Araq I assume the unwritten proposal was mostly for syntactic sugar over the existing variant object syntax? A benefit of the union
proposal and ADTs as a whole is that they let you abstract over arbitrary types, not just named fields of objects. (eg. you don't really need the labels str
, x
, kids
in your Tree implementation, just the types)
eg. you don't really need the labels str, x, kids in your Tree implementation, just the types
That's a good point but easily fixed:
type
Tree = ref case # also possible without the `ref`
of StringLeaf:
string
of IntLeaf:
int
of Parent:
seq[Tree]
The advantages remain: Natural Nim syntax, no new keyword, easily turned into a case
statement.
I disagree that that's natural Nim syntax, I find it subjectively very ugly, the lack of a field following case
inconsistent, and the raw types in branches weird. I don't think being easily turned into a case
statement is a particular advantage, and it really doesn't follow type declaration syntax or semantics (I would be quite thrown off learning Nim for the first time).
I'm not particularly attached to the union
keyword but I think introducing a new keyword to make it a first-class type on the level of object
/ tuple
/ enum
would be worth the tradeoff from a language design perspective.
I suppose we could also do this leaving eg. nested
unambiguously available for use elsewhere, though I don't like it.
type Tree = ref nested enum # oops need indirection
StringLeaf: string
IntLeaf: int
Parent: seq[Tree]
I care about keeping the of
branches, if you want to change case
to sumtype
or union
I don't mind it. Though both sumtype
and union
are ugly. Maybe enum
?
type
Tree = ref enum # also possible without the `ref`
of StringLeaf:
string
of IntLeaf:
int
of Parent:
seq[Tree]
type Tree = nested enum
StringLeaf: string
IntLeaf: int
Parent: seq[Tree]
is wrong for two reasons: You need the ref
indirection in order to nest it. And StringLeaf: string
is not a field of type string
, it is a branch that supports an unnamed string
field.
That kinda goes against the existing semantics of enum being explicitly an enumeration though, and also conflicts a bit semantically (not in syntax) with the existing feature that lets you set labels of an enum to values. My original approach was to overload the enum
type but I found it conflicted too much with existing expectations of enum
.
Now that I think about it more, I really don't like the idea that you can copy and paste the of
branches into a case
statement actually: most of the times when I use ADTs in other languages I don't want to handle all the cases, and/or want to handle particular cases differently, via structural pattern matching or guards or what not. The extended examples I picked for this RFC are a bit contrived in that they are for interpreting a language where you have to handle every case without restrictions. I think that'd kind of lead people in the wrong direction wrt. pattern matching.
re: ref
, oops. ref
indirection is actually supported and necessary in this RFC btw, there's some examples in the collapsed Code Examples block.
And StringLeaf: string is not a field of type string, it is a branch that supports an unnamed string field.
Yeah, even though they're comparable to objects and the syntax looks similar, they're not. This is one of the few parts that differs from existing semantics. I personally think it's fine - concepts
do the same thing and introduce different-than-objects but still very consistent syntax within their type declaration.
I think that'd kind of lead people in the wrong direction wrt. pattern matching.
Actually, this is how you ensure correctness, having thought about it, you handle every case explicitly. It's the "oh and in all other cases (else) do ..." that produces bugs. In the Nim compiler and everywhere else.
Maybe case object
?
If a non-macro solution for ADTs/Pattern Matching is finally being considered, then my vote goes to a rust/haxe like syntax only. A lot of the syntaxes suggested above are confusing / error prone / ugly just like the existing object variant syntax (imo).
ML style ADTs would be a great idea and I won't quibble about the syntax but I worry that too much conditional freedom on each match arm eg. x where typeof(x) is ...
will cripple exhaustiveness checking which is biggest unseen benefit of ADT's. I would be happy with much less pattern matching expressivity in return for the compiler flagging missing patterns. IMO previous efforts like fusion/matching
and patty
do too much.
Have you looked D's sumtype? It is inspired by ADTs but leans heavily into static introspection and would be a nicer fit for Nim vs. trying to copy Haskell/OCaml.
Have you looked D's sumtype? It is inspired by ADTs but leans heavily into static introspection and would be a nicer fit for Nim vs. trying to copy Haskell/OCaml.
D's sumtype
doesn't support nested patterns, which is half the point of pattern matching and would make a lot of things more cumbersome, so I don't think that's an approach we should follow.
D's
sumtype
doesn't support nested patterns, which is half the point of pattern matching and would make a lot of things more cumbersome.l, so I don't think that's an approach we should follow.
It's more cumbersome but when you can only destructure on one thing at a time static analysis is much easier, you can get better IDE support, possibly even exhaustiveness checking, and maybe even case splitting like Idris.
possibly even exhaustiveness checking
Nim's case
statement already does exhaustiveness checking and I have no intention of moving to something that lacks it.
Nim's case
statement exhaustiveness checking is pretty brittle:
type
AnEnum = enum
A
B
let e = A
case e
of A: echo "A"
else:
case e
of B: echo "B"
Here I get the error:
/tmp/testcase.nim(11, 3) Error: not all cases are covered; missing: {A}
It's not "brittle", it works as intended. I'm also not familiar with any mainstream language that offers exhaustiveness checking over nested case
/match
statements in the way that you outline.
Not mainstream, and not completely generally but with a ton of effort Haskell does. But the point is by making the pattern matching more "cumbersome" but also more analyzable you can get this but without most of the implementation complexity.
@deech Ah, I'll write up a little thing on this with more detail. But basically: I'm very aware of this, and designed it so that the where
statement is used for things that cannot be exhaustively checked.
@metagn I don't love case object
- I think that gives the impression that it is fundamentally an object, which it isn't. Maybe case enum
though? Though I do much prefer a new keyword.
Though I do much prefer a new keyword.
A new keyword would be nice if we could find a good one but so far I don't like any of the proposed new keywords. FWIW Swift reuses "enum" and Swift in general chooses nice names.
Skipping most of the syntactic details (union
vs case kind
vs case enum
vs sumtype
etc.) as they constitute about 5% of the actual problem in question (after deciding on placement of the library/language addition, specific list of semantics that would be available and so on)
Maybe it would be better to standardize macros that implement the same thing on the existing syntax?
Is it really necessary to change the language syntax for this?
case
statement macros that current let's try to standardize" solution overload existing language syntax needlessly.
That doesn't mean much beyond the fact that people avoid dependencies for various reasons.
True, which means that if there were to be any serious talk about having a "standard" solution for pattern matching, it would have to be in the standard library or in the language, otherwise why even bother, someone can write just another library.
At some point, people start having ideas that a particular feature would be very useful if it was readily accessible and try to promote its addition and more widespread use. Given the existing track record (writing library makes it half-dead on arrival) and belief that addition of such feature to the language would be a net benefit, it makes sense to suggest addition to the core language.
So what does this RFC accomplish that e.g. fusion/matching does not?
@thing
syntaxes as one of them)In the interest of shipping this feature, I would strongly suggest that this RFC focuses on the abstract data type definition which loosely can be described as a variant object with exactly one case
in it.
I would go on to leave pattern matching for a completely separate RFC and treat / discuss it orthogonally.
The reason for this is simply that the two affect completely different parts of the language - the former enriches the type system with a new construct while the latter focuses on syntactic convenience now made possible - you can imagine for example that we might decide on shipping the data types in the language and relegating matching to libraries / macros (in std lib or not doesn't really matter - as can be seen in any modern language with a good package manager, people have 0 problems with using dependencies in general as long as the barrier of entry is low - this is mostly a cultural problem local to languages that lack good dependency management integration).
I have updated the RFC: adding a section on matching sets + tables, and adding a section on an equivalent to Rust's if let
for easy unwrapping when you don't want to pattern match. Thanks @elcritch for the syntax suggestion.
@arnetheduck As far as implementation goes, I agree: the work will definitely largely separate into union types + unwrapping vs. pattern matching for structural types. But union
types are useless without some form of structural pattern matching: even if this isn't comprehensive to begin with, full pattern matching should probably be discussed here to avoid creating conflicts in the implementation in the future.
I'd be wary of special pattern matching syntax for tables. Afaik, currently, there is no builtin syntax for tables (the Table
type is not builtin, while set
and seq
are).
fact that people avoid dependencies
it would have to be in the standard library
Ive seen code " include
'ing by path " the pattern matching lib from the compiler tests/
folders. 🤣
But union types are useless without some form of structural pattern matching: even if this isn't comprehensive to begin with, full pattern matching should probably be discussed here to avoid creating conflicts in the implementation in the future.
I'd generally agree. Without pattern matching sum types would only provide convenient constructors which macros can provide. I'd still think it'd be great to have but would feel lacking. Really the implementation could ship in multiple parts, but having even a basic pattern matching syntax agreed upon would be useful. Maybe skip where
clause semantics for a part 2.
Importantly having pattern matching syntax in the language would prevent splintering and requiring folks to learn multiple different macro based versions. I already didn't want to mix Fungus and Patty in a project as they have different semantics.
Also I don't believe it's possible to do the if shape as Rectangle(x, y)
without compiler changes IIRC. ElegantBeef and I were trying some hacks, but it was late night.
BTW, here's another possible syntax:
type
Tree1 = ref enumeration
of StringLeaf:
str: string
of IntLeaf:
x: int
y: int
of Parent:
kids: seq[Tree]
Tree2 = ref enumeration
of StringLeaf: string
of IntLeaf: tuple[int, int]
of Parent: seq[Tree]
The downside is that it's slightly confusing at first why there'd be an enum
and enumeration
that'd be easy to explain: simple enum, complex enumeration. Personally I like having both a simple enum and a complex enumeration.
@konsumlamm
I'd be wary of special pattern matching syntax for tables.
It should be for contains()
and []
operator obviously, not for Table type.
Aka {6: x}
is it.contains(6)
and x = it[6]
are useless without some form of structural
I think my point is more that pattern matching should take the whole language into consideration and not only union types (ditto union types themselves). One can easily imagine pattern-matching over other things than union types, including value matching / conditionals / etc and treating them here in the same RFC not only risks blocking union types on bikeshedding of pattern matching but indeed risks not giving due attention to each of them in the context of the rest of the language.
I don't disagree that it's good to keep both things in mind during initial design, but union types are useful on their own (because of the expressivity they add to the type system) and pattern matching isn't limited to them.
I think my point is more that pattern matching should take the whole language into consideration and not only union types (ditto union types themselves).
If only we had capable language designers who could have anticipated that... ;-)
Good points. I wonder if there's a way to enable just a simple x as Some(y)
type syntax using an overloaded operator? With that and case macros one could easily build most (all?) of the pattern matching bits as a library. If that's possible as an operator? Maybe a tuple override.
Then it'd just be settling on a union / sum type / enumeration / maybe-an-adt syntax. That should be easy. ;)
Then we could get sample PRs to get the ball rolling.
If only we had capable language designers who could have anticipated that...
That just allows you to write macro that is syntactically overlapping with case statement ... so basically the link should point to the macro section actually
@j-james I think aside from comparison with hmatching on features you can also ask for feedback on forum/discord like what sucks?, What you don't like, what could be added, what you didn't understand how to use, what feels off -- just some quick reactions
Yes, that kind of feedback is very welcome (along with all other feedback)! I've posted the link to this a few times but have not explicitly asked people to go through and evaluate it yet, will do so in the morning.
Does a language implementation of something like https://github.com/alaviss/union/ or like Crystal's runtime union types (but not implicit or merged with union typeclasses) solve the problems listed in the issue?
type
A = object
x, y: int
Foo = union(int, A)
# under the hood an object variant, but structural and commutative
assert Foo is union(A, int)
proc `$`(f: Foo): string =
if f of int: # reuse object inheritance check syntax
result = $int(f)
elif f of A:
let f = A(f)
result = "A(" & $f.x & ", " & $f.y & ")"
# or as case statement
case f
of int: $f # automatic type narrowing? doesn't really make sense but not essential
of A: "A(" & $f.x & ", " & $f.y & ")"
# cases exhausted
var f: Foo
# A is convertible to Foo
f = A(x: 1, y: 2)
assert $f == "A(1, 2)"
# so is int
f = 456
assert $f == "456"
# recursion and distinct types
type
StringLeaf = distinct string
IntLeaf = distinct int
ParentTree = distinct seq[Tree]
Tree = ref union(StringLeaf, IntLeaf, ParentTree)
let tree = Tree(ParentTree("abc"))
(I expect recursion here to work because the following code works, but it might not be reliable)
This way each branch is tied to a type, construction and destructuring are much simpler to implement (and not ambiguous with existing case
statements like every pattern matching solution described above), you don't need to worry about field names or whatever, you can even do things like A of Foo
that allow for more introspection.
Is this sound, or is there maybe a problem it doesn't solve?
That just allows you to write macro that is syntactically overlapping with case statement ...
I know you don't appreciate the feature but I don't know why that is. It unifies the potential different pattern matching DSLs and is as beneficial as having a unified syntax for assignment and equality for generic code.
unifies the potential different pattern matching DSLs
It only unifies a single keyword, everything else is up to the macro author to write, so it can be hardly called 'unifying'.
Assignment and equality seem to have different syntax: = vs == and they don't allow the user to put on complex custom DSL in the body.
In order for the unification claim to hold true we must ensure that every DSL implemented with case statement follows the same patter and largely has the same semantics, which more or less implies some standardized approach for this, in which case it does not make any sense to aim for several different macros that are "unified" yet different somehow, as opposed to a single standard solution.
Let's assume I have a custom type with two fields that I want to be able to match against an array-like pattern:
type
Pair = object
x, y: int
var p: Pair
macro `case`(p: Pair; u: untyped) = ...
case p
of [x, _]:
Now if I happen to use something more complex than int
inside Pair
I can call the case macro for type T inside my case macro for type Pair
. That's the power of a unified interface.
Just like you can lift ==
for seq[T]
as long as T
has a ==
.
Abstract
Proper sum types and structural pattern matching for Nim by the introduction of
union
and extension ofcase
with structural matching and andwhere
clauses. Heavily inspired by Rust, Swift, Racket, Python, and others, and adapted to Nim's particular type system and syntax.Motivation
I write a lot of Nim code, and I write a lot of Rust code. I find both to be excellent languages with their own unique tradeoffs and specialties. But there is one thing in particular I greatly miss when going from Rust to Nim: and that is algebraic data types and pattern matching.
Algebraic data types (ADTs), for those unfamiliar, are an extremely powerful data structure that are useful for modelling a wide variety of types. They allow, by the composition of product types and sum types (as well as ordinary types), the modelling a wide variety of different data structures.
Nim does have algebraic data types -
objects
may be composed arbitrarily. Their application to product types is straightforward: product types are directly analogous to structs (or in Nim, standard objects). Sum types complement product types and can be thought of as their dual: they act as a wrapper around different types, letting you interact with their inner structure only by safely pattern matching upon the container. They are best represented by tagged unions, however, which Nim lacks a direct implementation of.Variant objects are capable of modelling tagged unions and thus sum types. These are useful as, well, variants of objects, but have a number of serious drawbacks when used as sum types that the first portion of this RFC aims to address.
As for pattern matching: Nim has the case statement, which alone is sufficient to make use of the inner contents of an sum type (or currently, switch on a
kind
label). But structural pattern matching is the bread and butter of ADTs: aside from simply being cleaner than nestedif
statements, it provides guarantees of exhaustiveness not otherwise given by the alternative. It is perhaps best exemplified by the ML family of languages (including Rust): but other languages have picked it up offering their own takes on it, notably Racket, Swift, and Haxe. This RFC pulls from a number of different designs to provide a powerful and Nim-esque model.Description
This RFC is two-fold. It introduces a new fundamental type,
union
, with similar semantics to Rust'senum
. It also extends Nim's existing case statement with support for structural pattern matching and particularly that of unions/objects/tuples (although it supports lists as well), and introduces a new supplementalwhere
clause in order to cleanly and powerfully do so.union
is a type that functions similarly toenum
at a surface level. It takes a list of identifiers as fields/variants/values, and an instance of a union may only ever be one of the fields. However, each field of the union may additionally be of its own entirely separate type (or none). This is best exemplified with the classic "shape" example:These fields of the unions are unordered, unqualified (via the same mechanism as overloadable enums), and unique. The inner fields are each of distinct types: entries used solely for their label (ex.
Point
) are implicitly ofvoid
type. Instantiating a union, then, is done by picking a particular variant, and providing data for the construction of its associated type (or none) as needed. The syntax of this is again best shown with an example. As a special syntax rule, tuples and objects may drop their extraneous parentheses on construction.To support proper and powerful pattern matching, we introduce three new extensions to the language: an extension for existing
case
/of
statements to support (nested) structured types (array
,seq
,tuple
,object
,union
), an extension forof
expressions in regular conditionals to support the same, and a newwhere
expression that functions as a reusable "guard". Just likeunion
, these language features combine to provide power with minimal new syntax. Following existing case semantics, this pattern matching may be exhaustive (to be described later), or non-exhaustive and followed byelif
andelse
statements.Elements of a tuple may be matched in two ways: by providing a value, which will match all tuples with that value at that location, or by binding them to a new identifier, which will match all tuples with any value at that location, and will then be made available as a variable within the branch. New variables introduced by pattern matching in this fashion follow the mutability of the matched-upon statement they are a part of. As is the case elsewhere, binding an element to the
_
identifier discards it.where
guards allow for restricting the matching of a case based on an arbitrary conditional expression. They have access to any bound fields: and so can be used to restrict variables to a value while still allowing access to that value among other, arbitrary conditions.where
clauses follow the main match section of anof
line and are considered to be attached to the main match rather than theof
itself to allow for the reuse of labels to follow existing case semantics: see the example below. Of important note is the intersection between variable bindings and repeated cases: in order for a variable to be accessible within the body of the case, it must be bound (in a consistent location) in every label of that statement. We do not disallow this at a binding level because the bindings are still helpful forwhere
clauses.The exhaustiveness of a pattern match is determined on a best-effort attempt. Fields of booleans, enums, and unions consist of few possibilities and may be exhaustively matched upon. Guards do not contribute to the exhaustivity of a search as they may contain arbitrary conditional expressions: and so must typically be followed by a case for the same match without the guard, or
else
.Named tuples and objects are matched identically to tuples, with one exception: the identifier name when binding a field must be the name of the original field. If this is undesired, the syntax
oldident: _ as newident
is introduced to allow for explicit rebinding. Unions are matched by specifying their label, and then their inner type in parenthesis. Tuples and objects within a union may elide their parenthesis.Previous thoughts on matching sequences, arrays, and sets are collapsed in the below block. I no longer think they're terribly important. They are also not updated for the new backwards-compatible
_ as x
syntax.Matching sequences, arrays, sets
10 and x[10] == true: ... # lists may be arbitrarily nested, just like every structured type of [[1, 2, 3], [4, 5, 6], [7, 8, 9]]: ... # match an arbitrary list of integers of x where typeof(x) is openarray[int]: ... # match a list of entirely 1s: of x where x.all(y => (y == 1)): ... # the following three statements match identically: of [_, _, _]: ... of x where x.len == 3: ... elif list.len == 3: ... else: ... ``` Matching sets and tables mostly follows list semantics. Both are wrapped in braces instead of brackets or parentheses. As the elements of sets and tables are never named: the `:` does not function as it does in tuples and objects, instead, it takes its regular semantics within tables to denote a key-value pair. Of additional note is the `..` operator: as sets and tables can be of unknown length it is helpful: however, as they are unordered, the `..` operator *must* come last. ```nim func sets(bitset: set[int8]) = case bitset # match the set containing exclusively 1 of {1}: ... # match a set containing at least 1 of {1, ..}: ... # match a set containing only two distinct elements of {x, y}: ... # match a set containing any two distinct elements fulfilling a condition of {x, y, ..} where x + y == 10: ... else: ... func tables(table: Table[int, string]) = case table # match a table containing only these key-value pairs of {5: "foo", 6: "bar"}: ... # match a table containing at least these key-value pairs of {5: "foo", 6: "bar", ..}: ... # match a table containing any key matching a condition of {x: y, ..} where x == y.len: ... ```While the exhaustivity of pattern matching is important: frequently you want to disregard it, and only worry about a single case. Rust provides the
if let
construct for this use case, which I find very useful but pretty gross syntactically. Instead, this RFC uses the syntaxif shape of Square(side: binding)
. The bindings become immediately available for use within the body of the if statement and within any parts of the if statement following that clause, as shown by the following examples:Following the semantics above, these extended
case
andif ident of match
statements should also allow unwrappingOption[T]
types. What this looks like is not yet specified.Code Examples
Algebraic Data Types
An example that perhaps more showcases the utility of ADTs is that of an abstract syntax tree for the simple lambda calculus. ```nim type Ident = string type Expr = ref union Constant: Term Variable: tuple[id: Ident] # Annotation: tuple[expr: Expr, kind: Type] Abstraction: tuple[param: Ident, fn: Expr] Application: tuple[fn, arg: Expr] Conditional: tuple[cond, `if`, `else`: Expr] ``` Note that other structures needed for an implementation of the simple lambda calculus (including a type system) are also efficiently and aesthetically modelled with ADTs. ```nim type Type = union Empty, Error, Unit, Boolean, Natural, Integer, Float, String Enum: seq[Type] Struct: Table[Ident, Type] Function: tuple[from, to: ref Type] type Term = union Unit Boolean: bool Natural: uint Integer: int Float: float String: tuple[len: uint, cap: uint, data: seq[uint]] Enum: tuple[val: uint, data: seq[Type]] Struct: Table[Ident, Term] Function: Expr ```Pattern Matching
```nim var globalSymbolTable = initTable[Ident, Term]() func execute(ast: Expr): Term = case ast of Constant(_ as term): return term of Variable(id): return globalSymbolTable[id] # other structures of applications + abstractions are invalid of Application(Abstraction(param, fn), arg): globalSymbolTable[param] = execute(arg) return execute(fn) of Conditional(cond, `if`: _ as ifClause, `else`: _ as elseClause): if execute(cond) == Boolean(true): return execute(ifClause) else: return execute(elseClause) else: echo "Failed to execute AST! Improper structure" quit() ```Backwards Compatibility
This RFC is almost fully backwards compatible with Nim. It will cause issues with anything using the new
union
orwhere
keywords, and with any templates or macros directly inspectingcase
orof
.A previous version of this RFC had an issue around pattern matching: the previous syntax was
of x
, wherex
is a newly bound identifier. This conflicted with the existing semantics to resolve constants in those expressions. This was fixed by changing pattern matching to require explicitly rebinding the catchall character_
, i.e._ as name
, with syntax sugar in selected cases (named tuples & structs).Notes
To end with some notes: the general design of
where
guards is to provide all of the parts of pattern matching that cannot be exhaustive, in one general construct. Hence the use ofwhere
for things regarding ranges, lengths, existing variables, generic types (although: type constraints can be exhaustive. this is specific enough that i thought it not worth special syntax).where
was also chosen over reusingif
because I wanted to emphasize that there is not a correspondingelif
/else
as it is, already, what amounts to anelif
clause within the broader case statement.I think I like
oldident: newident
for field renaming. It matches object constructors.oldident as newident
is an alternative, however.I only mentioned it briefly, but because I've seen it discussed in other implementations of pattern matching in Nim: the mutability of the introduced bindings while pattern matching follows from the mutability of the matched-upon object, that is, you're directly mutating the matched-upon object. Other languages follow this pattern and I see no reason why this + variable shadowing if you really want to make it mutable are good enough.
I'm considering allowing guards to be reusable on
for
blocks a la Swift: withfor binding in iterator where condition
acting as syntax sugar to the commonfor binding in iterator: if condition
. Is this desired? Is this good or bad for language consistency? Are there other placeswhere
clauses could be used?The
else
statement andcase _
are functionally the same. I believe this is currently the case.References and Acknowledgements
match
statementwhere
statementmatch
statementMany thanks to
fungus
: which implements a near-identicalunion
and quite similarmatch
already as macros. Many thanks topatty
: particularly for their "does not yet work" section, ha. Many thanks to Beef, for having good taste and calling my stupid ideas stupid.