nim-lang / RFCs

A repository for your Nim proposals.
135 stars 26 forks source link

Rename case statement macro from match to `case` #332

Closed metagn closed 1 year ago

metagn commented 3 years ago

https://nim-lang.github.io/Nim/manual_experimental.html#case-statement-macros

The macro required to implement a custom case statement has to be named match. This is a very common name for macros and is usually used directly instead of via case. Nim allows stropping keywords for routine names. So, I propose match should be changed to `case`. The example becomes:

{.experimental: "caseStmtMacros".}

import macros

macro `case`(n: tuple): untyped =
  result = newTree(nnkIfStmt)
  let selector = n[0]
  for i in 1 ..< n.len:
    let it = n[i]
    case it.kind
    of nnkElse, nnkElifBranch, nnkElifExpr, nnkElseExpr:
      result.add it
    of nnkOfBranch:
      for j in 0..it.len-2:
        let cond = newCall("==", selector, it[j])
        result.add newTree(nnkElifBranch, cond, it[^1])
    else:
      error "custom 'case' for tuple cannot handle this node", it
  echo repr result

case ("foo", 78)
of ("foo", 78): echo "yes"
of ("bar", 88): echo "no"
else: discard

This experimental feature seems to be seldom used as people prefer to just use custom matching macros. This would help outline the difference and rid any incompatibilities (n: tuple is not actually AST of a tuple, it's the AST of a case statement). Further improvements could be done to this feature, for example changing the arguments (n: tuple) to (n: tuple, branches: varargs[untyped]), but I think renaming match to `case` first will make further changes much more straightforward.

Relevant line in compiler:

https://github.com/nim-lang/Nim/blob/25c75752d07fccc00682890875f3ce6c13f61d90/compiler/semstmts.nim#L867

Araq commented 3 years ago

It surely does look like a good idea.

juancarlospaco commented 3 years ago

Will this break the pattern matching macro ?. Easy to fix tho... :slightly_smiling_face:

metagn commented 3 years ago

Will this break the pattern matching macro ?.

Yes most likely, a compromise could be made where both case and match are checked for a bit, but since it's an experimental feature I don't think there's any need to bother with stuff like that. A small PR to fusion will be needed but I don't know if fusion really accepts PRs on existing modules?

Araq commented 3 years ago

We should make them non-experimental while we're at it.

haxscramper commented 3 years ago

match macro itself is not really exposed on pattern matching implementation, so it is unlikely to break anything, and the fix is literally one-line change. This also continues this trend of 'magical' names that are automatically picked up by implementation (like fld=, =destroy that have to be enclosed in backticks) which is also good for consistency reasons

I haven't found any bugs related to case statement macro, so I could say this feature is ready to be enabled by default.

timotheecour commented 3 years ago

unlikely to break anything

https://github.com/nim-lang/Nim/pull/16923#issuecomment-778085483

timotheecour commented 3 years ago

@Araq @hlaaftana Doesn't this change create an ambiguity?

I have caseStmtMacros macros that defines custom logic (not hypothetical, I have valid use cases), for handling strings: previously I could write:

match "abc"
of "foo": discard
of "bar": discard
else: discard

and then I'd know this would pickup the custom one. After this change, it's unclear whether the custom one or builtin one is used; if you have imports and compilation in the mix, this gets worse.

The fact you can only define the macro via a single name (match, or now, case) seems IMO like a spec bug. This also prevents defining 2 different case statement logics for the same type.

proposal

implement caseStmtMacros like ForLoopStmt, as follows:

# for
macro customFor(x: ForLoopStmt): untyped = ...
for a in [1,2].customFor: ...

# case OLD STYLE, deprecated
macro `case`(n: tuple): untyped =
case "foo"
of "foo1": discard
else: discard

# case NEW STYLE
macro customCase(n: CaseStmt): untyped = ...
case "foo".customCase
of "foo1": discard
else: discard

this seems strictly like a better idea:

metagn commented 3 years ago

I don't think caseStmtMacros is supposed to be able to have custom logic for types that Nim natively supports for case (integers and strings I think):

import macros

{.experimental: "caseStmtMacros".}

macro match(x: int) =
  # or x: untyped
  echo x.repr
let x = 3

case x
of 3:
  echo "good"
else:
  echo "???"
# output: good

The existing match(x: T) signature makes the case-statement-overloading version of it to be called like match x:. Since match is usually a macro that you would want to call like that, it being renamed to `case` is better.

CaseStmt by itself wouldn't work for overloading, you would need CaseStmt[T]. But I don't even think the current behavior is limited enough to the point where you need it, you can do this:

type CustomCase = distinct string # or CustomCase[T] = distinct T. actually works

template customCase(str: string): CustomCase = CustomCase(str)

macro `case`(customCase: CustomCase): untyped = # ...

case "foo".customCase
of "foo1": discard
else: discard
Araq commented 3 years ago

implement caseStmtMacros like ForLoopStmt, as follows

I originally considered that but decided that the use cases are different enough not to bother with this idea. But maybe I was wrong.

timotheecour commented 3 years ago

I don't think caseStmtMacros is supposed to be able to have custom logic for types that Nim natively supports for case (integers and strings I think)

that would be a very odd restriction; there are valid use cases for custom case for such types, eg: allowing runtime strings in branches, which builtin case doesn't support.

when defined case4:
  let val1 = "foo1"
  let val2 = "foo2"
  case "foo1".customCase
  of val1: discard
  of val2: discard
  else: discard

I don't even think the current behavior is limited enough to the point where you need it, you can do this:

this is too complicated, see https://gist.github.com/timotheecour/f7a847b0f0e9909efd6f7ee85420c4be for a working but ugly example using this, it requires a lot of silly boilerplate, eg:

when true:
   import macros
  from typetraits import distinctBase
  {.experimental: "caseStmtMacros".}

  type CustomCase1[T]= distinct T
  template customCase1[T](a: T): CustomCase1[T] = CustomCase1[T](a)
  proc `==`[T](a: CustomCase1[T], b: T): bool = a.distinctBase == b # {.borrow.} won't work
  # more proc's may need to be defined in more complex cases just to support `distinct` approach
  macro `case`(n: CustomCase1): untyped = ...

it also prevents conditionally importing symbols, eg via from foo import `case` since case would be the name shared by all case statement macros.

Instead with the proposal https://github.com/nim-lang/RFCs/issues/332#issuecomment-778106052, those problems are avoided and it's much simpler:

when true:
   import macros
  {.experimental: "caseStmtMacros".}
  macro customCase(n: CaseStmt): untyped = ...

CaseStmt by itself wouldn't work for overloading, you would need CaseStmt[T].

my modified proposal is as follows:

when true:
   import macros
  {.experimental: "caseStmtMacros".}
  macro customCase[T](a: T, b: CaseStmt): untyped = ...

then you have everything you want:

note

the same is true for ForLoopStmt, we should IMO allow this:

macro customFor[T](a: T, b: ForLoopStmt): untyped = ...

neither of what I've suggested is a breaking change btw.

metagn commented 3 years ago

That would be great for ForLoopStmt, assuming it allows multiple arguments (and enumerate would become enumerate(untyped, ForLoopStmt)), but here I don't see why you wouldn't just do:

macro customCase(a: string, branches: varargs[untyped]) = ...

customCase "foo1":
of val1: discard
of val2: discard
else: discard

You literally save characters by not having to type case. With ForLoopStmt you can use macros where they couldn't be used, but we have the facilities to mirror pretty much all of case syntax, and I don't see the need for being able to type the preceding case here. I think runtimeCase "foo1": is more expressive than case "foo1".runtimeCase or case runtime "foo1" or whatever. It could just be me and I could be missing where this would be desired, but I would really like to know any such place.

timotheecour commented 3 years ago

TIL about the fact you could do that, it's pretty interesting; this means that:

Note also that macro customCase(a: string, branches: varargs[untyped]) also is easier to grok, you have a clean separation bw the type (a in this case), and the branches; and you can specify in declaration whether to return untyped, or some type, or void

example 1

https://github.com/nim-lang/Nim/issues/8821 could potentially be fixed in library code instead of in the compiler (jsgen)

JS codegen can produce extreme switch statements with case a of range

(or, if we don't change the builtin, we can now at least provide a suitable library defined alternative that would fix this)

example 2

it'd allow accessing the value we're case-ing over, using custom logic defined in library code:

case foo.bar[2]:
of 1..2: echo caseVal # accesses `foo.bar[2]`
metagn commented 3 years ago

potentially, we don't even need builtin case anymore

Object variants. Not the prettiest part of Nim (though the idea is good), but it is meant to be tied to case. The only difference to regular case is it doesn't support strings.

That feature is "post-expression blocks", I don't know where I learned it from, in the documentation it's only in the grammar. On top of of and else, it accepts elif, except and do (even with params), and in 1.4.4+ finally. Sorry if I came off a bit rude, I didn't know it was this hidden.

haxscramper commented 3 years ago

case (as well as custom case macro) can be used as an expression, but customMatch .. of at the moment can't and fail parsing when of is indented (which is a valid way to write case ... of)

import std/[macros]
macro customCase(a: string, branches: varargs[untyped]) = newLit(12)

let res1 = case "foo1":
  of "12": 12
  of "13": 12
  else: 22

# expression '12' is of type 'int literal(12)' and has to be used (or discarded)
let res2 = customCase "foo1":
of val1: 12
of val2: 12
else: 22
# invalid indentation error
customCase "foo1":
  of val1: discard
  of val2: discard
  else: discard

If those two are fixed, I would prefer customCase too, instead of caseStmtMacros, but otherwise it would add too much inconsistency.

I think the original goal of this RFC

Rename case statement macro from match to case

is more than enough, though support for varargs[untyped] macros that can handle indented of (and multiple elif/else) branches, as well as used in expressions could be quite useful, but would require changing grammar slightly etc. etc., so probably out of the scope of this RFC.

timotheecour commented 3 years ago

case (as well as custom case macro) can be used as an expression, but customMatch .. of at the moment can't

that's not true, see below which works; you just need to add a return type to your macro, eg:

when true:
  import macros
  macro customCase(a: string, branches: varargs[untyped]): untyped = newLit(16)
  let res2 = customCase "foo1":
  of val1: 12
  of val2: 12
  else: 22
  doAssert res2 == 16

The only thing missing is allowing an indentation, which IMO is a minor point I can live without (idiomatic style is to not indent case..of), and in any case could be supported in future in the parser, if needed.

I think the original goal of this RFC ... is more than enough

my point is we can deprecate caseStmtMacros without any loss of functionality, already with the features we have today.

Araq commented 3 years ago

my point is we can deprecate caseStmtMacros without any loss of functionality, already with the features we have today.

Yes, and I knew this when I added caseStmtMacros, however:

metagn commented 3 years ago

Closing, issues for customCase being unable to use as expression, indented post-expr blocks, ForLoopStmt with type can be opened separately.

timotheecour commented 3 years ago

customCase being unable to use as expression

That's not true, see https://github.com/nim-lang/RFCs/issues/332#issuecomment-778539284 which has working code. The other point (customCase x is less idiomatic than case x) has more merit.

this discussion was prematurely closed. The current situation is not good:

proposal

we should do the proposal I gave earlier in https://github.com/nim-lang/RFCs/issues/332#issuecomment-778106052 which is better in all respects than the current caseStmtMacro

As mentioned earlier, it follows the ForLoopStmt design (with the modification described earlier and that can also apply to ForLoopStmt so that it can be overloaded by type)

metagn commented 3 years ago

fusion is still broken in devel because of this change

It's a 2 line fix. Anyone can do it.

macro `case`*(n: untyped): untyped =
  result = getAst(match(n))

If you're saying that there's a problem with match being renamed to case, then I don't know what to tell you, fusion not fixing it is fusion's problem. And fusions development cycle or whether or not we should get rid of case statement macros and add more syntax hooks are overgeneralizations. I think 18 comments are too many on an issue about changing a single string in the compiler.

timotheecour commented 3 years ago

It's a 2 line fix. Anyone can do it.

as i said above: fixing it for devel will break 1.4.x unless that (compiler) change is backported. If you don't backport the compiler change to 1.4 (or in fact all the way to 1.0), then you'll end up with fragmention: case (1,2) would work in nim >= 1.5.1 but not in nim < 1.5.1, and match (1,2) would work in nim < 1.5.1 but not nim >= 1.5.1. So it's not just a 2 line fix. But that's not my main point.

If you're saying that there's a problem with match being renamed to case,

that's not my point. My point is that this rename match => case is the wrong fix. The right fix is to support caseStmtMacro properly via implementing compiler support for macro customCase[T](a: T, b: CaseStmt): untyped = ..., in a way that does't leave this feature un-necessarily crippled.

metagn commented 3 years ago

match (1,2) would work in nim < 1.5.1 but not nim >= 1.5.1.

It would never work?

{.experimental: "caseStmtMacros".}

import macros

macro match(x: (int, int)): untyped =
  echo x.treeRepr

match (1, 2):
of (2, 3): discard
(8, 7) Error: type mismatch: got <(int, int), void>
but expected one of: 
macro match(x: (int, int)): untyped
  first type mismatch at position: 2
  extra argument given

expression: match (1, 2),
of (2, 3):
  discard
timotheecour commented 3 years ago

It would never work?

i meant:

so caseStmtMacros need to be wrapped so they work with both, as I did here https://github.com/nim-lang/fusion/pull/73 which should fix the fusion regression at least, but the real fix is IMO implementing macro customCase[T](a: T, b: CaseStmt): untyped =

metagn commented 1 year ago

To hopefully carry the discussion here at least a tiny bit more toward closure after all this time, this current working syntax wasn't mentioned:

match case a
of b: discard
else: discard

# with parentheses, maybe could be made more lenient in some places
x = match(case a
of b: 1
else: 2)
foo match(case a
of b: 1
else: 2)

# works with this syntax too
match case a:
  of b: discard
  else: discard

# can even add arguments, whatever this means
foo arg, case a
of b: discard
else: discard
let x = foo(arg, case a
of b: 1
else: 2)

This isn't good enough for standard pattern matching (for which the plans seem to have shifted to putting at language level anyway), but cheap custom match implementations can easily use this and not have to post expression block stuff.

The same logic can be applied to for loop macros, which is essentially the same as what sugar.collect does:

let a = [10, 20, 30]
var b: seq[(int, int)]
enumerate for i, x in a:
  b.add((i, x))
assert b == @[(0, 10), (1, 20), (2, 30)]

let c = "abcd"
var d: seq[(int, char)]
enumerate 97, for (i, x) in c:
  d.add((i, x))
enumerate(start = 97, for (i, x) in c:
  d.add((i, x)))
assert d == @[(97, 'a'), (98, 'b'), (99, 'c'), (100, 'd')]

unroll for i in 1..5:
  const j = i
  echo j

For the sake of the argument, assume no one thinks these are ugly and statement-syntax-as-expressions is considered standard (or there is some new, again not considered ugly, unambiguous syntax like enumerate(97) do for (i, x) in c:). These collective examples would intersect with the following features:

  1. For loop macros: IMO, as they are, this syntax is better, because it's at least honest that it doesn't compose iterators. I'm partial to the "expression type" idea in for loop macros though (it sets the ground for things like VarStmt[int]).

  2. Post-expression blocks: These aren't just used for of/elif/else, there's also do, lambda do, except/finally etc. I don't think these are harmful, and their intersection here is a logical continuation of what they are needed for otherwise. Probably not essential, but is too much trouble to remove.

  3. NOT case statement macros IMO, because even if it wasn't their original intended use, case statement macros are typed, while in this case they would have to be untyped. And I still think typed case statement macros are useful, for example with conditional compilation (i.e. type FakeInt64 = array[2, uint32]).

So stuff like case customMatch(x), CaseStmt etc would also be obsolete.

So I am personally rejecting whatever reason this RFC was reopened 2 times for, and hoping people are satisfied with what they have now, and if they're not they can make their own new RFC or work on language level pattern matching or whatever. Or bump/review RFCs and PRs with similar ideas like https://github.com/nim-lang/Nim/pull/19383.