Overloadable enum names

beef331 commented 3 years ago

Following my dud of a PR https://github.com/nim-lang/Nim/pull/17897, making a RFC as suggested.

Given that the information is there, it seems sensible to pass the type information from the pertinent parts of statements to make the user not have to explictly specify the type so much. It is especially useful for enums, where you have many duplicate enum names and dont want to add the prefix notation (Treeform can attest it's not nice :smile: ).

The following is atleast some subset of what this type inference could enable.

type
  A {.pure.} = enum
    left, right, up, down
  B {.pure.} = enum
    left, right, up, down
  Test = object
    case kind: A
    of left:
      a: int
    of {right, down}:
      b: float
    else:
      c: string
  ASet = set[A]

let test = A.left
case test:
of left: discard
of up: discard
else: discard

var
  a: array[4, A] = [left, right, down, up]
  b: array[4, B] = [left, right, down, up]
  c: ASet = {left..right, down, up}
  d: A = left
  e: array[4, byte] = [255, 2, 3, 4]

treeform commented 3 years ago

I run into this issue all the time:

Its strange to me that most of the language has no prefixes, but for enums they are required.
Many web techs (HTML, CSS, SVG, CANVAS) have common words like "left", "right", "center". But when translated to nim the enums can't be translated 1:1 and have to use some thing like htmlLeft, cssLeft, svgLeft, canvasLeft ... its crazy!
While working on fidget I have the same problem. I have figet left, pixie left, typography left etc...
This happens with Node and parsing too, wile I have thrift nkBool, json nkBoo, sql nkBool etc...

I run into this issue all the time having to prefix everything and not having it match with the APIs I am communicating with. And the annoying thig is that nim could perfectly tell the enums apart. There is almost never any ambiguity, if its a html.align = left its clear I want the html's left and not the css left. It feels trivial. If I do want to differentiate I can always use the type CssAlign.left.

Thank you for putting this RFC together.

Araq commented 3 years ago

I don't see how prefixes are required for enums, it's an archaic convention in Nim's stdlib but apart from that, nobody asks you to use prefixes. I personally moved to non-pure UpperCased enum field names and it works well. Ymmv of course.

treeform commented 3 years ago

It appears that prefixes are required because of enum collision see here:

type  
  HAlignMode* = enum
    Default
    Left
    Center
    Right    

  VAlignMode* = enum
    Default
    Top
    Middle
    Bottom

Error: redefinition of 'Default'; previous declaration here: /usercode/in.nim(3, 5)

https://play.nim-lang.org/#ix=3lyx

I don't want to use {.pure.} because its basically a bigger prefix. Maybe its better than these weird prefix two letter codes? They feel like Hungarian notation which makes me always think WIN32 api.

beef331 commented 3 years ago

It's not that prefix is required but they're the more ergonomic method of writting enums in my view. Reducing the amount you have to specify the type. To replicate a possible collision, imagine the following were defined in different modules or packages.

  CssAlign {.pure.} = enum
    left, right, center
  Alignment {.pure.} = enum
    left, right, center
  Justify {.pure.} = enum
    left, right, center

Presently we need to do CssAlign.left or whatever the enum type we want so for example

let yourVal = CssAlign.left
case yourVal:
of CssAlign.left: ...
of CssAlign.right: ...
of CssAlign.center: ...

In this example we had to declare the enum's type 4 sperate times, so if we use caLeft we save the 4 types, but they're pratically included in the name instead of a dot operator. Which seems more ergonomic and less redundant than to use Enum.val. Though to me the prefix notation is akin to doing u8 everywhere you have a literal that can be inferred to be a byte. For instance being forced to do

case 10u8
of 100u8..200u8: discard
of 254u8..255u8: discard
else: discard

instead

case 10u8
of 100..200: discard
of 254..255: discard
else: discard

saem commented 3 years ago

It's not just here which is simply a better experience for folks, which is a good thing.

The technique of propagating expected type information can be used for additional improvements in inference/disambiguation, this is a very interesting direction. It also seems to be a good vector for bringing in new folks to the compiler.

As long as there is a principle(s) for the direction in which information the type information propagates (which this doesn't violate existing rules), then this and other such improvements can move forward.

Araq commented 3 years ago

Here is how it can work:

Enum field names are overloadable much like routines. When an overloaded enum field is used, produces a nkClosedSymChoice construct, usually written as (E|E)
During overload resolution the right E is picked, if possible.
For (array/object...) constructors the right E is picked, comparable to how [byte(1), 2, 3] works, you would need to use [T.E, E2, E3].
Ambiguous enum fields produce an error.

treeform commented 3 years ago

Another example from my pretty print library:

Current way, Hungarian notation very C/win32 like.

  case node.kind:
    of nkNumber, nkBool:
      printStr(fgBlue, node.value)
    of nkRepeat, nkNil, nkPointer:
      printStr(fgRed, node.value)
    of nkProc:
      printStr(fgMagenta, node.value)
    of nkString:
      printStr(fgGreen, node.value.escapeString())
    of nkChar:
      printStr(fgGreen, "'" & node.value.escapeString()[1..^2] & "'")
    of nkSeq, nkArray:

I want: fees very clean and Nim like.

  case node.kind:
    of Number, Bool:
      printStr(Blue, node.value)
    of Repeat, NilPointer, Pointer:
      printStr(Red, node.value)
    of Procedure:
      printStr(Magenta, node.value)
    of String:
      printStr(Green, node.value.escapeString())
    of Char:
      printStr(Green, "'" & node.value.escapeString()[1..^2] & "'")
    of Seq, Array:

I don't want, feels very bureaucratic/java like, tons of repeating names hiding the code:

  case node.kind:
    of PrintNodeKind.Number, PrintNodeKind.Bool:
      printStr(ForegroundColor.Blue, node.value)
    of PrintNodeKind.Repeat, PrintNodeKind.Nil, PrintNodeKind.Pointer:
      printStr(ForegroundColor.Red, node.value)
    of PrintNodeKind.Proc:
      printStr(ForegroundColor.Magenta, node.value)
    of PrintNodeKind.String:
      printStr(ForegroundColor.Green, node.value.escapeString())
    of PrintNodeKind.Char:
      printStr(ForegroundColor.Green, "'" & node.value.escapeString()[1..^2] & "'")
    of PrintNodeKind.Seq, PrintNodeKind.Array:

treeform commented 3 years ago

b3liever proposed this code is fine :

const
  pagesToCstring: array[PageMode, cstring] = mapLiterals([
    PageMode.none: "home",
    PageMode.home: "home",
    PageMode.about: "about-us",
    PageMode.media: "media-kit",
    PageMode.help: "help",
    PageMode.cookies: "cookies",
    PageMode.privacy: "privacy-policy",
    PageMode.terms: "user-terms",
    PageMode.account: "account",
    PageMode.cart: "cart",
    PageMode.checkout: "checkout",
    PageMode.error404: "error404",
  ], cstring)

(From https://forum.nim-lang.org/t/7983#50950)

I think this code is better:

const
  pagesToCstring: array[PageMode, cstring] = mapLiterals([
    None: "home",
    Home: "home",
    About: "about-us",
    Media: "media-kit",
    Help: "help",
    Cookies: "cookies",
    Privacy: "privacy-policy",
    Terms: "user-terms",
    Account: "account",
    Cart: "cart",
    Checkout: "checkout",
    Error404: "error404",
  ], cstring)

In Nim we rely on pretty type inference we should allow it to be used in more places.

al6x commented 3 years ago

+1 This improvement is very much needed.

Having to type CssAlign.left instead of just left is very inconvenient. I have enum name conflicts all the time and had to use prefixes.

P.S.

I think this code is even better, it's TypeScript using LiteralType, and the correct enum names usage will be validated at compile time, but I'm not willing to push Nim enum that far, just the ability to resolve enum names conflict would be good enough :)

type Pages = "home" | "about-us" | "media-kit" | "help" | 
  "cookies" | "privacy-policy" | "user-terms" | "account" | "cart" | 
  "checkout" | "error404"

bpr commented 3 years ago

Some approaches from other languages: Ada: enumeration literals are overloadable. I think this is what Araq is proposing. Rust: treats the enum as being a scope and uses the use directive to open it. OCaml: has both solutions, scopes and a different "overloadable" enum-ish thing called polymorphic variants. When I used OCaml I found that programmers overused polymorphic variants to get the benefits of overloading.

Araq commented 3 years ago

Ada: enumeration literals are overloadable. I think this is what Araq is proposing.

Correct as I cannot think of a more natural solution for Nim.

bpr commented 3 years ago

Any of the approaches would allow @treeform to write the cleaner version he prefers. I agree that overloading is the natural approach for Nim, as the Rust approach would require some new local scoping mechanisms.

beef331 commented 3 years ago

Another place this could be useful is in expressions, take the following into consideration.

proc bar = discard

proc foo: proc() =
  result = proc() =
    discard

proc doSomething: proc() =
  case true:
  of true: bar
  of false: foo()

Presently this errors at of false: foo() due to it being a closure and bar not being one as the case statement is looking for proc(). Given the information that the expression needs to be a proc(){.closure.} the error would be moved to the correct line which is the of true: bar there is no converter or way for it to presently be turned into the correct return type.

Araq commented 3 years ago

@beef331 This is true but completely unrelated to the enum situation. This RFC should be split up into different parts. This RFC should be called "Overloadable enum names"

treeform commented 3 years ago

I think this is same thing: https://github.com/nim-lang/RFCs/issues/8

treeform commented 3 years ago

We run into exact same in again Pixie: We want to mimic how JS canvas API works, but it has "round" for both lineJoin and lineCap... We have to resort to ugly win32 inspired Hungarian notion ljRound and lcRound. I wish we could just use Round for both enums and compiler can just infer a little more than it does now.

al6x commented 3 years ago

I finally found more or less nice and clean way to use enums, with json-like helper. It solves 5 issues of Nim.

1) Nice and clean names for enum values, like "string". 2) No conflicts with build-in language keywords for enum values, like "string". 3) No conflict with built-in language keywords for key names, like type. 4) Dealing with optional values, no need for .some 4) Auto-casting nested objects, if you use converters to autocast strings to enums (and other non-enum converters too), it won't work for nested objects, but it will work here.

This example has larger scope than this issue focussed on enums, but I would like to highlight the wole use case. Why these features needed.

plot "/test_table.json", rows, jo {
  columns: [
    { id: "name", type: "string" }, # <= "string" going to be casted to enum
                                    #    and reserved word "string" used for enum value
    { id: "age",  type: "number" }  # <= reserved word "type" used for key name
  ]
}

type PlotDataType* {.pure.} = enum
  string_e  = "string",
  number_e  = "number",
  boolean_e = "boolean",
  unknown_e = "unknown"

type PlotTableColumn* = object
  id*:      string
  `type`*:  PlotDataType

proc plot*[Row](path: string, rows: seq[Row], options: TableOptions): void =
  discard

Code for jo helper

P.S.

I didn't invented this example just to point to those Nim issue. This is real code that I'm converting from some JavaScript table/plotting library.

Araq commented 3 years ago

This has been implemented.

derekdai commented 3 years ago

I tried the example in #8, it works

type
  Enum1 = enum
    A, B, C
  Enum2 = enum
    A, Z

proc f(e: Enum1): int = ord(e)
proc g(e: Enum2): int = ord(e)

proc h(e: Enum1): int = ord(e)
proc h(e: Enum2): int = ord(e)

let fA = f(A) # Type of A is well defined
let gA = g(A) # Same as above

let hA1 = h(Enum1.A) # A requires disambiguation
let hA2 = h(Enum2.A) # Similarly
let x = ord(Enum1.A) # Also

But won't work with case of

{.experimental: "overloadableEnums".}

type
  A {.pure.} = enum
    left, right, up, down
  B {.pure.} = enum
    left, right, up, down

let test = A.left
case test:
of left: discard
of up: discard
else: discard

Error: ambiguous identifier: 'left' -- use one of the following:
  A.left: A
  B.left: B

nim version

$ nim --version
Nim Compiler Version 1.5.1 [Linux: amd64]
Compiled at 2021-10-02
Copyright (c) 2006-2021 by Andreas Rumpf

git hash: 5c4692fad4c807a010d58d553e92360128699197
active boot switches: -d:release

beef331 commented 3 years ago

@derekdai pure still forces ambigious enums into a namespace, so of course it will not work, overridable enums is a replacement to pure.

al6x commented 2 years ago

It seems like enum overload works for parameter and assignment but not for initialisation? play

{.experimental: "overloadableEnums".}

type
  E1 = enum a, b
  E2 = enum a, b

proc test(e: E1) = echo e
test(a)        # Works, parameter

var v1 = E1.a
v1 = a         # Works, assignment

let v2: E1 = a # Error?

derekdai commented 2 years ago

@al6x top-down type inference solve this elegantly.

nim-lang / RFCs

Overloadable enum names #373