Araq commented 4 years ago

This issue aims to be a more precise description about how Nim's scoping rules and overloading interact and also how syntactic constructions should be resolved. The current implementation does not follow these rules, but should unless experiments show how the rules outlined here can be improved in order to match the reality so that most Nim code out there continues to work as before.

Affected code in the compiler:

semexprs.nim
semcalls.nim
sigmatch.nim

Simple call

For a call expression f(args) first a set of candidates C(f) is created. Candidates are symbols in the current scope that are callable. A routine (proc, template, macro, iterator, etc.) is callable. Variables are callable if they have the type proc. Types are also considered callable.

Named parameters inside args lead to a further filtering step within overload solution. Then for C(f) the overload resolution is performed as described in the manual.

Module call

For a call expression module.f(args) first a set of candidates C(f) is created. Candidates are symbols in the current "module" that are callable. A routine (proc, template, macro, iterator, etc.) is callable. Variables are callable if they have the type proc.

Note: In a first analysis step it is to be decided whether module can be interpreted as a module identifier. Variables can shadow a module name.

Simple call with explicit generic instantiation

f[a](args) Candidate set as before, but further filtered against generic routines which have the right number of generic parameters. Only if this set is empty, it is interpreted as array indexing followed by the call notation ().

module.f[a](args): Analogous to the "Module call" case.

Object call

obj.f(args)

Condition: obj cannot be interpreted as a module identifier. Condition: obj.f does not have a field f or field f is not of type proc.

Rewrite rule: obj.f(args) --> f(obj, args)

Impossible: obj.f[T](args), instead the notation obj.f[:T](args) needs to be used. obj.f[:T](args) is rewritten to f[T](obj, args).

Object call without `()`

obj.f

Condition: obj cannot be interpreted as a module identifier. Condition: obj.f does not have a field f or field f is not of type proc.

Rewrite rule: obj.f --> f(obj)

Array indexing without `()`

x[args]:

x is only converted to a symchoice if there is not variable (or let or const) of this name.

Rule: Prefer generic instantiation if possible. Otherwise interpret the expression as [](x, args)

timotheecour commented 4 years ago

describing sigmatch precisely would require a dedicated page in the advanced section of the manual; there are so many rules (but they mostly all make sense, apart from a few eg optional params followed by untyped doesn't work)

[ ] > The current implementation does not follow these rules

can you summarize in top post the difference bw this spec and current implementation?

[ ] one thing missing is that (currently at least) overload resolution happens left to right and is greedy; this explains the whole comment in setops.contains:

proc contains*[T](x: set[T], y: T): bool {.magic: "InSet", noSideEffect.}
## [...] The parameters are in reverse order! ``a in b`` is a template for ``contains(b, a)``.
## This is because the unification algorithm that Nim uses for overload resolution works from left to right [...] If ``in`` had been declared as ``[T](elem: T, s: set[T])`` then ``T`` would have been bound to ``char``. But ``s`` is not compatible to type ``set[char]``! The solution is to bind ``T`` to ``range['a'..'z']``. This is achieved by reversing the parameters for ``contains``; ``in`` then passes its arguments in reverse order.

[ ] somewhat relevant to this is how current implementation implicitly resolves 0-arg templates as a call; IMO it's a design mistake that should be fixed, and it should require a pragma {.implicitcall.} eg:

template fun(): string = "foo"
doAssert fun() == "foo" # ok
doAssert fun == "foo" # should CT error
template fun2(): string {.implicitcall.} = "foo"
doAssert fun2() == "foo" # should CT error
doAssert fun2 == "foo" # ok

[ ] this description should mention {.experimental:implicitDeref|dotOperators|callOperator.} (also, right now the error msg is bad when dotOperators is enabled and a sigmatch error happens); iterators (living in a different namespace; IMO a design mistake); untyped (there are some design mistakes to fix here wrt overloading + optional params followed by untyped being unsupported)
[ ] this description should mention getters and setters, eg:
```
proc `foo=`(a: var A, b: B)
proc `foo`(a: A): B
```
and the subtle semantics involved when resolving a.foo in a module where both a getter/setter and a field foo exists

accidental ambiguity: generics vs array indexing

C++ templates can't be resolved at the parsing stage because of ambiguities bw < and << operators vs < as template
nim generics can't be resolved at parsing stage because of mod.foo[T] vs obj.foo[T], thus requiring [: special case
D doesn't have this ambiguity, and identifying D templates is resolved during parsing eg foo!T foo!(T1, T2). Ugly or not is subjective, but I think it's worth considering as it would simplify things; it doesn't have to be !

Araq commented 4 years ago

Ugly or not is subjective, but I think it's worth considering as it would simplify things; it doesn't have to be !

Agreed, but the syntax [: ] is not ambiguous either.

Araq commented 4 years ago

can you summarize in top post the difference bw this spec and current implementation?

Well the implementation does it in an undisciplined manner. See for example semexprs.shouldBeBracketExpr, or this whole section of code


  of nkCall, nkInfix, nkPrefix, nkPostfix, nkCommand, nkCallStrLit:
    # check if it is an expression macro:
    checkMinSonsLen(n, 1, c.config)
    #when defined(nimsuggest):
    #  if gIdeCmd == ideCon and c.config.m.trackPos == n.info: suggestExprNoCheck(c, n)
    let mode = if nfDotField in n.flags: {} else: {checkUndeclared}
    var s = qualifiedLookUp(c, n[0], mode)
    if s != nil:
      #if c.config.cmd == cmdPretty and n[0].kind == nkDotExpr:
      #  pretty.checkUse(n[0][1].info, s)
      case s.kind
      of skMacro, skTemplate:
        result = semDirectOp(c, n, flags)
      of skType:
        # XXX think about this more (``set`` procs)
        let ambig = contains(c.ambiguousSymbols, s.id)
        if not (n[0].kind in {nkClosedSymChoice, nkOpenSymChoice, nkIdent} and ambig) and n.len == 2:
          result = semConv(c, n)
        elif ambig and n.len == 1:
          errorUseQualifier(c, n.info, s)
        elif n.len == 1:
          result = semObjConstr(c, n, flags)
        elif s.magic == mNone: result = semDirectOp(c, n, flags)
        else: result = semMagic(c, n, s, flags)
      of skProc, skFunc, skMethod, skConverter, skIterator:
        if s.magic == mNone: result = semDirectOp(c, n, flags)
        else: result = semMagic(c, n, s, flags)
      else:
        #liMessage(n.info, warnUser, renderTree(n));
        result = semIndirectOp(c, n, flags)
    elif (n[0].kind == nkBracketExpr or shouldBeBracketExpr(n)) and
        isSymChoice(n[0][0]):
      # indirectOp can deal with explicit instantiations; the fixes
      # the 'newSeq[T](x)' bug
      setGenericParams(c, n[0])
      result = semDirectOp(c, n, flags)
    elif isSymChoice(n[0]) or nfDotField in n.flags:
      result = semDirectOp(c, n, flags)
    else:
      result = semIndirectOp(c, n, flags)

and then semIndirectOp also handles nkDotExpr (which is not an indirect call at all), and re-runs semExpr if semFieldAccess produces a nkDotCall. I'm not saying that the code is wrong (it surely is most ugly though), I'm saying there is no spec for it, no clear rules how this really works.

disruptek commented 4 years ago

Any reason not to start with the tests? Then we can hash out all the concerns and agree on desirable behaviors.

Araq commented 4 years ago

What do you mean by "start with the tests"? We have tests, they are green. They don't help us all that much to write the spec.

timotheecour commented 4 years ago

I've clarified what I meant above by untyped (there are some design mistakes to fix here wrt overloading + optional params followed by untyped being unsupported), see https://github.com/nim-lang/Nim/issues/14346

that's near the top of my wish list for sigmatch improvements.

nim-lang / compilerdev

Scoping and overload resolution #5

Simple call

Module call

Simple call with explicit generic instantiation

Object call

Object call without `()`

Array indexing without `()`

accidental ambiguity: generics vs array indexing

nim-lang / compilerdev

Scoping and overload resolution #5

Simple call

Module call

Simple call with explicit generic instantiation

Object call

Object call without ()

Array indexing without ()

accidental ambiguity: generics vs array indexing

Object call without `()`

Array indexing without `()`