Open Araq opened 4 years ago
describing `sigmatch` precisely would require a dedicated page in the advanced section of the manual; there are so many rules (but they mostly all make sense, apart from a few, e.g. optional params followed by `untyped` don't work)
can you summarize in the top post the difference between this spec and the current implementation?
[ ] one thing missing is that (currently at least) overload resolution happens left to right and is greedy; this explains the whole comment in `setops.contains`:

    proc contains*[T](x: set[T], y: T): bool {.magic: "InSet", noSideEffect.}
      ## [...] The parameters are in reverse order! ``a in b`` is a template for ``contains(b, a)``.
      ## This is because the unification algorithm that Nim uses for overload resolution works from left to right [...] If ``in`` had been declared as ``[T](elem: T, s: set[T])`` then ``T`` would have been bound to ``char``. But ``s`` is not compatible to type ``set[char]``! The solution is to bind ``T`` to ``range['a'..'z']``. This is achieved by reversing the parameters for ``contains``; ``in`` then passes its arguments in reverse order.
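
To make the quoted comment concrete, here is a small sketch based on the example that accompanies `contains` in the `system` documentation. It only shows the working direction (set first, element second); it does not try to reproduce the failing reversed declaration.

```nim
# A set over a subrange of char, as in the `contains` documentation.
var s: set[range['a'..'z']] = {'a'..'c'}

# The set is the first parameter, so T is bound to range['a'..'z'] first;
# the char literal is then matched against that already bound T.
doAssert s.contains('c')
doAssert contains(s, 'c')

# `in` and `notin` are templates that pass their operands to `contains`
# in reverse order.
doAssert 'b' in s
doAssert 'd' notin s
```

Because resolution is left to right and greedy, the order of the parameters decides which argument gets to bind `T`.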
[ ] somewhat relevant to this is how the current implementation implicitly resolves 0-arg templates as a call; IMO it's a design mistake that should be fixed, and it should require a pragma `{.implicitcall.}`, e.g.:

    template fun(): string = "foo"
    doAssert fun() == "foo" # ok
    doAssert fun == "foo"   # should CT error
    template fun2(): string {.implicitcall.} = "foo"
    doAssert fun2() == "foo" # should CT error
    doAssert fun2 == "foo"   # ok
[ ] this description should mention `{.experimental:implicitDeref|dotOperators|callOperator.}` (also, right now the error msg is bad when `dotOperators` is enabled and a sigmatch error happens); iterators (living in a different namespace; IMO a design mistake); `untyped` (there are some design mistakes to fix here wrt overloading + optional params followed by `untyped` being unsupported)
[ ] this description should mention getters and setters, e.g.:

    proc `foo=`(a: var A, b: B)
    proc `foo`(a: A): B

and the subtle semantics involved when resolving `a.foo` in a module where both a getter/setter and a field `foo` exist
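
A minimal sketch of that subtlety, assuming the behavior described in the manual's section on properties (the built-in field access is preferred when the field is accessible). The type `Socket` and the counter field `writes` are made up for illustration.

```nim
type
  Socket = object
    host: int    # field with the same name as the accessors below
    writes: int  # hypothetical counter, only used to observe setter calls

proc `host=`(s: var Socket, value: int) =
  # Inside this module `s.host` is the field, so this is not a recursive call.
  s.host = value
  inc s.writes

proc host(s: Socket): int =
  # Again plain field access, not a recursive call to `host`.
  s.host

var sock = Socket()

# The field is accessible in this module, so this is a plain field
# assignment and the setter is bypassed.
sock.host = 34
doAssert sock.host == 34
doAssert sock.writes == 0

# Calling the accessors explicitly goes through the procs.
`host=`(sock, 35)
doAssert host(sock) == 35
doAssert sock.writes == 1
```

From another module, where the field is not exported, `sock.host = 34` would resolve to the setter instead; that difference is what a spec would need to state.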
[...] `<` and `<<` operators vs `<` as template [...] `mod.foo[T]` vs `obj.foo[T]`, thus requiring the `[:` special case [...] `foo!T`, `foo!(T1, T2)`. Ugly or not is subjective, but I think it's worth considering as it would simplify things; it doesn't have to be `!`
> Ugly or not is subjective, but I think it's worth considering as it would simplify things; it doesn't have to be `!`

Agreed, but the syntax `[: ]` is not ambiguous either.
> can you summarize in top post the difference bw this spec and current implementation?

Well, the implementation does it in an undisciplined manner. See for example `semexprs.shouldBeBracketExpr`, or this whole section of code:

    of nkCall, nkInfix, nkPrefix, nkPostfix, nkCommand, nkCallStrLit:
      # check if it is an expression macro:
      checkMinSonsLen(n, 1, c.config)
      #when defined(nimsuggest):
      #  if gIdeCmd == ideCon and c.config.m.trackPos == n.info: suggestExprNoCheck(c, n)
      let mode = if nfDotField in n.flags: {} else: {checkUndeclared}
      var s = qualifiedLookUp(c, n[0], mode)
      if s != nil:
        #if c.config.cmd == cmdPretty and n[0].kind == nkDotExpr:
        #  pretty.checkUse(n[0][1].info, s)
        case s.kind
        of skMacro, skTemplate:
          result = semDirectOp(c, n, flags)
        of skType:
          # XXX think about this more (``set`` procs)
          let ambig = contains(c.ambiguousSymbols, s.id)
          if not (n[0].kind in {nkClosedSymChoice, nkOpenSymChoice, nkIdent} and ambig) and n.len == 2:
            result = semConv(c, n)
          elif ambig and n.len == 1:
            errorUseQualifier(c, n.info, s)
          elif n.len == 1:
            result = semObjConstr(c, n, flags)
          elif s.magic == mNone: result = semDirectOp(c, n, flags)
          else: result = semMagic(c, n, s, flags)
        of skProc, skFunc, skMethod, skConverter, skIterator:
          if s.magic == mNone: result = semDirectOp(c, n, flags)
          else: result = semMagic(c, n, s, flags)
        else:
          #liMessage(n.info, warnUser, renderTree(n));
          result = semIndirectOp(c, n, flags)
      elif (n[0].kind == nkBracketExpr or shouldBeBracketExpr(n)) and
          isSymChoice(n[0][0]):
        # indirectOp can deal with explicit instantiations; the fixes
        # the 'newSeq[T](x)' bug
        setGenericParams(c, n[0])
        result = semDirectOp(c, n, flags)
      elif isSymChoice(n[0]) or nfDotField in n.flags:
        result = semDirectOp(c, n, flags)
      else:
        result = semIndirectOp(c, n, flags)

and then `semIndirectOp` also handles `nkDotExpr` (which is not an indirect call at all), and re-runs `semExpr` if `semFieldAccess` produces a `nkDotCall`. I'm not saying that the code is wrong (it surely is most ugly though), I'm saying there is no spec for it, no clear rules how this really works.
Any reason not to start with the tests? Then we can hash out all the concerns and agree on desirable behaviors.
What do you mean by "start with the tests"? We have tests, they are green. They don't help us all that much to write the spec.
I've clarified what I meant above by "`untyped` (there are some design mistakes to fix here wrt overloading + optional params followed by `untyped` being unsupported)", see https://github.com/nim-lang/Nim/issues/14346; that's near the top of my wish list for `sigmatch` improvements.
This issue aims to be a more precise description of how Nim's scoping rules and overloading interact, and of how syntactic constructions should be resolved. The current implementation does not follow these rules, but it should, unless experiments show how the rules outlined here can be improved to match reality so that most Nim code out there continues to work as before.
Affected code in the compiler:
Simple call

For a call expression `f(args)`, first a set of candidates C(f) is created. Candidates are symbols in the current scope that are callable. A routine (proc, template, macro, iterator, etc.) is callable. Variables are callable if they have the type `proc`. Types are also considered callable.

Named parameters inside `args` lead to a further filtering step within overload resolution. Then for C(f) the overload resolution is performed as described in the manual.

Module call

For a call expression `module.f(args)`, first a set of candidates C(f) is created. Candidates are symbols in the current "module" that are callable. A routine (proc, template, macro, iterator, etc.) is callable. Variables are callable if they have the type `proc`.

Note: In a first analysis step it is to be decided whether `module` can be interpreted as a module identifier. Variables can shadow a module name.

Simple call with explicit generic instantiation

`f[a](args)`: Candidate set as before, but further filtered down to the generic routines that have the right number of generic parameters. Only if this set is empty is the expression interpreted as array indexing followed by the call notation `()`.

`module.f[a](args)`: Analogous to the "Module call" case.

Object call

`obj.f(args)`

Condition: `obj` cannot be interpreted as a module identifier.

Condition: `obj` does not have a field `f`, or field `f` is not of type `proc`.

Rewrite rule: `obj.f(args)` --> `f(obj, args)`

Impossible: `obj.f[T](args)`; instead the notation `obj.f[:T](args)` needs to be used. `obj.f[:T](args)` is rewritten to `f[T](obj, args)`.

Object call without `()`

`obj.f`

Condition: `obj` cannot be interpreted as a module identifier.

Condition: `obj` does not have a field `f`, or field `f` is not of type `proc`.

Rewrite rule: `obj.f` --> `f(obj)`
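
The call forms described so far can be sketched in one small program. This is only an illustration of the intended rules as they behave in current Nim for these simple cases; the names `twice`, `area`, `widen` and `Holder` are made up for the example.

```nim
import std/strutils

type
  Holder = object
    callback: proc (x: int): int {.nimcall.}  # a field of proc type

proc twice(x: int): int = x * 2
proc area(width, height: int): int = width * height
proc widen[T](x: int): T = T(x)

# Simple call: routines, variables of proc type and types are callable.
doAssert twice(21) == 42
let viaVar = twice
doAssert viaVar(21) == 42
doAssert int(2.0) == 2
doAssert area(height = 3, width = 2) == 6  # named parameters

# Module call: `strutils` is interpreted as a module identifier.
doAssert strutils.toUpperAscii("abc") == "ABC"

# Object call, with and without `()`.
let n = 21
doAssert n.twice() == 42  # n.twice() --> twice(n)
doAssert n.twice == 42    # n.twice   --> twice(n)

# A field of proc type takes precedence: no rewrite, the field is called.
let h = Holder(callback: twice)
doAssert h.callback(1) == 2

# Explicit generic instantiation in an object call needs `[:T]`.
doAssert n.widen[:float]() == 21.0  # --> widen[float](n)
```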

Array indexing without `()`

`x[args]`: `x` is only converted to a symchoice if there is no variable (or let or const) of this name.

Rule: Prefer generic instantiation if possible. Otherwise interpret the expression as `[](x, args)`.
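
A short sketch of the two readings of the bracket notation, under the assumption that the rule above matches what current Nim does in these unambiguous cases. `Grid`, `make`, `inc1` and `handlers` are hypothetical names used only here.

```nim
type
  Grid = object
    cells: seq[int]

# User-defined indexing operator.
proc `[]`(g: Grid, i: int): int = g.cells[i]

# Generic routine with one generic parameter.
proc make[T](n: int): seq[T] = newSeq[T](n)

proc inc1(x: int): int = x + 1

let g = Grid(cells: @[10, 20, 30])

# `g` is a variable, so `g[1]` is indexing, i.e. a call of `[]`.
doAssert g[1] == 20
doAssert `[]`(g, 1) == 20

# `make` is a generic routine: `make[int]` is an instantiation,
# followed by an ordinary call.
doAssert make[int](3) == @[0, 0, 0]

# `handlers` is a variable, not a generic routine, so `handlers[0](1)`
# is array indexing followed by the call notation `()`.
let handlers: array[1, proc (x: int): int {.nimcall.}] = [inc1]
doAssert handlers[0](1) == 2
```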