Dealing with redundant var/non var overloads

metagn commented 3 years ago

Problem: There is potential for too many overloads that change 1 type in the arguments and the result type from non-var to var but have the exact same code. Example:

https://github.com/nim-lang/Nim/blob/7ef706fef98563619a2554b71044030f5410d057/lib/pure/collections/deques.nim#L111-L136

This is ugly and there are so many procs you could do this for. I recently came across a problem where I had to find the maximum element of an array and then modify it, and what do you know, the max proc for openarrays in system can very well be overloaded for var with no modifications to the code. (also this max/min for openarrays should not be in system, it's not polished enough and its behavior is too fragile i.e. for empty arrays to be so accessible)

Union types for this instance do not work no matter how you arrange it, because of the sheer amount of layers that var T types have to go through and none of the layers let them pass. The best solution for this is a macro at the proc definition.

Proposal: A macro in sugar or something that works like this:

proc max*[T](x: openarray[T]): T {.overloadVar: x.} =
  result = x[0]
  for i in 1..high(x):
    if result < x[i]: result = x[i]

var s = @[1, 2, 3, 4, 6]
dec max(s)
doAssert s == @[1, 2, 3, 4, 5]

The 1 argument to overloadVar is similar to the var T from container from the RFC that had it, a resulting var type can only originate from a single argument, and if that RFC passes this macro wouldn't be very hard to modify.

There is one problem, this hides information from docs, max clearly supports var openarray[T] but if you ctrl+F'd the docs page you wouldn't find it. Thankfully there is kind of a workaround; the macro could also support this syntax:

proc max*[T](x: var openarray[T] | openarray[T]): (var T) | T {.overloadVar: x.} =
  result = x[0]
  for i in 1..high(x):
    if result < x[i]: result = x[i]

Surprisingly, this is valid syntax and valid Nim without the macro pragma, but again, it returns always T and not var T. It could be misleading to people who see it, they might interpret a var T | T result type as an actual Nim feature and not realize that a macro transforms it. But at least no information is being hidden from them, they can at least see the pragma list. You could make the result type just T and the macro would still translate it accordingly, but I don't think it'd be much better.

I've written a naive implementation here:

```nim import macros proc toNonVar(t: NimNode): NimNode = result = t if t.kind == nnkVarTy: result = t[0] elif t.kind == nnkInfix and (t[0].eqIdent"|" or t[0].eqIdent"or"): let a = t[1] let b = t[2] let aIsVarTy = a.kind == nnkVarTy let bIsVarTy = b.kind == nnkVarTy if aIsVarTy and bIsVarTy: result = newTree(nnkInfix, t[0], a[0], b[0]) elif aIsVarTy xor bIsVarTy: let (vt, nonvt) = if aIsVarTy: (a, b) else: (b, a) if vt[0] == nonvt: result = nonvt else: error("union of var and non-var types are not the same", t) macro overloadVar*(arg, prc): untyped = let nonVarResult = toNonVar(prc[3][0]) var argIndex: int var nonVarArg: NimNode for i in 1..

Backwards compatibility: Since it's a macro, no backwards compatibility problems? I don't think it would be a problem if old procs in the standard library were changed to this, but their documentation would definitely take up 1 less proc definition.

mratsim commented 3 years ago

I use template fooImpl

template maxImpl[T](x: openarray[T]): T =
  result = x[0]
  for i in 1..high(x):
    if result < x[i]: result = x[i]

proc max*[T](x: openarray[T]): T =
  maxImpl(x)

proc max*[T](x: var openarray[T]): var T =
  maxImpl(x)

Araq commented 3 years ago

There is a far better solution for this on the horizon thanks to borrow checking. We should distinguish between:


proc `[]`(x: Container): var T # Note: No 'var Container' here!

And


proc mgetOrPut(x: var Container): var T

Then procs can allow for potential mutations by the caller but it's clear that the proc itself does not mutate. This models precisely how the builtin array/seq a[i] works, in a[i] = x the mutation is done by the assignment but a[i] allows for it by returning an address.

If the result of a var-T-proc is used for mutation, we check that the first argument to the proc was mutable too, so that things like let x = ...; p(x) = y remain to be invalid.

cooldome commented 3 years ago

Araq, proposed what I had in mind

metagn commented 3 years ago

Seems resolved? Doesn't seem like there'll be a problem with this in the future, closing

nim-lang / RFCs

Dealing with redundant var/non var overloads #259