nim-lang / RFCs

A repository for your Nim proposals.

Unify function arguments and tuples #357

Closed n0bra1n3r closed 3 years ago

n0bra1n3r commented 3 years ago

What if the following were equivalent:

callProc(1, "2", 3.0)
callProc (1, "2", 3.0) # notice the space after `callProc`

The difference right now is that the first one calls a function with int, string, and float parameters, and the second calls a function with a tuple containing values of int, string, and float.

This proposal is basically this: What if function parameters are also tuples?

This takes some inspiration from Dlang's compile-time sequences.

I am not good enough to actually implement a PoC for this, and this is probably a huge change if it were to be implemented, so just putting this out there in case it is a reasonably good idea.

Usecases

A typical usecase would be to pass variadic arguments with different types.

proc variadicFn(args: tuple) = ...

variadicFn(1, "2", 3.0)

One could also pass named variadic arguments.

variadicFn(a1: 1, a2: "2", a3: 3.0) # ! may conflict with object initialization syntax

Note that one could mix variadic parameters with positional ones, the same as is possible today with varargs.

proc variadicFn(args: tuple, defaultArg: bool = false) = ...

variadicFn(a1: 1, a2: "2", a3: 3.0)
variadicFn(a1: 1, a2: "2", defaultArg=false)

In the next usecase, one can define what is essentially a variadic function with default arguments.

proc defaultArgFn(args: tuple = (1, "2", 3.0)) = ...

defaultArgFn()
defaultArgFn(3, 6, 9)

In the following usecase, one can define a function that returns a tuple, which can then be used to pass arguments to any function that has an appropriate parameter list.

proc calculateArgs(): (int, string, float) = ...

proc randomFn(a1: int, a2: string, a3: float) = ...

calculateArgs().randomFn # UFCS syntax could be allowed

This could also be used with C interop; it is functionally equivalent to C-style varargs.

proc variadicCFn(args: tuple) {.importc:"$1(@)".}

There could potentially be some interaction with concepts as well.

type
  NewConcept1 = concept
    proc fn(s: Self, a: int, b: float)

  NewConcept2 = concept
    proc fn(s: Self, c: string, b: float)

type
  ConformingObject = object

proc fn(s: ConformingObject, args: tuple) =
  for arg in args.fields:
    echo $arg

proc accepts1(obj: NewConcept1) =
  obj.fn(a: 1, b: 2.0) # ! slightly strange call syntax...
  # obj.fn(args=(1, 2.0)) # error

proc accepts2(obj: NewConcept2) =
  obj.fn("3", 2.0)

accepts1(ConformingObject())
accepts2(ConformingObject())

Variant initializer procs with compile-time checking:

import std/macros

type
  VariantKind = enum
    vkA,
    vkB

  Variant = object
    case kind: VariantKind
    of vkA:
      a, b, c: int
    of vkB:
      d, e: int
      f: string

macro apply(obj: var Variant, args: tuple) =
  template doAssignment(sym, args, field) =
    sym . field = args . field

  result = newStmtList()
  for exp in args.getTypeImpl:
    result.add(getAst doAssignment(obj, args, exp[0]))

proc initVariant(args: tuple): Variant =
  apply(result, args)

let variant = initVariant(kind: vkA, a: 1, b: 2, c: 3)
# let variant = initVariant(kind: vkA, a: 1, b: 2, e: 3) # error

Custom spread operators:

type Object = object
  a: int
  b: float

proc `...`(self: Object): tuple[a: int, b: float] =
  (self.a, self.b)

proc fn(a: int, b: float) = ...

let obj = Object(a: 1, b: 2.0)
fn(...obj)

Araq commented 3 years ago

Intriguing. But you need to flesh out the use cases more and please don't focus on today's varargs shortcomings, these are known and should be addressed by a different RFC.

n0bra1n3r commented 3 years ago

Got it, thanks for the encouragement @Araq. I'll flesh this out more over the next few days.

konsumlamm commented 3 years ago

What if the following were equivalent:

callProc(1, "2", 3.0)
callProc (1, "2", 3.0) # notice the space after `callProc`

This could lead to many ambiguities with overloading. Simple example:

proc callProc(x: (int, string, float)) = ...
proc callProc(x: int, y: string, z: float) = ...

And making them the same or changing overload resolution depending on whether there is whitespace after the function is a pretty bad solution imo.

I am not good enough to actually implement a PoC for this, and this is probably a huge change if it were to be implemented, so just putting this out there in case it is a reasonably good idea.

I'm not sure, but I definitely feel like this would break some code and in general just seems like a really bad idea imo.

A typical usecase would be to pass variadic arguments with different types.

Remember that Nim is statically typed, so all the types must be known at compile time (no, RTTI would not be a good solution). So "I can pass variadic arguments with different types" in itself is not a compelling argument, what's an actual usecase of this? Given that usually you want some common behaviour on the variadic arguments (either by making them the same type or by implementing a common interface), I think it would be a better idea to support concept varargs (varargs[SomeConcept]) or something similar.

In the following usecase, one can define a function that returns a tuple, which can then be used to pass arguments to any function that has an appropriate parameter list.

proc calculateArgs(): (int, string, float) = ...

proc randomFn(a1: int, a2: string, a3: float) = ...

calculateArgs().randomFn # UFCS syntax could be allowed

This seems like it should rather be implemented as a macro, I imagine this behaviour to be quite confusing in the wild. (Or just make randomFn take a tuple as an argument...)

And probably the most controversial/crazy/unreasonable usecase would be to overload object initialization, allowing something like constructors.

type
  Object = object
    a: int
    b: float

proc Object(values: tuple[a: int, b: float]): Object =
  var obj = Object()
  obj.a = values.a
  obj.b = values.b
  # call some procs

var obj = Object(a: 1, b: 2.0)

You can already use named arguments, which look nicer (imo) anyway:

proc initObject(a: int, b: float): Object =
  result.a = a
  result.b = b
  # call some procs

var obj = initObject(a = 1, b = 2.0)

n0bra1n3r commented 3 years ago

And making them the same or changing overload resolution depending on whether there is whitespace after the function is a pretty bad solution imo.

I don't understand this comment, so want to clarify: the first sentence is showing what can be done in Nim today; it's the status quo, e.g. this works:

proc fn(args: tuple) = discard

fn (1, 2, "3")

In your example, overload resolution doesn't have to be an issue, those two signatures could be made ambiguous by the compiler, which is what I'm proposing.

I'm not sure, but I definitely feel like this would break some code

I also think it would. The one case I can think of so far is your example though, and that pattern doesn't look like it would be used too often in the wild in my naive opinion.

Remember that Nim is statically typed, so all the types must be known at compile time

Almost every language I know that is statically typed (mostly the C family) that supports variadic arguments can have variadic arguments of different types. I have found many usecases for this specific capability in those languages and have taken it for granted (perhaps I should add the ones that apply to Nim to the proposal). It can be used anywhere where it makes sense to accept non-homogeneous sequences as parameters. In saying that, I like your idea of supporting concepts with varargs, or supporting typeclasses in general with varargs. varargs[auto] support comes to mind, which would make this proposal useless.

This seems like it should rather be implemented as a macro

True, and this is probably a bad example. I guess where I was going with this was that you could compute parameters for functions you don't own/maintain without wrapping them. I don't use this pattern myself, just wanted to show what could be done.

You can already use named arguments

Yes you can, and this one was a pretty wild idea; I don't think you can actually declare a proc with the same name as a type in Nim like I did in the example. What I wanted to show here was that object initialization could be overloaded in a way similar to constructors in C++, so you can enforce computation of the value of an object's fields at runtime when an object is initialized, instead of assuming people will use the initializer proc that you provide.

konsumlamm commented 3 years ago

And making them the same or changing overload resolution depending on whether there is whitespace after the function is a pretty bad solution imo.

I don't understand this comment, so want to clarify: the first sentence is showing what can be done in Nim today; it's the status quo, e.g. this works:

proc fn(args: tuple) = discard

fn (1, 2, "3")

I know, let me clarify. Right now, this works:

proc callProc(x: (int, string, float)) = echo "first"
proc callProc(x: int, y: string, z: float) = echo "second"

callProc (1, "2", 3.0) # => first
callProc(1, "2", 3.0) # => second

With your proposal, I see two options:


That being said, an alternative I would prefer is having an explicit spread operator (that could possibly be implemented as a macro), similar to Python's * and **, though I'm not sure if that's a good idea either. It would allow (using Python's syntax)

proc fn(x, y, z: int) = discard

fn(*(a, b, c))            # -> fn(a, b, c)

var someTuple: (int, int, int)
fn(*someTuple)            # -> fn(someTuple[0], someTuple[1], someTuple[2])

fn(**(x: a, y: b, z: c))  # -> fn(x = a, y = b, z = c)

var someNamedTuple: tuple[x, y, z: int]
fn(**someNamedTuple)      # -> fn(x = someNamedTuple.x, y = someNamedTuple.y, z = someNamedTuple.z)

This would work with normal as well as vararg functions (varargs would still need to be extended to allow variadic generics and keyword varargs for your examples, but that'd be another RFC, my suggestion is purely syntactical). An advantage would be that it is explicit when tuples get unpacked:

callProc *(1, 2, 3) # takes varargs
callProc(1, 2, 3) # takes varargs
callProc (1, 2, 3) # takes tuple

However, I'd be careful with something like (inspired by Python's syntax)

# variadic generic varargs
proc tupleArgs(args: *tuple) = discard

# variadic generic keyword varargs
proc namedTupleArgs(args: **tuple) = discard

since that would supersede the existing varargs and would probably also bloat the executable size, since a new version of the function would be created for every amount of arguments (unlike for varargs which uses arrays instead of tuples). Instead, extending the existing varargs (or Nim in general) to support variadic generics and/or keyword arguments seems to be the better option to me. But at that point, why not just pass tuples as arguments (it's only two extra characters)?

n0bra1n3r commented 3 years ago

Jeez @konsumlamm, you're making so much sense. I like your proposal even better. varargs with typeclasses and those Python-style kwargs look even better to me, and would satisfy all my usecases. Plus they're purely additive changes, so no breakage of existing code.

I don't want to pollute the RFC issue pile, but I want to keep this open to keep these usecases documented for a future accepted RFC.

Araq commented 3 years ago

Ah so it is all about varargs, good to know. ;-)

n0bra1n3r commented 3 years ago

Yes, for my specific usecases it is. Unfortunately I could not come up with more general usecases, but I do know that Dlang has a concept similar to this (compile-time sequences), for better or for worse. Nonetheless I will continue updating this with the usecases I can come up with, even if just to document and help find an ideal solution. I don't really care if this specific RFC gets accepted if those who know better than me find a better way to do things.

timotheecour commented 3 years ago

conflating f(a,b) and f((a,b)) is bad (breaks valid use cases), RFC description should be updated to avoid mentioning that.

spread operator

the spread operator should be writeable by a macro, eg:

macro spread(..)
let t = (a,b)
f(t.spread) or f.spread(t) # one or both of those should be doable
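
A minimal sketch of the f.spread(t) form, assuming a hypothetical macro named spread (it is not part of std). It binds the tuple to a temporary so the expression is evaluated only once, then passes each field as a separate argument. The procs calculateArgs and randomFn are borrowed from the RFC description and given illustrative bodies here.

```nim
import std/macros

macro spread(f: untyped, t: typed): untyped =
  ## Hypothetical helper: rewrites `f.spread(t)` into
  ## `let tmp = t; f(tmp[0], tmp[1], ...)`.
  let tmp = genSym(nskLet, "args")
  let fieldCount = t.getTypeImpl.len  # one child per tuple field
  var call = newCall(f)
  for i in 0 ..< fieldCount:
    call.add newTree(nnkBracketExpr, tmp, newLit(i))
  result = newStmtList(newLetStmt(tmp, t), call)

# Illustrative bodies; the RFC leaves these procs unspecified.
proc calculateArgs(): (int, string, float) = (1, "2", 3.0)
proc randomFn(a1: int, a2: string, a3: float): string = $a1 & a2 & $a3

echo randomFn.spread(calculateArgs())
```

This covers only the whole-call form; spreading one tuple among other arguments, as in f(t.spread, x), is not handled by this sketch.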

C style variadic

nim supports C varargs for importc procs:

proc c_printf*(frmt: cstring): cint {.importc: "printf", header: "<stdio.h>", varargs, discardable.}

but IIRC doesn't expose a way to write functions which accept varargs in the C sense, like in printf; they have their use (eg when you need to export a symbol in a DLL), even if it's usually not what you want.

this could be solved by porting https://en.cppreference.com/w/c/variadic

variadic with runtime type info

we could do the same as D for this, see https://dlang.org/articles/variadic-function-templates.html, in particular the section "The D Look Ma No Templates Solution"; see also https://dlang.org/spec/function.html#variadic for more details.

https://dlang.org/spec/function.html#d_style_variadic_functions:

Two hidden arguments are passed to the function: void* _argptr and TypeInfo[] _arguments

again, there are use cases (exporting a symbol to a DLL; can be very useful in a debugger, for example)

single type variadic

nim has proc fn[T](a: varargs[T]) which allows variadic number of args of a fixed type
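
A small sketch of that single-type form; sumAll is a hypothetical name chosen for illustration. Every argument must have the same type T, which is inferred at the call site.

```nim
proc sumAll[T](xs: varargs[T]): T =
  ## Adds up any number of arguments, all of the same type T.
  for x in xs:
    result += x

echo sumAll(1, 2, 3)
echo sumAll(1.5, 2.5)
```

Mixing types in one call, for example an int and a string, is rejected at compile time, which is the limitation the rest of this thread is about.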

generic variadics

D and C++ also have something nim doesn't have:

// in D:
void print(A...)(A a){
  foreach(t; a) writeln(t);
}

the closest thing in nim would be a macro or template taking varargs[typed] but that's obviously not the same. there's also this:

proc fn(a: tuple)

but the semantics are also different, in particular for the way arguments are passed in codegen
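
For comparison, a sketch of the proc fn(a: tuple) form as it works today; describe is a hypothetical name. The tuple typeclass makes the proc implicitly generic, so it is instantiated once per tuple type, and the caller has to wrap the arguments in an explicit tuple.

```nim
proc describe(args: tuple): seq[string] =
  ## Stringifies each field of a tuple of any shape.
  for value in args.fields:
    result.add $value

echo describe((1, "2", 3.0))
echo describe (true, 'c')  # command syntax: the space makes this a tuple argument
```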

n0bra1n3r commented 3 years ago

Thanks for this @timotheecour. I don't really understand your post though. Are these usecases related to this RFC, or are they things that shouldn't be in the RFC? The post seems to be about both? But your first statement pretty much invalidates the thesis of my proposal. Are you in support of implementing variadic generic parameters like D/C++/Typescript instead of my proposal? IMO implementing that would serve the same purpose as mine and is probably a better solution, but its semantics are much more complicated I think, so I would probably be a bad candidate to write an RFC for it.

About the spread operator, I've tried to implement f(t.spread), but I couldn't find a way to do that with current Nim. I believe there is already a macro in std that implements something like f.spread(t). However that doesn't allow some things that can be done with spread operators like f(...t1, t2) without making a mini-DSL for parameters.

Araq commented 3 years ago

What's wrong with varargs[typed] and then rewriting it inside a macro to whatever you need? It's a superior design IMHO.
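
A minimal sketch of that approach, assuming a hypothetical macro named stringifyAll: it accepts arguments of different types through varargs[typed] and rewrites the call into a seq of their string forms.

```nim
import std/macros

macro stringifyAll(args: varargs[typed]): untyped =
  ## Hypothetical example: expands `stringifyAll(a, b, c)` to `@[$a, $b, $c]`.
  var bracket = newNimNode(nnkBracket)
  for arg in args:
    bracket.add newCall(ident"$", arg)
  result = prefix(bracket, "@")

echo stringifyAll(1, "2", 3.0)
```

The rewriting happens at compile time, so each argument keeps its static type; the cost is that the entry point has to be a macro or template rather than a plain proc.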

n0bra1n3r commented 3 years ago

Honestly all I really wanted was a way to not drop down to templates/macros to take advantage of variadic parameters, since I find myself using them quite often. I do not have particularly strong feelings about this though. I thought bringing this idea up might pave a way to a solution that simplifies varargs, or spark some other ideas that might help fix some issues related to them, but obviously language design is way over my head.

Anyway I've come up with a nice solution to solve my immediate needs using macros, which allows me to not have to use macros for variadic functions. Just sharing here before I close this RFC:

import std/macros

macro `<-`*(lhs: untyped, rhs: typed): untyped = # syntax from ElegantBeef on discord
  result = lhs
  for child in rhs: # implement support for named parameters, etc
    result.add(child)

Usage is similar to a spread operator:

proc fn(a, b, c: int) = discard
fn() <- (1, 2, 3)
fn(a=1) <- [2, 3] # pretty cool IMO
# can also be used for anything that needs to copy children of one NimNode to another, like:
discard [] <- (1, 2, 3) # produces [1, 2, 3]

Or if you don't like that syntax, you can wrap it in a template, use with dotOperators, etc:

template fnWrapped(args: varargs[typed]) =
  fn() <- args

fnWrapped(1, 2, 3)

Thanks guys for the incredible feedback on this RFC! Hopefully I can improve and write better ones later on.

timotheecour commented 3 years ago

What's wrong with varargs[typed] and then rewriting it inside a macro to whatever you need?

Then there's the type-info based approach; while usually not what you want, it has its use cases, eg: