nim-lang / RFCs


Proposal for a `dotOperator` replacement #372

Open n0bra1n3r opened 3 years ago

n0bra1n3r commented 3 years ago

Description

Illustration

import std/macros

type Proxy = object

macro `=call`(obj: Proxy, op: untyped, args: varargs[typed]) =
  echo op.repr
  # implementation
  ...

let proxy = Proxy()

discard proxy.test # prints 'test'
proxy.test = 1 # prints 'test='
proxy.test += 1 # prints 'test+='
test(proxy, 1) # prints 'test'
discard proxy + 1 # prints '+'

Rationale

Definition of Terms

I use the term "routine" here to mean anything that's callable, including procs, operators, templates, and macros. I use the term "procedure" to refer to anything that has a function pointer, such as procs, funcs, and iterators.

Usecases

Interop

type CppObj {.importcpp, header: "CppObj.h".} = object

template `=call`(obj: CppObj, op: untyped) =
  proc op(_: CppObj) {.importcpp.}

let cppObj = CppObj()
cppObj.cppMethod()

Note that this technique can also be used to call arbitrary functions from DLLs, allowing runtime interop and custom REPL-like functionality.

Field swizzling (GLSL-like field access)

import std/macros

type Vector3 = object
  x, y, z: float
type Vector2 = object
  x, y: float

macro `=call`(vec: Vector3, op: untyped) =
  let obj = nnkObjConstr.newTree(ident("Vector" & $len($op)))
  let fields = vec.getType[2]

  for i, field in $op:
    obj.add(newColonExpr(fields[i], newDotExpr(vec, ident($field))))

  template makeProc(op, obj) =
    proc op(vec: Vector3): auto = obj

  result = getAst makeProc(op, obj)

let vec = Vector3(x: 1, y: 2, z: 3)

echo vec.zy # prints '(x: 3.0, y: 2.0)'
echo vec.yzx # prints '(x: 2.0, y: 3.0, z: 1.0)'

This is a common usecase in shader- and graphics-related math.

Object proxies

type Proxy[T] = object
  obj: T

template `=call`[T](proxy: Proxy[T], op: untyped): untyped =
  proc op(p: Proxy[T]): auto {.inline.} =
    # intercept field access and/or procedure calls
    ...
    result = op(p.obj)

let proxy = Proxy[string](obj: "Hello")

echo $proxy # prints 'Hello'

This usecase could lessen the need for converters, since most operations can be forwarded through a proxy to the underlying object. This could also allow things like mapping field access to entity component access in an ECS.

Remote function invocation

type Network = object

template `=call`(obj: Network, op: untyped) =
  # perform operation to execute remote function or API
  ...

doStuffOverNetwork(Network())

This could be used to access arbitrary columns in a remote database, for example, or to execute an API with a nice syntax.

Postfix-like operators

type Obj = object

template `=call`(obj: Obj, op: untyped) =
  when astToStr(op)[^1] == '?':
    # generate proc to do nullable stuff
    ...
  else:
    ...

let obj = Obj()
let valueOrNil = obj.optionalField?.field

This allows e.g. Swift-like optional syntax without special-casing `?.` to have the same precedence as `.`.

Extensions to computed properties

type Obj = object

template `=call`(obj: Obj, op: untyped, arg: int) =
  case astToStr(op)[^2..^1]:
  of "+=":
    ...
  of "-=":
    ...
  of "*=":
    ...

let obj = Obj()
obj.field += 1

This allows overriding operations on computed properties. Current Nim has field= and .=, which allow overriding only obj.field = 1, but not obj.field += 1.
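For comparison, here is a minimal sketch of what the existing experimental dot operators cover today. The `Bag` type and its `store` field are hypothetical names made up for this illustration; the `.` and `.=` templates follow the experimental `dotOperators` feature.

```nim
{.experimental: "dotOperators".}

import std/tables

type Bag = ref object
  store: Table[string, int]  # hypothetical backing store for computed properties

# Tried when `bag.name` does not resolve to a field or routine.
template `.`(bag: Bag, name: untyped): int =
  bag.store.getOrDefault(astToStr(name))

# Tried when `bag.name = value` does not resolve to a field or setter.
template `.=`(bag: Bag, name: untyped, value: int) =
  bag.store[astToStr(name)] = value

let bag = Bag()
bag.count = 1          # rewritten to `.=`(bag, count, 1)
let seen = bag.count   # rewritten to `.`(bag, count)
echo seen
# bag.count += 1       # not covered by `.=`; this is the gap described above
```

The commented-out last line is the case that `=call` is meant to address: there is no hook that receives `count+=` as a single operation.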

Comparison with Existing Solutions

Proposed Mechanics

proxy.property = 1 # `property=` is an undeclared routine
# If `=call` is defined for `typeof(proxy)`, the above effectively expands to:
# `=call`(proxy, `property=`, 1)
# `property=`(proxy, 1)
undeclaredRoutine(proxy) # `undeclaredRoutine` is an undeclared routine
# If `=call` is defined for `typeof(proxy)`, the above effectively expands to:
# `=call`(proxy, undeclaredRoutine)
# undeclaredRoutine(proxy)
proxy + 1 # `+` is an undeclared routine
# If `=call` is defined for `typeof(proxy)`, the above effectively expands to:
# `=call`(proxy, `+`, 1)
# `+`(proxy, 1)
proxy.property += 1 # `property+=` is an undeclared routine
# If `=call` is defined for `typeof(proxy)`, the above effectively expands to:
# `=call`(proxy, `property+=`, 1)
# `property+=`(proxy, 1)
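Since `=call` does not exist in current Nim, the two-step expansion above can only be emulated by hand. The sketch below assumes a hypothetical stand-in template named `callHook` and a hypothetical `value` field; in the proposal, the compiler would insert the first step automatically when it meets the undeclared routine.

```nim
type Proxy = object
  value: int  # hypothetical payload, only here so the routine has something to return

# Hypothetical stand-in for `=call`: declares the routine named by `op`.
template callHook(obj: Proxy, op: untyped) =
  proc op(p: Proxy): int = p.value

let proxy = Proxy(value: 7)

# Step 1: generate the missing routine (written out manually here).
callHook(proxy, undeclaredRoutine)

# Step 2: the original call now resolves like any other routine call,
# in whichever call syntax was used.
echo undeclaredRoutine(proxy)
echo proxy.undeclaredRoutine
```

Because the routine name is passed in as a template parameter, the generated proc is visible at the call site, which is what lets the second step resolve normally.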

Related Literature

Dlang implements a similar concept for its operators, but instead of passing an AST to its "template methods", it passes a compile-time string containing the invoked operator. It is then up to the programmer to define and implement functionality for each possible operator passed to these templates.

Note that Dlang template methods actually expand to the implementation of a method plus the invocation of that method upon use, in the same way the proposed =call does.
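To make the comparison concrete, here is a rough Nim analogue of the string-based approach. It is a sketch only: the `Money` type is hypothetical, and `opBinary` merely borrows the name of D's hook; Nim itself has no such convention.

```nim
type Money = object
  cents: int  # hypothetical example type

# The invoked operator arrives as a compile-time string, and the
# programmer handles each supported operator explicitly.
proc opBinary(op: static[string], a, b: Money): Money =
  when op == "+":
    Money(cents: a.cents + b.cents)
  elif op == "-":
    Money(cents: a.cents - b.cents)
  else:
    {.error: "unsupported operator".}

# Forward the real operators to the string-dispatched implementation.
template `+`(a, b: Money): Money = opBinary("+", a, b)
template `-`(a, b: Money): Money = opBinary("-", a, b)

let total = Money(cents: 5) + Money(cents: 3)
let rest = Money(cents: 5) - Money(cents: 3)
echo total.cents, " ", rest.cents
```

The difference from the proposal is that here each operator still has to be forwarded by hand, whereas `=call` would receive the operator as an AST node for any undeclared routine.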

In the current or upcoming Nim, there are features that may have similar mechanics to this one, namely:

Important Links

Varriount commented 3 years ago

So, just to confirm my interpretation:

n0bra1n3r commented 3 years ago

@Varriount Yes for both points.

The proposal also goes a bit further than your second point though. It tries to do away with special casing of . (or any other operator), and instead allows the programmer to generate a procedure if it is not declared when a call to it is encountered by the compiler. This call can be in any syntax that Nim supports (method call syntax, command invocation syntax, operator syntax, etc.). I tried to describe this under the Proposed Mechanics section a bit.

Please let me know if anything is not clear. I really want to make this a good RFC. Thanks!

Araq commented 3 years ago

The RFC is well-written but there is a fundamental design tension between "let's have custom dot-like operators" and "the behavior of the dot notation is overridable". I much prefer "custom dot-like operators" which rules out "Object Proxying" entirely, no matter the details of how it's done.

Having said that, a design should probably focus on allowing convenient user definable smart pointers.

n0bra1n3r commented 3 years ago

@Araq I see. So to clarify, object proxying is something you want to only be possible in specific scenarios (like for implementing smart pointers)? Or is it that object proxying is evil, full stop? The RFC was based on the idea that you could proxy types from other languages and perform operations on them just like you would with Nim types, without specifying every detail of the implementation.

I guess my gripe with dot operator overloading is that it changes the meaning of . (or any variation of it) from "just another way to call functions or to access object fields" to "a special way to generate arbitrary code, plus the other stuff". I'm not sure this RFC completely solves that either though, but at least the result of any . operation is guaranteed to be the result of a function call (or field access).

Varriount commented 3 years ago

> The RFC is well-written but there is a fundamental design tension between "let's have custom dot-like operators" and "the behavior of the dot notation is overridable".

Another way to look at this might be: term-rewriting macros are either not powerful enough or not usable enough (from an ease-of-use perspective) to apply to the expressions currently targeted by dot operators.

Araq commented 3 years ago

> I guess my gripe with dot operator overloading is that it changes the meaning of . (or any variation of it) from "just another way to call functions or to access object fields" to "a special way to generate arbitrary code, plus the other stuff".

That's a good way to put it, here is another one: Turning a.b into a["b"] (dynamic access that can fail) is what I dislike most -- you get most of the problems of dynamic typing within Nim.
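A small sketch of the failure mode being described, using the existing experimental dot operators. The `Record` type and its `cells` field are hypothetical names for this illustration.

```nim
{.experimental: "dotOperators".}

import std/tables

type Record = object
  cells: Table[string, int]  # hypothetical dynamic storage

# Turns `rec.name` into `rec.cells["name"]`, a lookup that can fail at run time.
template `.`(rec: Record, name: untyped): int =
  rec.cells[astToStr(name)]

let config = Record(cells: {"port": 8080}.toTable)
let port = config.port   # resolves through the table lookup

var failedAtRuntime = false
try:
  discard config.prot    # a typo, yet it compiles
except KeyError:
  failedAtRuntime = true

echo port, " ", failedAtRuntime
```

The misspelled `config.prot` is accepted by the compiler and only surfaces as a `KeyError` when the line runs, which is the dynamic-typing problem in miniature.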

n0bra1n3r commented 3 years ago

> you get most of the problems of dynamic typing within Nim.

Yes. I would dare say that any operator that has . in it (.?, .!, etc.) like what is proposed in RFC https://github.com/nim-lang/RFCs/issues/341 has this weakness. This RFC also carries some of those disadvantages, but I would go so far as to say that this one is superior to dot-like operators because of the limitations it imposes, as well as the flexibility it allows.

One more important advantage of this RFC (aside from what's already mentioned) over dot/dot-like operators concerns where the hooks may be declared. Dot/dot-like operators can be declared in any module, separate from the type they operate on, whereas =call in this RFC would behave similarly to =destroy, which has to be declared in the same module as the type it operates on. This makes it clear that there is a certain amount of magic involved when working with a type that has =call.

Araq commented 3 years ago

=call is an alien beast though. The other type-bound operators are lifted automatically: you define =copy for CustomObj and it's not skipped for a tuple of CustomObj. =call has no such lifting requirements.
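A minimal sketch of the lifting behavior mentioned here, assuming Nim 2.x with its default memory management. The `copies` counter and the `id` field are hypothetical additions used only to observe the hook.

```nim
var copies = 0  # counts calls to the user-defined hook

type CustomObj = object
  id: int

proc `=copy`(dest: var CustomObj, src: CustomObj) =
  inc copies
  dest.id = src.id

let original = (CustomObj(id: 1), CustomObj(id: 2))
var duplicate = original   # tuple copy; no `=copy` was written for the tuple type
duplicate[0].id = 10       # mutate the copy so the assignment cannot become a move

echo original[0].id, " ", duplicate[0].id, " copies: ", copies
```

No hook was declared for the tuple type, yet copying the tuple is expected to go through the element type's `=copy`. Nothing comparable would need to happen for `=call`, which is the asymmetry being pointed out.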

n0bra1n3r commented 3 years ago

Well, can't argue against that... 😂 Sounds like a showstopper if enforcing same-module declaration for =call and its operand can't be implemented without hacks.

Araq commented 3 years ago

It can easily be implemented either way. But we can also easily enforce that dot operators must reside in the same module as the type they belong to. Don't worry too much about the implementation; we should focus on getting the design right.