Proposal: Reversed operator overload

ChayimFriedman2 commented 4 years ago

Edit: TL;DR

I've added a keyword reverse that let you overload the reverse operator (e.g. 1 * MyClass). Also, I've added a class NotImplemented Finally, I've allowed the syntax instance.[reverse ]binary_operator(arg) (e.g. this.*(1) or other.reverse +("Cool!")) that calls a binary operator directly (without going through the reverse semantics - that is, if the operator returns NotImplemented this will be its return value. Normally operators never return NotImplemented to the user - the VM consumes it). Its usage is to prevent duplicate code by implementing reverse operator via the straight one. The proposal is still slightly incomplete; you can follow the discussion below. I'll update here once it has been done.

Original version

When I first saw #746, I thought about Python.

In short, that issue talks about commutative operator overloading; that is, the ability to overload not just <myclass> + <otherclass> but also <otherclass> + <myclass>. This is intuitive to anyone for numerous operators: +, *, etc. on numeric types and more (e.g. if I would want to implement a complex number class in Wren).

Unfortunately, the strategy used in Wren for operator overloading is simple: check if the left operand has a method named same as the operator, and call it with the right operand. This makes it impossible to overload operators commutatively.

Python solved this in two steps:

Define a constant NotImplemented, which any overloaded binary operator should return when there is type mismatch; in the example of the above issue, Num does not know how to multiply itself with List, so it'll return NotImplemented.
Define a reverse operator, __r<operator>__, for each binary operator. If the method of the left operand returned NotImplemented and there is a reverse operator method on the right operand call it with the left operand as parameter. If it returned NotImplemented too, raise an error.

Wren can implement a similar strategy too. We can define a static method Object.notImplemented. About the second step, I have two ideas:

Just like Python, add reverse operator semantics. Something like a reverse keyword, e.g. reverse *(other). This has the disadvantage of adding a keyword and requiring another method for the same operator (usually it'll just return this * other).
Define a specific set of operators as commutative, and automatically call them as reverse operators if needed. I do not like this approach, because although most of the time there's a constant set of commutative operators, there can still be cases where I want to overload both the reverse and the straight version of the operator. For example, if I'll continue with the complex numbers example from above, I'll want to be able to do both complex / irrational and irrational / complex, even though they have different meanings.

mhermier commented 4 years ago

This is a huge can of worm, with no real easy solution. The solution you propose is valid, and try to solve the not having control of the source. Another solution to this problem could be to allow to expand classes outside of primary class definition, or granting an operator to be commutative on some class.

ChayimFriedman2 commented 4 years ago

@mhermier I liked the second solution, but I think the first breaks the nature of Wren. It'll also make it much less performant.

mhermier commented 4 years ago

Elaborate how it can be slow? If you craft an opcode to swap arguments on the stack, and call the operator again (and given it can be re-used per operator, I don't see how it can be a penalty compared to do a full round. First option implies, trying to do a call, asking querying the class, and then perform the same evaluation as on previous definition.

ChayimFriedman2 commented 4 years ago

I don't understand. Do you mean the following?

class Complex {
  construct fromNum() { /* ... */ }

  *(other) {
    if (other is Num) {
      return this * Complex.fromNum(other)
    }
    // ...
  }
  multiplyNum_(other) {
    // `this` is `Num`
    if (other is Complex) {
      return other * this
    }
    return Num.originalNumMultiply_(other)
}
Num.originalNumMultiply_ = Num.*
Num.* = Complex.multiplyNum_

mhermier commented 4 years ago

In essence yes, maybe not exactly that syntax. Or something like this:

class Complex {
}

Complex.* = Fn.new {|lhs rhs|...}

But details have to be thought, because this involves overload name resolution which is another can of worms that need to be think of...

ChayimFriedman2 commented 4 years ago

I thought about block functions, but rejected it because they requires .call().

Except the boilerplate (which is huge, as you can see in my example) and the compromise of readability (again), this will make the VM slower because the speed of Wren comes mostly from the speed of method/fields location - they're known at compile-time. If we'll be able to change existing class structure (like other dynamic languages), we'll loss this advantage. Field and method access will need to perform a lookup in a dictionary, and it'll slower the VM in factors (I can guess more than 2x).

mhermier commented 4 years ago

Behind the scene it is a linear search, so it is only a matter of expanding an existing class, so speeds as this is not really an issue (as in current implementation). The only real issue is control where we allow expanding structs, so we don't get some oddities/corruption.

About the .call() issue, the argument is not a real issue. Since methods/foreign are not first class object (yet), it can state that they follow .call() protocol

ChayimFriedman2 commented 4 years ago

Behind the scene it is a linear search, so it is only a matter of expanding an existing class, so speeds as this is not really an issue (as in current implementation).

Linear search? What do you mean?

The only real issue is control where we allow expanding structs, so we don't get some oddities/corruption.

structs in Wren?

About the .call() issue, the argument is not a real issue. Since methods/foreign are not first class object (yet), it can state that they follow .call() protocol

If they were first-class they could implement a protocol, but now when they're not?!

Can you not put too many newlines please? It is hard to read...

mhermier commented 4 years ago

Classes holds methods in a linear and predictable way in the current implementation. Meaning if you define a method name foo(), it will get an index assigned, that will be available forever and every class during the lifetime of the VM. So expanding an old class to provide foo(), is only expanding the array of methods array so it can contains the index for foo() and assign the method at it.

ChayimFriedman2 commented 4 years ago

Yes. But the index is provided at compile time, and with that it'll have to be generated at run time. Take:

class A {
  f() {
    System.print(this.f2()) // `this` is required, or the compiler won't know where `f2()` is defined
  }
}
A.f2 = A.f // Of course infinite recursion, just for demonstration

When compiling, the compiler won't know how to access A.f2(), so it'll need to leave it as-is. And the interpreter will have to perform a string lookup.

ChayimFriedman2 commented 4 years ago

And also this will be asymmetric and confusing, since it won't allow getters/unary operators/setters/static functions.

Also the compiler will need to accept operators as class methods (Class.+). Minor change, though.

mhermier commented 4 years ago

There is no such error, because the compiler is dumb. Whenever it finds and unknown methods, it creates an entry. So as in current state, it should be valid given method assignment works.

ChayimFriedman2 commented 4 years ago

Also, now this outside of a method is a compile-time error. It will have to be a run-time error.

ChayimFriedman2 commented 4 years ago

There is no such error, because the compiler is dumb. Whenever it finds and unknown methods, it creates an entry. So as in current state, it should be valid given method assignment works.

OK, although that seems to be a bug in the compiler (accessing an undefined method will not raise an error).

mhermier commented 4 years ago

This starts to be really be out of control, adding too much work around. It make the function approach less intrusive.

ChayimFriedman2 commented 4 years ago

There is no such error, because the compiler is dumb. Whenever it finds and unknown methods, it creates an entry. So as in current state, it should be valid given method assignment works.

But there is still problem - consider:

class C {
}
class Helper {
  f1() {
    this.f2()
  }
  f2() {
  }
}
C.f1 = Helper.f1
C.f2 = Helper.f2

mhermier commented 4 years ago

It would be a compiler error, if a more advance type checking was in place. Currently type checking is in-existent at compile time, and is only done at runtime.

ChayimFriedman2 commented 4 years ago

This starts to be really be out of control, adding too much work around.

I absolutely agree. As I stated above, this is a complete change to the design of the language. Massive changes to the language requires massive changes to the implementation, naturally. This is the reason I rejected this idea.

ChayimFriedman2 commented 4 years ago

It would be a compiler error, if a more advance type checking was in place. Currently type checking is in-existent at compile time, and is only done at runtime.

Unfortunately, you're right. This is an inherent limitation of single-phase compilers.

mhermier commented 4 years ago

Your example is demonstrates why methods should not be shared between classes. And mostly make it only valid to use function instead.

ChayimFriedman2 commented 4 years ago

But functions should have .call() as I said, and also suffer from the same issue:

class C {
}
C.f1 = Fn.new {
  this.f2()
}
C.f2 = Fn.new {
}

ChayimFriedman2 commented 4 years ago

I also suspect there are bunch of issue we haven't discovered. It's better to avoid this idea completely, especially since there are excellent alternatives.

mhermier commented 4 years ago

No, this is an abuse/dirty hack. You have to consider this as the first argument of the function, like what you would do with method introspection, and you are done.

ChayimFriedman2 commented 4 years ago

And what about .call()? And the issue in my example?

ChayimFriedman2 commented 4 years ago

And where is the dirty hack here?

mhermier commented 4 years ago

Example my way, that don't require a grammar hack so this is the real object not the method:

class C {
}
C.f1 = Fn.new {|self|
  self.f2()
}
C.f2 = Fn.new {
}

ChayimFriedman2 commented 4 years ago

OK. But I think this is confusing and we all better with the previous ideas. Your thoughts?

mhermier commented 4 years ago

My thought is that has I previously stated, this is an expected mess. And your syntax, with this in the function, is pretty confusing, because this could also represent the function object itself. Introducing a new syntax for free methods could also be a solution. But each of the solutions seems to have pretty huge cons, mostly because it requires type inference.

ChayimFriedman2 commented 4 years ago

I'm talking about adding reverse or reversable keyword, to indicate reverse operator or operator that can be applied in reverse, accordingly.

mhermier commented 4 years ago

Grammatically wise it seems to be the less intrusive, thought it might requires to adapt a bunch of code to detect the rhs argument type. It also means that the reverse call is opt-in by the original caller, which can also be a minor problem in 3rd party code where maintainer is not willing to be compliant. Implementation wise it might be a mess. Either you return the error by return value or by exception, first implementation being a little bit less worse than the other, it will introduce a conditional jmp (and creation of fibers in the second case) in assembly on every potential operator call (same problem with the assign operator proposal). Additionally this would require a new singleton so it can be used inside the VM or that we introduce VM error enum and its bindings.

mhermier commented 4 years ago

Side note about the new lines: I'm sorry, but I was replying over emails on the phone, and it seems that gmail or github add spurious new lines when adding comments.

ChayimFriedman2 commented 4 years ago

Do you have better suggestions?

mhermier commented 4 years ago

Not really, I'm just saying, before implementation that it will have quite some bytecode cost, even creating a dedicated opcode for it will have possibly high cost. The thing is that problem you are trying to solve is a common but artificial problem caused by code distribution. Since all the source must be feed, you have full control of what you feed and can correct broken operator/code.

ChayimFriedman2 commented 4 years ago

But you agree that the current state isn't good.

mhermier commented 4 years ago

It is not good or bad. For my use case, I consider that I have access to all the code, and I can fix any incorrect behaviour already. For your use case, where you want to use external code, this is a requirement if you don't want to alter external code. But what ideally I really would like to avoid is to pay for feature that might be barely used.

mhermier commented 4 years ago

I meant nearly used in a hot path. There are change that can be considered good while heavy, if in the cold path. The problem with that change and the assign operator changes is that it will end up in a hot path at 99% of use cases.

ChayimFriedman2 commented 4 years ago

The biggest problem here is the inconsistency. A beginner can spend a lot of hours trying to understand why [1] * 3 is [1, 1, 1] while 3 * [1] is an error. This is not what users expect from their language.

ChayimFriedman2 commented 4 years ago

Unfortunately you're right about the downgrade in the performance. I will think about it a bit more trying to find a solution, but one way or the other I think we should implement it. Performance is the second most important feature of a language, but first and foremost we have the language itself - its semantics, syntax and what users want it to be.

ChayimFriedman2 commented 4 years ago

I thought about it a little bit more, and I think you're wrong. This seems to be on the hot path, but it's definitely not.

See: the only impact of this will be in the case of invalid operator. That is, if we do 3 * 3 the VM will immediately call Num.*(3, 3) and everything is fine.

The problem starts where we do 3 * [1]. Previously that was an error (the Num.*() method detects the use of a non-number as the right operand and call Fiber.abort()). Now it'll call List.*([1], 3). Clearly, errors are a very cold path. Many languages use this fact: V8, AFAIK re-parse the source code with location information (row and column) where there's need to get a stack trace, and so it can skip this step in the first parsing, saving memory and computing power in the regular case. Take this as an example. Instead of emitting an error immediately, we'll have a delay. But it's definitely worth it.

Also, we will gain a small (or not-so-small) benefit: clearer error messages. Don't look at me like that, error messages are very important! Look at this small example:

class C {
  construct new(f) {
    _f = f
  }
  f {
    return _f
  }
  *(other) {
    return C.new(f * other.f)
  }
}

var c = C.new(5)
System.print(c * 5) // Cryptic error message: "Num does not implement 'f'". Clarification, please?

Of course we can check for non-C parameter in C.*(), but we won't. This is how users write programs.

Now they'll have to:

class C {
  construct new(f) {
    _f = f
  }
  f {
    return _f
  }
  *(other) {
    if (!(other is C)) {
      return System.notImplemented
    }
    return C.new(f * other.f)
  }
}

var c = C.new(5)
System.print(c * 5) // "Unsupported operand types for '*': 'Num' and 'C'"

mhermier commented 4 years ago

About your first arguments, it is true that it can be disappointing. But there only 3 ways to fix it:

You fix it for yourself.
You can fix it and you propose it to the maintainer.
You fix it by using a language feature so you don't have to ask the maintainer of other code. The exact example you are giving is an example of what should be in the second way. But what you are trying to do is to push the 3rd way, by bringing a language feature. Each of these way have interests, and 1st and 2nd usually goes in pair usually you do the 1st and when you are confident you do 2nd. I agree the 3rd way has various interest. But it is not also a universal solution, like subclassing/separation... About the performance degradation, I considered even creating a specific opcode, it can mitigate the problem by joing call and conditional jump in one opcode (or even fully integrate it in one opcode). But at the end of the day it is still a band aid, and just optimise the issue, not really solve it. Maybe that one can be tested, but I don't really have high hopes because of the complexity it brings to do the error management, and you also have to garantee there are no abuse of the error or unexpected propagation of that error and provoke erroneous behaviour. I don't say it can be worth the effort, but it has to be tested and validated.

mhermier commented 4 years ago

You are right, error management should go in the cold path, but how do you handle the error in the hot path? You must have to alter the hot path in a way that it is prepared to receive the error. And this is where lies the issue. There are some cheaper way to do it, but nothing is cheaper than having nothing to do. In static language this cost is lifted at compile time, in dynamic language this cost is resolved at runtime and can become really expensive.

ChayimFriedman2 commented 4 years ago

This way already exists. Built-in classes check their operand. For others, the impact is not so great.

mhermier commented 4 years ago

For your specific issue, you can do it the same way, without the need to introduce the heavy machinery required by what you originally proposed. This machinery is useful but you have to introduce it with care and awareness.

ChayimFriedman2 commented 4 years ago

I have no specific issue. I've encountered #746 accidentally. And I think that just like there the user got confused from this behavior, many will too.

mhermier commented 4 years ago

A reason way I'm a bit reluctant about the implementation you proposed, consider the following code:

for (mat in matrices) {
  norm_mat.append(mat.norm() * mat)
}

Your assumption that it is a slow path is thrown away, and if you have gazillion of matrices, the cold path is the hot path. I agree this code is a little bit artificial, but you have to consider it, in its generality, and the probability that the cold path become hot is very likely if generalised...

ChayimFriedman2 commented 4 years ago

Where are the performance hits in your example?

mhermier commented 4 years ago

mat.norm() * mat is the only operator. I agree that my example might be poorly chosen because of the matrix allocation that might be the slowest part.

But lets consider implementation of reverse operator using exception. To perform it, the following logic has to be done (in pseudo code):

{
  fiber = Fiber.new()
  push result of mat.norm() on top of fiber
  push mat on top of fiber
  fiber.try("operator")
  if (!fiber.hasError() || fiber.error != NotImplemented) return fiber.value in current fiber
  swap arguments on current fiber
  call("reverse operator")
}

This pseudo code already optimize second call by using current fiber. But the first fiber creation on hot path is almost a requirement, unless you use some heavy hackery on the current fiber. Now imagine that you have to do this on every binary operator in existence... because you'll never know when a reverse operator will occur...

Even if you can reduce/remove some of the code by doing a dedicated opcode and hackery, there is a bunch of extra logic to be added in the hot path.

ChayimFriedman2 commented 4 years ago

You don't have to raise an error. You can just return value. Then your code become:

{
  push result of mat.norm()
  push mat
  call("operator")
  if (stack top == NotImplemented)
  {
    pop
    call("reverse operator")
    if (stack top == NotImplemented)
    {
      // pop here?
      Fiber.abort("Unsupported operand types for '*': 'Mat' and 'Mat'")
    }
  }
}

Which is essentially two more CPU instructions for the hot path (cmp stack_top, NotImplemented and jnz after_if) in x86. Probably ARM will have three because we'll have to calculate the stack top, but I'm not sure. Anyway, this is not a big deal.

mhermier commented 4 years ago

Well due to how stack works, there is more stack manipulation than that to do. It is a little bit worse than that, since return will remove the first pushes, you have to dup the stack values and restore the stack to propagate the result in the hot path, in cold path you only need to pop and swap the arguments. These are quite cheap in a single usage, I agree. But consider even a single matrix manipulation or statistics, and also the damage this change does to regular math when cumulated, with all the preamble added to the hot path ???

ChayimFriedman2 commented 4 years ago

Hacky, maybe, but a single DUP_2_TOPMOST bytecode instruction will solve the problem. And as I said, I think it's mandatory: first and foremost the users, and just then the performance.

wren-lang / wren

Proposal: Reversed operator overload #793

Edit: TL;DR

Original version