Operator overload in separate modules causes ambiguous-call when used.

BryantLam commented 6 years ago

Summary of Problem

Operator overloaded in two separate modules results in ambiguous-call error:

error: ambiguous call '+(R, R)'
note: candidates are: +(lhs: R, rhs: R)
note:                 +(lhs: R, rhs: R)

module Core {
  record R { var x: int; }
}
module A {
  use Core;
  proc +(lhs: R, rhs: R): R {
    return new R(lhs.x + rhs.x + 1);
  }
}
module B {
  use Core;
  proc +(lhs: R, rhs: R): R {
    return new R(lhs.x + rhs.x + 2);
  }
}

proc main() {
  use A, B;
  var a = new R(1);
  var b = new R(2);
  var r = a + b;
  writeln(r.x);
}

Comments

This error message is ambiguous to read, but at least it's precise as to the line numbers.
How do I select the operator I want to use? I can't module-prefix my operator to disambiguate the operator. (Or can you, but I can't find the syntax.)
How do I exclude the operator? use B except + doesn't work. If I need both modules, I can't avoid the operator conflict.

Configuration Information

chpl version 1.18.0 pre-release (f18926d825)

lydia-duncan commented 6 years ago

What caused you to run into this issue? I would expect operators conflicts like this to be fairly rare in practice

lydia-duncan commented 6 years ago

It definitely should be fixed either way

mppf commented 6 years ago

@lydia-duncan - I think you're saying that use B except + should work / be fixed.

How do you feel about supporting a B.+ b ? I'm leaning against it, mostly because it looks hard to read / understand, but it might be a reasonable thing to do in theory.

lydia-duncan commented 6 years ago

I think there should be some way to limit which one gets brought in, but I also am uncertain about the syntax. I don't know about a B.+ b but I think at least allowing var z = B.+(a, b); would be a reasonable workaround

BryantLam commented 6 years ago

My usage of the module keyword hides the underlying idea: if these modules were top-level packages developed independently, there's no way in Chapel today to disambiguate the call.

The syntax choice/module prefix/namespace (probably B.+(args)) needs to be reflected in the ambiguous-call error message.
use B except + should work to exclude one of the overloads when both packages are needed. I'm open to alternative syntax, but this one seems to be the most obvious given that operator overloading is defined that way.
var z = B.+(a, b) is an acceptable solution. This is the route that C++ took: auto z = B::operator+(a, b). The choice of syntax here doesn't have to be beautiful; it just has to be obvious and I think this one is obvious, especially if the ambiguous-call error message from no.1 is the same.

BryantLam commented 6 years ago

use B except + isn't specific enough to exclude a specific overload for +; it isn't named. Naming it would be one solution, though simply specifying the exact candidate would work too:

use B except +(R, R)

I think this would be an acceptable solution since (1) I expect the cases of conflict to be rare, (2) it mimics the candidate emitted from the warning, and (3) there aren't any operators that take a variadic number of arguments that would lead to a difficult-to-read use statement.

lydia-duncan commented 6 years ago

I feel like that argues more for + being defined as a method on R, which is something we've talked about in the past.

I can see the solution you've proposed being necessary in the current world, but I'm not sure what priority I would place on implementing it - this seems like a very rare occurrence.

BryantLam commented 6 years ago

I feel like that argues more for + being defined as a method on R, which is something we've talked about in the past.

I agree with this statement simply because that's what should have happened with the original developer that created module Core, but instead two separate developers chose to do it instead when they implemented A and B.

I could also see a case where Core.R had some basic functionality that was then extended with two different e.g., constrained-generic implements in A and B respectively, which then affected the definition of A.+ and B.+. There would need to be a way to disambiguate between Core.+, A.+, and optionally B.+.

I do think that this would be a rare occurrence, but it does affect scalability of packages, especially without some way around this issue for the end user except for manually commenting out one of the conflicting overloads.

mppf commented 6 years ago

I think whether operators can be methods or not is related but can be viewed as an independent choice.

Regarding the except list with a specific type included, another way to handle this might be based upon overload sets (which is a concept from D). Many of the "hijacking" examples in CHIP 20 can be resolved through the use of overload sets.

The general idea of overload sets, quoted from CHIP 20:

What is this overload sets idea?

an "overload sets" is a group of functions with the same name declared in the same scope.

perform overload resolution independently on each overload set

no match in any overload set, then error

match in one overload set, OK

match in multiple overload sets, error

introduce a language mechanism to opt-in to merging overload sets. In D, this is the 'alias' keyword.

Presumably module A and module B's overloads of + would be in different overload sets. The result would be an error in main (as there is today). The difference is that the overload sets idea could include a separate mechanism to say that certain overloads should be merged or should be ignored. (In the D version it includes only the ability to merge overload sets).

mppf commented 5 years ago

See also the overload sets issue #12635

mppf commented 5 years ago

I don't see how making operator overloads methods would help at all. Am I missing something? If they were methods, modules A and B would define a method on R with the same name, right? Which would lead to ambiguity again?

So far there seems to be some agreement on:

B.+(a, b)

rather than

a B.+ b

But, this wouldn't help with a method, right?. Suppose it was R.foo that was defined in both modules.

a.foo(); // ambiguous
a.(B.foo)(); // could this work?
B.foo(a); // assuming universal method calls
B.foo(this=a); // could this work?

BryantLam commented 5 years ago

But, this wouldn't help with a method, right?

This is a valid criticism. It might be better to just provide some mechanism for the user to override on at the use-statement definition.

lydia-duncan commented 5 years ago

I don't see how making operator overloads methods would help at all. Am I missing something? If they were methods, modules A and B would define a method on R with the same name, right? Which would lead to ambiguity again?

Refreshing my memory on the issue, I believe what I was trying to imply was that we would then be able to do something like use B except R.+ or use B except R (the former not being valid syntax today but something we could consider, and the latter if R.+ was the only method on R defined in B) to enable the user to exclude only the + defined on R rather than forcing them to exclude all + that would otherwise be made available by that module.

mppf commented 5 years ago

@lydia-duncan - that seems odd to me, since method-ness in this case is just enabling some other syntax.

Can you do use B except R.method today? (I think no?)

Wouldn't it be equally effective to be able to write use B except +(R, R) if we wanted to exclude that specific function? How is that really different from use B except R.+ ? Other than syntax?

My point here is that I don't think we should decide to make operators into methods to support use ... except patterns because there are other ways to support the same patterns.

lydia-duncan commented 5 years ago

Can you do use B except R.method today? (I think no?)

I did say we couldn't do that today :)

Wouldn't it be equally effective to be able to write use B except +(R, R) if we wanted to exclude that specific function? How is that really different from use B except R.+ ? Other than syntax?

That also involves putting information about argument types (and maybe eventually return types) into the use ... except syntax, which is something that there isn't precedence for, so far as I know, which will increase complexity for the use statement analysis. Whereas having it associated as a method on a type and labeled due to that interaction is a lot more similar to the "classes/records are like modules but with multiple instances!" comparison - structure.symbolName is the same pattern in both places (so syntax, like you said). But that whole thing will get tricky with inheritance (if you banned R.foo and S inherits from R, I would assume that means S also can't use foo unless it was defined outside of the use statement . . . it's a can of worms, which is part of why we haven't implemented class.method explicit banning and I'm still concerned about how use B except R even works when R is a type, which you could use today to ban + if + was a method on R . . .)

Sorry, that's pretty rambley. I don't know that we need to follow this train of thought further, I think the overload sets comment is probably the best way forward

chapel-lang / chapel