chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

When should the compiler generate default function implementations? #20484

Open benharsh opened 2 years ago

benharsh commented 2 years ago

The language specification (as of 1.27) currently states the following concerning comparison operators on records:

> Default functions to overload comparison operators are defined for records if none are explicitly defined.

What do we mean by "explicitly defined"? I will be using the comparison operator == in examples here, but I intend this question to apply to other default functions.

Consider the following program:

record R {
  type T;
  var x : T;
}

proc main() {
  var A : R(int);
  var B : R(int);
  var C = (A == B);
  writeln(C); // true

  var X : R(real);
  var Y : R(real);
  var Z = (X == Y);
  writeln(Z); // true
}

This program resolves because no comparison operator == is defined for R, so the compiler generates a default implementation.

Depending on how I define a comparison operator ==, the compiler may or may not generate a default implementation. This is certainly influenced by some internal compiler architecture, but it's also not clear what the rules should be.

For example, the following function will not prevent a compiler-generated default. The compiler tries to reason about this function before type resolution, and so doesn't recognize that this applies to an instance of R:

operator ==(lhs : R(int), rhs : R(int))

A function using a where-clause will prevent a compiler-generated default, because the compiler recognizes the types of the arguments as R:

operator ==(lhs : R, rhs : R) where lhs.T == int && rhs.T == int
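As a minimal sketch of the where-clause case, the signature above can be given a body and dropped into the original program. The body (a plain field comparison) is an assumption for illustration, and the comments describe the behavior reported in this issue rather than a specified rule.

```chapel
record R {
  type T;
  var x : T;
}

// User-defined '==' constrained to R(int) by a where-clause.
// The body is an illustrative assumption: a plain field comparison.
operator ==(lhs : R, rhs : R) where lhs.T == int && rhs.T == int {
  return lhs.x == rhs.x;
}

var A : R(int);
var B : R(int);
writeln(A == B);  // expected to use the user-defined operator

// If the overload above suppresses the compiler-generated default,
// as described in this issue, no '==' applies to R(real) and the
// call below would fail to resolve:
//
// var X : R(real);
// var Y : R(real);
// writeln(X == Y);
```

The commented-out R(real) comparison is the crux of the question: one constrained overload can leave every other instantiation without an ==.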

What sorts of operator signatures should prevent a compiler-generated default?

Should we depend on whether an operator could be resolved for a given type?

Should we depend on whether an operator was declared as a method (e.g. R.==) ?

Do we want our decision here to prompt users to think more carefully about their type's design? E.g. if they provide one constrained overload, should we require them to provide their own catch-all implementation? Sometimes it may be prudent to provide specific overloads for more complicated fields (e.g. owned/shared, arrays), which doesn't necessarily indicate that the user wants to abandon the compiler-default in other cases. This touches upon an ongoing topic of whether users should be able to request the compiler-generated default explicitly.

benharsh commented 2 years ago

Our answer here will influence how we implement things in dyno.

mppf commented 2 years ago

I don't yet have answers to all of the questions above, but I just wanted to note that I think it's really important that the compiler decide if the default function exists or not from the code in the module defining the type only. Also, I think it's important that user-defined replacements for the default functions can only be defined in that module. In other words, tertiary methods for these should have no impact whatsoever (except for probably leading to a compilation error about how they don't do anything).

mppf commented 2 years ago

I wonder if we could insist that the operators that can have defaults be declared as operator methods. The operator R.== syntax should make it much clearer that it's R's == and that the compiler shouldn't make a == for R.
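A minimal sketch of what the method form could look like, assuming a simplified, non-generic stand-in for R. The 'note' field is hypothetical and exists only so that a user-defined == is observably different from a field-by-field default.

```chapel
// Simplified, non-generic stand-in for R; the 'note' field is hypothetical.
record R {
  var x : int;
  var note : string;
}

// Method form: the 'R.' prefix states that this is R's own '=='.
// This version deliberately ignores 'note'.
operator R.==(lhs : R, rhs : R) : bool {
  return lhs.x == rhs.x;
}

var a = new R(1, "first");
var b = new R(1, "second");

// Prints true when the user-defined operator is selected; a
// field-by-field default would also compare 'note'.
writeln(a == b);
```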

bradcray commented 2 years ago

> I don't yet have answers to all of the questions above, but I just wanted to note that I think it's really important that the compiler decide if the default function exists or not from the code in the module defining the type only. Also, I think it's important that user-defined replacements for the default functions can only be defined in that module. In other words, tertiary methods for these should have no impact whatsoever (except for probably leading to a compilation error about how they don't do anything).

I definitely agree with this and worry that the interpretation of Chapel programs would be too confusing and haphazard if we did otherwise. Even if it turns out to be overly restrictive or limiting over time, it seems like the right place to start since it's the more restrictive option.

> Should we depend on whether an operator was declared as a method (e.g. R.==) ?

> I wonder if we could insist that the operators that can have defaults be declared as operator methods. The operator R.== syntax should make it much clearer that it's R's == and that the compiler shouldn't make a == for R.

These statements also resonate with me, but with caveats. For one, if I define an operator R.==(x: R, y: int) it's not obvious to me that the compiler shouldn't generate an operator R.==(x: R, y: R) version for me, nor that I should have to write that out longhand myself if the default would've worked just fine for me / been what I wanted. It also seems to me that if I were to define operator ==(x: R, y: R) and the compiler were to generate an operator R.==(x: R, y: R) because mine wasn't attached to R, I'd get an ambiguity, which seems unfortunate. Or if the compiler-generated one simply "won", then I'd be confused. All that said, if the ambiguity error were to say "Maybe you should define your operator as a method?" maybe that's not so bad.
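To make the first caveat concrete, here is a minimal sketch, assuming a simplified non-generic R with a single int field (that definition and the formal names are assumptions for illustration). Only the mixed R-versus-int overload is user-defined.

```chapel
// Simplified stand-in for R with a single field (an assumption).
record R {
  var x : int;
}

// Mixed-type comparison only: R on the left, int on the right.
operator R.==(lhs : R, rhs : int) : bool {
  return lhs.x == rhs;
}

var a = new R(3);
writeln(a == 3);  // uses the overload above

// The open question: should the comparison below still resolve to a
// compiler-generated R-versus-R default, even though an 'R.==' exists?
//
// writeln(a == new R(3));
```

Under a rule where any R.== suppresses the default, the commented-out comparison would have to be written out longhand by the user.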

Back on the first point, we've always talked about whether there should be a way to opt into the default initializer (say) even if you've defined your own custom initializers because it's so powerful and convenient. If we had such a mechanism, we could use it here as well. But I think the challenge in both cases is that we don't have any candidate syntax in mind that we like...

benharsh commented 2 years ago

Have we stabilized on the way that we want to resolve operator calls? Should I create a separate issue to discuss that question? Or are the two topics sufficiently intertwined that this issue is sufficient?

lydia-duncan commented 2 years ago

We discussed operator call resolution pretty thoroughly when we started using the operator keyword. The conclusion we came to was mostly centered around their visibility (operator functions are only made more available by a use or import, while primary and secondary operator methods are visible everywhere an instance of the type can be obtained). I believe as a result I made any generated operators be operator methods, for accessibility. Operator methods and standalone operators are otherwise expected to be interchangeable and to conflict with each other when both exist.

With that in mind, I believe that the presence of a standalone operator with the same arguments as the default in the same scope as the type should cause the default version of that operator not to be generated. Otherwise, the code in the module with the type would get different behavior than code in a module that happened to get an instance but did not use the module with the explicit operator.

The assertion Michael made about default functions should already be what we have in the production compiler for initializers, and I think it's reasonable to extend that to other default functions.

benharsh commented 2 years ago

> With that in mind, I believe that the presence of a standalone operator with the same arguments as the default in the same scope as the type should cause the default version of that operator not to be generated

Do you have a stance on whether where-clauses would impact the choice to generate a default function? From the original example above, should the following operator allow or prevent a compiler-generated default?

operator ==(lhs : R, rhs : R) where lhs.T == int && rhs.T == int

bradcray commented 2 years ago

> Do you have a stance on whether where-clauses would impact the choice to generate a default function?

Not that you were necessarily asking me, but my instinct would be that the presence of a where clause shouldn't change what we do relative to not having that where clause. That is, if the operator above would thwart the compiler-generated default without the where clause, it should with it as well (and vice-versa).

mppf commented 2 years ago

@lydia-duncan

> Operator methods and standalone operators are otherwise expected to be interchangeable and to conflict with each other when both exist.

That's not what this code

https://github.com/chapel-lang/chapel/blob/dc228b96ba07bc5ad9a6da20bd4ee84a3e07c940/compiler/resolution/functionResolution.cpp#L6072-L6084

looks to me like it's doing. (Not that we have to fix it in this issue, just noting).

mppf commented 2 years ago

> > Should we depend on whether an operator was declared as a method (e.g. R.==) ?

> > I wonder if we could insist that the operators that can have defaults be declared as operator methods. The operator R.== syntax should make it much clearer that it's R's == and that the compiler shouldn't make a == for R.

> These statements also resonate with me, but with caveats. For one, if I define an operator R.==(x: R, y: int) it's not obvious to me that the compiler shouldn't generate an operator R.==(x: R, y: R) version for me, nor that I should have to write that out longhand myself if the default would've worked just fine for me / been what I wanted. It also seems to me that if I were to define operator ==(x: R, y: R) and the compiler were to generate an operator R.==(x: R, y: R) because mine wasn't attached to R, I'd get an ambiguity, which seems unfortunate. Or if the compiler-generated one simply "won", then I'd be confused. All that said, if the ambiguity error were to say "Maybe you should define your operator as a method?" maybe that's not so bad.

Well, we could certainly have the compiler check for, and issue an error on, something like operator ==(x: R, y: R) defined in the same module as the definition of R. It could say "Please use the method form when replacing the compiler-generated '=='" or something to that effect.

(Edit -- I'm talking about doing the check when we find the function definition, not at a call).

lydia-duncan commented 2 years ago

> > Do you have a stance on whether where-clauses would impact the choice to generate a default function?

> Not that you were necessarily asking me, but my instinct would be that the presence of a where clause shouldn't change what we do relative to not having that where clause. That is, if the operator above would thwart the compiler-generated default without the where clause, it should with it as well (and vice-versa).

I agree with this.

> That's not what this code [...] looks to me like it's doing. (Not that we have to fix it in this issue, just noting).

This test locks in that they do actually conflict. I'm assuming that code doesn't fully determine a conflict but indicates a slight preference without explicitly choosing one?

bradcray commented 2 years ago

> Well, we could certainly have the compiler check for, and issue an error on, something like operator ==(x: R, y: R) defined in the same module as the definition of R. It could say "Please use the method form when replacing the compiler-generated '=='" or something to that effect.

We could, though I worry that that could open up some of the cans of worms that have led Ben to ask these questions. E.g., how R-like would the == need to be in order to generate that warning? Would both arguments have to be R? If R were generic, would any instantiation or partial instantiation generate the warning? What if there were a where clause? Whereas if we were guaranteed that we'd get the ambiguity error in cases where it mattered, the need to disambiguate, or to suggest the user add a prefix, could happen then.

Not that I'm opposed to issuing the warning more proactively if we think we can land it... it just seems more challenging to implement (and maybe define) to me.