Certain methods in Chapel have special meaning. Broadly speaking, there are methods such as init, this, and these, which are built into the language and play a crucial role in Chapel. However, there are also other methods that aren't as fundamentally ingrained in Chapel, but that we would like to give special meaning: hash, enterThis, and exitThis.
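For readers less familiar with the first group, here is a minimal sketch of how init, this, and these are picked up by the language. The `Countdown` record and its field are made up for illustration and do not come from the issue.

```chapel
// A record that opts into three of the language-level special methods.
record Countdown {
  var start: int;

  // 'init' is the initializer invoked by 'new Countdown(...)'.
  proc init(start: int) {
    this.start = start;
  }

  // 'this' makes a value callable / indexable: c(i) or c[i].
  proc this(i: int): int {
    return start - i;
  }

  // 'these' is the default serial iterator used by 'for v in c'.
  iter these(): int {
    for i in 0..<start do
      yield start - i;
  }
}

proc main() {
  var c = new Countdown(3);
  writeln(c(1));              // prints 2
  for v in c do writeln(v);   // prints 3, 2, 1 on separate lines
}
```

None of these three names is marked in any way at the definition site, which is the property the rest of this issue is concerned with.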

The main issue is that we don't want to take method names like hash away from the user; more generally, we don't want to make migration difficult for users whenever we introduce a new special method. Currently, if we just add hash or enterThis, users who already have methods with those names would need to rename them and refactor their code.

The main issue that covers this subject is https://github.com/chapel-lang/chapel/issues/19038. It also lists an additional concern about special methods. Quoting:

> How can I be alerted that I'm opting into a special method rather than happening to use a name that Chapel gives special meaning to?

This issue is all about finding ways to reserve special methods like hash, enterThis, and exitThis, without making programming in Chapel and migrating to newer versions harder for the user, even if new special methods are added in the future.

Approaches in Other Languages

| Language | Approach |
|---|---|
| Python | "dunder" methods, like `__hash__` |
| C | Reserves `__` for compiler-specific things |
| C++ | Reserves names (e.g. `value_type`), adds operators (`operator bool`). Concepts |
| Rust | Traits (`Hash` trait) |
| Java | All objects have some base methods (`hashCode`). Interfaces for other things. |
| Swift | Interfaces / Traits with compiler-generated implementations (`Hashable`) |
| C# | Just reserves names (e.g. `Add`), uses interfaces (`ICollection`) |
| JavaScript | Prototype objects have a `Symbol.something` property (hard to translate to Chapel) |

The Contenders

Below is a brief list of all of the approaches suggested to address this issue.

Note that all of the contenders below solve the issue of preventing code from breaking when new special methods are introduced.

1. Use special characters to demarcate special methods (issue here). This entails lexer/parser changes. Furthermore, in some cases, code that used to work would need to be changed, but only once; subsequent additions of new special methods would not require user changes.
2. Use Python's dunder method convention: __hash__. Currently, Chapel does not really reserve names starting and ending with double underscores. This approach requires no parser changes, but does take away names that users could've been using before.
3. Use chpl_hash. This shouldn't break user code because chpl_ is normally restricted for internal Chapel things. On the other hand: chpl_ is normally restricted for internal Chapel things. So, it's weird for users to have to define a "private-looking" thing on their types.
4. Use operator hash. This approach would allow proc hash to co-exist with operator hash; the operator version would not be directly callable. Instead, a top-level, non-method procedure proc hash(arg) would be usable for invoking operator hash on any type that supports it.
5. Use interfaces (issue here). Interfaces won't be solidified by 2.0, but there are proposals that suggest using "proto-interfaces", which represent a subset of interfaces that will definitely stay the same in 2.0. The key difficulty is establishing the proto-interface subset. For example, an interface Hashable could be defined, along with a procedure hash(x) where x is Hashable. This would only work if x were hashable.
   - Required features: defining an interface with a special method, "implements" statement, generic functions that work on interfaces (not necessarily constrained generics).
   - What about compiler-derived implementations of interfaces? In particular, it's important that we preserve the property of Chapel that by default, every new record / class / etc. can be printed out using writeln etc. If writeThis is implemented using an interface, we'll want there to be automatic instantiation of the interface.
     - It's important that the knowledge of whether a type does or doesn't have a default implementation of a particular interface is consistent everywhere. We wouldn't want one piece of Chapel code using a default hash instance, while another uses a user-provided one.
6. Use type methods to provide namespacing. Create a record Chapel {} or something, and provide type methods like proc Chapel.hash(x: MyType) {}.
   - Open question: how to handle dynamic dispatch in class hierarchies?
     - We might not want to have dynamic dispatch in this situation; if the user wants dynamically-dispatched hash methods, for example, maybe they should declare their own "virtual" hash method and call it from Chapel.hash(x: MyClass).
   - This might be a little worse than methods because the resolver by default brings into scope the primary and secondary methods defined on a type. However, it won't implicitly import tertiary methods, which includes methods like Chapel.hash(..), so we'd need to specifically import Chapel to bring in the user-provided implementations of the special functions, everywhere we'd want to use them.
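The type-method contender can already be written today, so a minimal sketch may help make it concrete. The names here are assumptions for illustration only: `Special` stands in for the namespace record the list calls Chapel, and `digest` stands in for hash so that the example does not depend on whatever meaning a given compiler version attaches to hash.

```chapel
// Hypothetical namespace record (the list above suggests naming it 'Chapel').
record Special {}

record Point {
  var x, y: int;
}

// The "special" implementation is a type method on the namespace record,
// so it does not occupy any method name on Point itself.
proc type Special.digest(p: Point): uint {
  return (31 * p.x + p.y): uint;
}

// Point remains free to use the same name for something unrelated.
proc Point.digest(): string {
  return "not a hash";
}

proc main() {
  const p = new Point(1, 2);
  writeln(Special.digest(p));  // prints 33
  writeln(p.digest());         // prints: not a hash
}
```

In this single-module sketch everything is in scope automatically; the visibility concern raised in the last bullet only shows up once the type, the namespace record, and the caller live in different modules.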

An incomplete list of possible special characters:

- `@` doesn't occur anywhere else, but might be a bit noisy.
- `~` is bitwise negation.
- `::` is an option.
- `$` is an option, but was previously used for sync/atomic variables.
- `*`, but could mean any / all.
- `#`, but too commenty.
- ... More

Do regular (not-special) functions allow the special characters?

Probably shouldn't.

Comparison Table

Note that all of the contenders below -- except the current hash -- solve the issue of preventing code from breaking when new special methods are introduced.

In the following table, I give hash as an example method; however, this applies to any "special" methods that we would want to add, such as enterThis and exitThis, writeThis, etc.

Properties:

✅ -- "good outcome"
❌ -- "bad outcome"
🤔 -- "mixed outcome"

See the details block below for more information on what the marks in each column represent.

| Approach | Parsing Approach | Does not Reserve Identifiers | User Code Works As Is | Makes Special Method Clear | Precedent Language | Callable Directly? + How | Ease of Use | Notes |
|---|---|---|---|---|---|---|---|---|
| `hash` | Works Already | ❌ | ❌ | ❌ | C#, C++ | ✅ `x.hash()` | ✅ | Included for comparison |
| `~hash~`, `@hash@`, `--hash--`, `$hash$` | Either | ✅ | ✅ | ✅ | | ✅ `x.~hash~()` | ✅ | |
| `-hash-`, `<hash>`, `:hash:`, `hash`, `#hash#`, `!hash!` | Broad | ✅ | ❌ | ✅ | | ✅ `x-hash-()` | ✅ | |
| `chpl_hash` | Works Already | ❌ | 🤔[^3] | ✅ | | ✅ `x.chpl_hash()`, generic access via `hash(x)`. | ✅ | `chpl_` is typically internal Chapel. |
| `__hash` | Works Already | ❌ | ❌ | ✅ | PHP, C | ✅ `x.__hash()` | ✅ | |
| `__hash__` | Works Already | ❌ | ❌ | ✅ | Python | ✅ `x.__hash__()` | ✅ | |
| `operator hash` | Either | ✅ | ✅ | 🤔[^2] | C++ `operator bool` | ❌ accessed via `hash(x)` | 🤔 | Special methods aren't really operators. |
| interfaces | Language Feature | ✅ | ✅ | ✅ | Rust, Java, Swift, Haskell | 🤔[^1] `x.hash()`. `(x:Hashable).hash()` | ❌ | Requires proto-interfaces and the associated design |
| type methods | Works Already | ✅ | ✅ | ✅ | | ✅ `Chapel.hash(x)` | 🤔 | Might require special casing for class hierarchies |

Column Details

### Parsing Approach

Changes to the parser / lexer come in two flavors, which provide different answers to the following question: Should we specially reserve tokens for each function on a case-by-case basis, or just create a general rule?

* __Broader Lexing Approach__: Create an "identifier*"-rule, with no further need to reserve tokens or modify the parser.
  * Some approaches require this, like `x.hash()` => `x-hash-()`
  * Michael summarizes lexing approaches here: https://github.com/chapel-lang/chapel/issues/19050#issuecomment-1262245757
* __Incremental Lexing Approach__: Reserve `hash*` when needed, then `somethingElse*` when `somethingElse` is needed.
  * But then, stuff like `bla.somethingElse*(x+y)` will suddenly change behavior, unless...
  * You use different punctuation to make cases like the above unambiguous.

### Does not Reserve Identifier

Some have expressed concern that using approaches like Python's dunder methods, such as `__hash__`, removes valid identifiers that the users could previously use for their code. If there are other ways to mitigate the problem of adding special methods that _don't_ take away any options, this is better, because users have more freedom in picking what their procedures are called. Thus, this column has a green check mark (✅) if no previously-valid identifiers are taken away, and a red "x" (❌) if they are.

### User Code Works As Is

Some approaches presented here will require modifications to existing user code to make it work. For instance, `x-hash-(1+2)` could previously be the arithmetic expression `x - hash - (1+2)`, but under one of the approaches presented here would be interpreted as a method call. Thus, some approaches will require users to modify their code. This will only need to be done once, after which new special methods would be reservable without breakage. Approaches that do not break user code at all get a green check mark (✅), while those that require changes are marked with a red "x" (❌).

### Makes Special Method Clear

This column indicates whether a given approach makes it clear to the user that what they're invoking is a reserved / special method in Chapel, as opposed to just any other method. Approaches where the use of special methods is distinct from the use of "garden variety" methods are marked with a green check mark (✅), while those where calling a special method looks similar or identical to "usual" Chapel code are marked with a red "x" (❌).

### Precedent Language

This column describes other languages that have solved the problem of reserving methods with special meaning in the same way as a particular approach. For instance, the `__hash__` approach is associated with Python, since Python uses dunder methods for "special" functionality.

### Callable Directly

Another concern raised during design discussions is that of being able to call a special method directly. If the user writes a particular method on their data structure, it seems to make sense to make it possible for them to call that special method without jumping through any hoops. Approaches where the special method implementation can be called directly are marked with a green check mark (✅), while those where only the compiler can invoke the "special" method -- or where additional work is required to invoke it -- are marked with a red "x" (❌).
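To make the "User Code Works As Is" concern concrete, here is a small program that is valid Chapel today and would change meaning under the `-hash-` spelling. The variable names and values are chosen purely for illustration.

```chapel
proc main() {
  // 'hash' is just an ordinary user variable here.
  var x = 10, hash = 4;

  // Today this is plain arithmetic: (x - hash) - (1 + 2).
  // Under a '-hash-' special-method spelling, the same characters
  // would instead be read as a method call on x.
  var r = x-hash-(1+2);

  writeln(r);  // prints 3
}
```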

[^1]: Only in certain contexts like constrained generics; other times might require a cast.
[^2]: It would be accessed via a Chapel stdlib hash, so you'd know it's special; but it does look a lot like if user code provided a library function instead of a method.
[^3]: User code probably shouldn't be using the chpl_ prefix; however, some projects might (Arkouda?).

The subteam for this issue is @mppf, myself, @dlongnecke-cray, @benharsh, @bradcray and @e-kayrakli. @vasslitvinov will not be participating, but posted the following message with his opinion:

> We have had several interesting proposals like https://github.com/chapel-lang/chapel/issues/21431. My reaction to them is “do we really need all that complexity?” By contrast, using the chpl_ prefix, as in chpl_hash, while not fancy or breakthrough in language design, is easy to explain and use and gets the job done.

Here is a link to the PR for IO Serializers which describes the currently-stabilized interface: https://github.com/chapel-lang/chapel/pull/22437

The parts that users would implement on their types are:

// writing, as with writeThis
proc MyType.serialize(writer: fileWriter(?), ref serializer : writer.serializerType) throws

// For reading into existing values, as with readThis
proc MyType.deserialize(reader: fileReader(?), ref deserializer: reader.deserializerType) throws

// For reading in types
proc MyType.init(reader: fileReader(?), ref deserializer: reader.deserializerType) throws

We also compiler-generate these methods today whenever possible.

I think the most relevant methods are serialize and deserialize. These would both tend to focus on working with fields, so if we were to add private field support later, then the Chapel.hash and operator hash approaches would be more difficult to work with, and might result in users writing their own methods anyways.

On the namespacing issue, I think it's plausible that we would consider supporting multiple interfaces that both want methods named "serialize" and "deserialize". For example, IOSerializable and CommSerializable.

Lastly I'll add that these methods are meant to be invokable by implementors of Serializers and Deserializers.

@bradcray Because you wanted code comparing Python dundermethods and interfaces.

// Approach 1: Python style dunder-methods...

record rec {
  var x: int;
}

// Define '__hash__' for 'rec'.
proc rec.__hash__(): uint(64) { return hash(x); }

// User defines their own hash method!
proc rec.hash() { return 8; }

proc main() {
  var r = new rec();
  var hx1 = r.__hash__();   // Call directly.
  var hx2 = r.hash();       // This is the user 'hash' method, unrelated...
  assert(hx1 != hx2);
}

// Approach 2a: Interfaces...

record rec {
  var x: int;
}

// This interface lives in ChapelBase or an auto-module...
interface Hashable {
  proc hash(): uint(64);
}

// This implements lives in ChapelBase too...
int(64) implements Hashable {
  proc hash() { return hash_impl(self); }
}

// This implements for 'rec' lives in our source code.
rec implements Hashable {
  // Calls Hashable<int(64)>.hash()...
  proc hash() { return hash(x); }
}

// User defines their own hash method!
proc rec.hash() { return 8; }

proc main() {
  var r = new rec();

  // Compile-time reinterpret as hashable and call Hashable(rec).hash();
  var hx1 = r.Hashable.hash(); // Syntax TBD, see #21343

  // This calls the user's rec.hash() method.
  var hx2 = r.hash();

  assert(hx1 != hx2);
}

// Approach 2b: One option for interface "auto derivation".
// Consider the code in 2a, but adjust...

// This attribute indicates the compiler should automatically generate the
// Hashable interface. It will emit an error if it fails to do so, which
// would be if any field of 'rec' does not implement Hashable.
@autoderive("Hashable")
record rec {
  var x: int;       // The int types implement Hashable in ChapelBase.
  var y: real;      // Ditto...
}

// Default implementation of 'chpl_hashable' using generics + reflection.
proc chpl_hashable(const ref x) {
  use Reflection;
  var ret: uint(64) = 0;
  for name in fields(x) do ret = chpl_hash_combine(ret, field(x, name));
  return ret;
}

// COMPILER GENERATED!
rec implements Hashable {
  proc hash(): uint(64) { return chpl_hashable(self); }
};

// Approach 2c: Another option for interface "auto derivation".
// This is just like '2b', but here 'rec' automatically derives the
// 'Hashable' interface. The user can turn that off by attaching...

// Do not auto-generate the Hashable interface...
@noautoderive("Hashable")
record rec {
  var x: int;
}

// Or further, if the compiler detects a user-written...
rec implements Hashable { /** ... **/ }

// Then it will not attempt to auto-derive.

FWIW, I think Python dunder methods lend themselves very nicely towards transitioning to interfaces in the future. We could simply have the compiler do something like "if an interface Hashable is explicitly found for type T, then Hashable(T) will be used. Otherwise, if a dunder method named __hash__ with a valid signature is found for T, then that method will be used. Otherwise, the Hashable interface will be automatically derived for T, unless auto-derivation is explicitly turned off via use of @noautoderive."
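The lookup order described above can be written down as a small decision function. This is only a model of the proposed rule, not compiler code: the enum, the function name, and the three boolean inputs are all assumptions made up for this sketch.

```chapel
// Where the hash implementation for a type T would come from.
enum HashSource { explicitInterface, dunderMethod, autoDerived, unavailable }

// Models the proposed precedence:
//   1. an explicit 'implements Hashable' for T
//   2. a '__hash__' method with a valid signature
//   3. auto-derivation, unless @noautoderive("Hashable") was used
proc chooseHashSource(hasImplements: bool, hasDunder: bool,
                      optedOut: bool): HashSource {
  if hasImplements then return HashSource.explicitInterface;
  if hasDunder then return HashSource.dunderMethod;
  if !optedOut then return HashSource.autoDerived;
  return HashSource.unavailable;
}

proc main() {
  writeln(chooseHashSource(true,  true,  false));  // explicitInterface
  writeln(chooseHashSource(false, true,  false));  // dunderMethod
  writeln(chooseHashSource(false, false, false));  // autoDerived
  writeln(chooseHashSource(false, false, true));   // unavailable
}
```

Note that under this ordering an explicit interface implementation always wins over a dunder method on the same type, which is what would let existing dunder-based code keep working while types migrate.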

@dlongnecke-cray - I don't think the proc hash in ChapelBase is necessary for choosing between __hash__ and the interface idea; indeed either could work without it. I suspect some of us would prefer it not to exist (but I don't personally have a strong opinion on this point). Point is, I think it's something we can make an independent choice on, so it might be worth updating your comment to indicate it is optional in these proposals.

Here is a version of 1 and 2a with slightly different editorial choices, to be even more boiled down to the difference between the two:

Approach 1: Python style dunder-methods

// this is what the record / type author would write

record rec { }

// Define '__hash__' for 'rec'.
proc rec.__hash__(): uint(64) { ... }

// the standard library hashtable would call hash functions like this:
private proc hashSomeKey(x) {
  return x.__hash__();
}

Approach 2a: Interfaces

// this is what the record / type author would write

record rec { }

// indicating that 'rec' is Hashable and implementing the relevant 'hash':
rec implements Hashable {
  proc hash() { ... }
}

// the standard library defines Hashable somewhere
interface Hashable {
  proc hash(): uint(64);
}

// the standard library hashtable would call hash functions like this:
private proc hashSomeKey(x) {
  return (x:Hashable).hash(); // exact syntax TBD; see issue #21343
}

I think it's both an advantage and a disadvantage of the Interfaces approach that implementing the method requires one to use two names: Hashable and hash:

It's an advantage because it allows us to separately manage the name Hashable (which is why it helps with the special method issue but it's also nice because you can rename Hashable on a use if needed etc.) and it enables better checking (it's just as easy to implement two methods for an interface & the compiler knows to check for both).
It's a disadvantage because it requires one to remember the name Hashable as well as the name hash.

I find the "__hash__ would be shorthand for implementing an interface" idea intriguing. It led me to thinking of another thing.

Anyway, you know how we have this.super.init() or even this.super.someMethod() ? Well the .super is not really a field as much as it is a way to reinterpret a class. What if we had the same mechanism available to reinterpret something as an interface it implements? This has been proposed before on #21343.
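For reference, this is the existing `super` mechanism the analogy leans on, in a minimal sketch with made-up class names. It shows `super` acting as a way to view the same object as its parent class for the duration of one call.

```chapel
class Base {
  proc describe(): string {
    return "Base";
  }
}

class Derived : Base {
  override proc describe(): string {
    // 'super' is not a field; it reinterprets 'this' as a Base
    // so that Base's version of the method is selected.
    return "Derived, extending " + super.describe();
  }
}

proc main() {
  var d = new Derived();
  writeln(d.describe());  // prints: Derived, extending Base
}
```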

There are two key points to this comment:

Such a functionality could serve as a namespacing strategy for special methods even before we have interfaces built (just like __hash__ could).
This idea could extend to a way to declare such methods.

Following along with the boiled-down examples from my previous comment, here is what it would look like.

Approach 2k: interface-y rec.Hashable

// this is what the record / type author would write

record rec { }

// indicating that 'rec' is Hashable and implementing the relevant 'hash':
proc rec.Hashable.hash() { ... }

// Note that the compiler could check that such a 'hash' function
// meets the required signature whether or not we do that with
// the prototype constrained generics logic.

// the standard library defines Hashable somewhere
// For this proposal, the main point here is that the standard library defines
// the name Hashable. It could be handled directly in the compiler at first.
interface Hashable {
  // Open Question: do we need it to have 'proc hash' at all at first?
  // proc hash(): uint(64);
}

// the standard library hashtable would call hash functions like this:
private proc hashSomeKey(x) {
  return x.Hashable.hash();
}

In terms of implementation, the compiler can just think of Hashable.hash as a method name. It would translate it to something else, say, Hashable_hash, by the time we get to C/LLVM IR. Likewise the call x.Hashable.hash() would be translated (say, to x.Hashable_hash()).
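As a rough picture of that translation, the following is ordinary Chapel that mimics what the lowered form might look like. The `Hashable_hash` spelling is the mangling suggested above; the method body and the `main` driver are assumptions for illustration.

```chapel
record rec {
  var x: int;
}

// What 'proc rec.Hashable.hash()' might be lowered to.
proc rec.Hashable_hash(): uint {
  return x: uint;
}

// What a library call 'x.Hashable.hash()' might be lowered to.
proc hashSomeKey(x): uint {
  return x.Hashable_hash();
}

proc main() {
  writeln(hashSomeKey(new rec(7)));  // prints 7
}
```

Because the lowered name contains the interface name, it cannot collide with any method a user spells simply as hash.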

If we allowed rec.Hashable.hash, that would be committing to the ability to implement an interface for a type one method at a time, right?

Rather than say interface Hashable can be empty, I think it would just be better to have the interfaces be hidden in the compiler rather than written out in module code. I think we still have to have a notion of proc hash(): uint(64) stored somewhere so that the compiler can check against the signature of ImplementingType.hash. We already do something similar for special methods today.

> If we allowed rec.Hashable.hash, that would be committing to the ability to implement an interface for a type one method at a time, right?

I would expect that if you try to implement any of those (proc myType.Hashable.anything()), and the interface has multiple method requirements, then the compiler would check that all of the required methods are implemented by your type.
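A user-level approximation of such an "all required methods present" check can be sketched with the Reflection module's `canResolveMethod`. The method names `enter` and `exit`, the two records, and the helper function are hypothetical; a real implementation would live in the compiler and would also compare signatures.

```chapel
use Reflection;

// Hypothetical requirement set for a ContextManager-style interface:
// both 'enter' and 'exit' must be callable with no arguments.
proc hasAllContextMethods(x) param : bool {
  return canResolveMethod(x, "enter") && canResolveMethod(x, "exit");
}

record Complete {
  proc enter() { }
  proc exit() { }
}

record Partial {
  proc enter() { }   // 'exit' is missing
}

proc main() {
  var c: Complete;
  var p: Partial;
  writeln(hasAllContextMethods(c));  // prints true
  writeln(hasAllContextMethods(p));  // prints false
}
```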

The only things that this would require us to commit to are:

The names of our interfaces for these groups of special methods: Hashable (hash), ContextManager (enter, exit), Serializable ((de)serializers)...
That a user can implement an interface for a type one method at a time, by writing e.g., proc Foo.Hashable.hash (this is not to say that we could not add the block syntax later). For convenience we can restrict things so that all the interface methods have to be in the same scope.
That a user can call an interface method using x.Hashable.hash (or some other syntax in #21343).

To implement this idea, we'd have to:

Have the compiler store our proto interfaces and their method(s) in some fashion.
Adjust the parser to parse dot expressions or a double dot expression for function names proc Foo.Hashable.hash
Teach the production compiler to interpret proc Foo.Hashable.hash as belonging to a "proto-interface" and engage the default method signature checking rules
Teach the production compiler to resolve x.Hashable.hash() to Foo.Hashable.hash.

Is there anything I'm missing? Because implementation-wise, this seems like an actually achievable lift. We don't even have to commit syntax for interface declarations if we just wave our hands and have all these proto interfaces stored in the compiler for now.

Following up to https://github.com/chapel-lang/chapel/issues/22618#issuecomment-1610080643 and @bradcray's request for an example comparing dunder vs interface approaches for the I/O methods.

Here is a comparison.

dunder

// this is what the record / type author would write

record rec { ... }

proc rec.__serialize__(writer: fileWriter(?), ref serializer : writer.serializerType) throws { ... }

proc rec.__deserialize__(reader: fileReader(?), ref deserializer: reader.deserializerType) throws { ... }

// I'm not so sure about this one...
proc rec.__init__(reader: fileReader(?), ref deserializer: reader.deserializerType) throws { ... }

// this is in the standard library somewhere
...
someRecord.__serialize__(writer, serializer);
...
someRecord.__deserialize__(reader, deserializer);
...
// I'm not so sure about this one...
var x = new someRecordType.__init__(reader=reader, deserializer=deserializer);

interfaces

// this is what the record / type author would write

record rec { ... }

rec implements Serializable {
  proc rec.serialize(writer: fileWriter(?), ref serializer : writer.serializerType) throws { ... }
}
rec implements Deserializable {
  proc rec.deserialize(reader: fileReader(?), ref deserializer: reader.deserializerType) throws { ... }
}
rec implements DeserializeInitializable {
  proc rec.init(reader: fileReader(?), ref deserializer: reader.deserializerType) throws { ... }
}

Open questions:

Could we combine Deserializable and DeserializeInitializable? Are there better names for these two?
Is the repeated mention of rec in the above necessary? Perhaps we will want a shorter way to write this. Or use Self or something. Nonetheless I'm fairly confident we could stabilize the form above (or something like it), at least for these interfaces.

In the near term, the standard library would call these like this:

// this is in the standard library somewhere, in the near term
// (in the long term, these would be unnecessary, because they
//  can be invoked from constrained generic functions in the natural way)
...
(someRecord:Serializable).serialize(writer, serializer); // see #21343 for options here
...
(someRecord:Deserializable).deserialize(reader, deserializer); // see #21343 for options here
...
// I'm not so sure about this one...
var x = new (someRecordType:DeserializeInitializable)(reader=reader, deserializer=deserializer);

In the long term, it would use constrained generics to do it, which would look like this:

proc doSerialize(arg: Serializable, writer: fileWriter(?), ref serializer : writer.serializerType) throws {
  arg.serialize(writer, serializer);
}
// or with this alternative way of writing a constrained generic:
proc doSerialize(arg, writer: fileWriter(?), ref serializer : writer.serializerType) throws where arg implements Serializable {
  arg.serialize(writer, serializer);
}

The others are similar with constrained generics:

proc doDeserialize(arg: Deserializable, reader: fileReader(?), ref deserializer: reader.deserializerType) throws {
  arg.deserialize(reader, deserializer);
}
proc doDeserializeInitialize(type t: DeserializeInitializable, reader: fileReader(?), ref deserializer: reader.deserializerType) throws {
  return new t(reader=reader, deserializer=deserializer);
}

Conclusion

It is interesting to note that invoking the special initializer isn't smooth sailing with either proposal. But, at least with the interfaces approach, the strategy of writing a constrained generic function to call that initializer will work smoothly once we are ready to lean on constrained generics.

Also, I think we might be able to treat invoking these special methods as completely unstable for now:

I/O related ones only need to be invoked by BinarySerializer / the I/O module etc. At present we maintain all of these and for users to write custom serializers there are a bunch of other elements we will need to stabilize.
Context manager ones can be invoked by a manage bla { } block
I think we can get away with not providing a stable user-facing way to use hash yet (since the main purpose of it is to support the builtin hashtables in Map and associative domains)

Nonetheless I think it's important to keep in mind how they might be invoked in terms of long-term design direction.

In terms of fundamental differences between the two (ignoring naming and syntactical choices that can vary within each proposal), I think there are two things:

The interface approach uses 2 names (Hashable and hash) where the dunder approach uses one (__hash__)
The idea that we can have multiple methods with the same name implementing different interfaces does not seem to exist in many languages with interfaces/constrained generics. It exists in Rust, but not Swift, for example [1] [2]. So, if we were uncertain if it is reasonable for Chapel to have that feature, we might want to avoid using interfaces here, because it assumes we have that feature in order to solve the main problem.

In an off-issue discussion, we are thinking:

We can leave the way to invoke the special method unstable for now & we should focus on how it can be defined.
We can leave the issue of indicating a method call from a particular interface for later, once we want to add a new special method. (But using interfaces to declare them will help keep things consistent in the future). Edit: It is important to note that we can add this as a non-breaking change as long as we generate a compilation error any time such a duplicate definition appears.

That leads me to ask: can we arrive at the simplest way, and the one most likely to be satisfying in the long term, to write that a particular type implements an interface?

I can think of two candidates, both based upon https://github.com/chapel-lang/chapel/blob/main/doc/rst/developer/chips/2.rst#implements-statements:

implements Option A

A. In the near term, I think it would be acceptable to require the interface implemented be described at the record declaration:

record rec implements Hashable {
  proc hash(): uint(64) { ... }
}

However, this form does not currently parse.

implements Option B

B. Use a separate implements statement:

record rec {
  proc hash(): uint(64) { ... }
}
rec implements Hashable;

This has the advantage of being implemented today.

other notes

For the methods that are compiler-generated by default (hash, serialize, deserialize, and the deserialize initializer), we will need a way to opt out of generating these. For that, I propose we have an empty interface such as Unhashable: record rec implements Unhashable would mean that the record should not get a compiler-generated hash function. We can also insist (for now) that if a proc hash is present, then implements Hashable is also present.

Note that the details of how to hash the type must be available looking only at the module defining the type (i.e., we can't have a tertiary method proc hash; otherwise bad things can happen). IMO requiring it at the type declaration point makes sense in the near term.

I would propose that we use attributes to indicate that a type should not generate a built-in interface. I have been calling the attribute @noautoderive. I do not think this would be a big lift, as Ahmad has already done a ton of work for attributes. Also it avoids us having to commit to "negative interface" names. To not auto-generate Hashable you would just write @noautoderive("Hashable").

implements Option A

[Edit: From reading the meeting minutes I can see that Michael has already emphasized everything I'm about to say as being important to the namespacing, but what's not clear to me is why we've decided to abandon that aspect of the proposal.]

I am worried that this approach does not solve conflicts in the event that a user wants to write both Hashable.hash and their own hash:

record rec implements Hashable {
  proc hash(): uint(64);
  proc hash(): uint(64); // I can't have my own hash I'm using elsewhere?!
}

It seems like the user's choice is to opt in and lose use of the name hash for other purposes, or opt out and not get Hashable.

Why not require:

record rec implements Hashable {
  proc Hashable.hash(): uint(64) { return 0; }
  proc hash(): uint(64) { return 8; }
}

Instead? This would avoid any name conflict. We don't even have to change anything in the parser to be able to write Hashable.hash() as a primary method.

As an argument that the semantics are roughly consistent with implements blocks, if we restrict things so that proc Hashable.hash() can only be defined in the record's primary scope, then how is the above any different than:

record rec implements Hashable { ... }

rec implements Hashable {
  proc hash(): uint(64) { return 0; }
}

In terms of functionality?

implements Option B

record rec {}
rec implements Hashable;

I do not feel like this syntax is appropriate to use. I'm pretty sure it was added with the intention of auto-implementing interfaces by examining surrounding primary/secondary methods. While that might be a nice feature to explore in the future, I don't think it solves the namespacing issue.

> what's not clear to me is why we've decided to abandon that aspect of the proposal

We have not; please reach out to me off-issue.

> Also, I think we might be able to treat invoking these special methods as completely unstable for now:

I think if we go down this path it means that there won't be a stable way to implement Serializers and Deserializers. Just wanted to point that out in case it wasn't clear.

Users can still use them in a stable way, and implement the relevant serialize/deserialize methods, but adding new formats wouldn't be stable.

> I think if we go down this path it means that there won't be a stable way to implement Serializers and Deserializers. Just wanted to point that out in case it wasn't clear.

David pointed this out in the Slack thread, but I will type out the sentiment here: in the near term, the users of the Serializer / Deserializer type will not be affected if we don't have a specific way to call a method from an interface. The reason for this is that the special interface call syntax is only necessary if disambiguation is needed between rec.writeThis and rec.Serializable.writeThis; however, since writeThis has previously been considered a special method (and since we're not introducing a way to define a separate interface method of the same name as a regular method), users won't have code in which the ambiguity is possible. Therefore, they'd be able to invoke the Serializer/Deserializer methods on the type directly; the implements Serializable etc. would only serve to allow the standard library to treat the methods specially.

Here's @dlongnecke-cray's message verbatim, in case I paraphrased incorrectly.

I think the current proposal should handle that because you would just invoke the special methods as you would any other method.

There’s two poles: on the left you have the “interfaces are auto-fulfilled by looking at primary/secondary methods”, which is great for convenience but doesn’t give us the namespace shielding we need. On the right you have “interfaces are explicitly fulfilled within a namespace somewhere” (e.g., an implements block or proc Hashable.hash()). We will need the latter to have namespace shielding. However we’re not ready to make the jump by RC-1.

So in this release we won’t have a way to explicitly invoke an interface method, but we also don’t need it until we add the ability to explicitly implement an interface. That’s because interfaces for now are just “auto-fulfilled/auto-implemented” by a user’s primary methods. We’re effectively grandfathering in the special methods, but only for a single release candidate. By RC-2 we’ll have had more than enough time to deliberate on what syntax/semantics we need to get us the namespace shielding we need, and when we add explicit fulfillment we’ll also add the explicit invocation at the same time.

We also avoid the possibility of having collisions by requiring that you must implement the interface if you have a matching method signature (just for RC-1).

On Thursday, we are going to continue our discussion of the best path forward with respect to special method naming -- including discussing the proposed interface-based approach. If we're all in agreement as to the approach, there are still a few open questions about how to proceed.

One such question concerns the names of the interfaces, as well as their methods. To help with such decisions, below are tables consisting of the names of interfaces similar to ours in other languages.

Notably, do we still want enterThis and exitThis to include the word "this"? This may have been a strategy for dealing with special method naming, which would be made obsolete by any decision coming out of this subteam. Thus -- what do we think of enter and exit? Something else?

Hashing

Proposed interface name: Hashable

Language	Hashing Method	Hashing Interface
Python	`__hash__`	`Hashable`
Rust	`hash`	`Hash`
Swift	`hash`	`Hashable`
Java	N/A	N/A
C#	N/A	N/A

Context Managers

Proposed interface name: ContextManager

Language	Enter Method	Exit Method	Context Manager Interface
Python	`__enter__`	`__exit__`	`ContextManager`
Rust	N/A	N/A	N/A
Swift	N/A	N/A	N/A
Java	N/A	`close`	`AutoCloseable`
C#	N/A	`Dispose`	`IDisposable`

Note: the closest things to context managers in C# and Java are `using` statements and "try-with-resources", respectively, which I document in this table.

Serialization

Note that unlike the other languages used for reference, Chapel actually has need for two deserialization interfaces: one for deserializing into an existing object (e.g. created via default initialization), and one for deserializing into a new object (for cases where the object cannot be default-initialized, for example).

Proposed interface names: Serializable, Deserializable, DeserializeInitializable.

Language	Serialize Method	Deserialize Method	Serialization Interface	Deserialization Interface
Python	N/A	N/A	N/A	N/A
Rust	`serialize`	`deserialize`	`Serialize`	`Deserialize`
Swift	`encode`	`init(Decoder)`	`Encodable` 🐟	`Decodable` 🐟
Java	`writeObject`	`readObject`	`Serializable`	`Serializable` (combined)
C#	N/A	N/A	`ISerializable`	`ISerializable` (combined)

Another question we might want to discuss within this group (even if we don't officially come to a decision on it on behalf of the Chapel team at large), is the syntax we'd want for marking that a type implements an interface. This aspect is crucial to the efficacy of the interface-based approach: we want users to explicitly opt-in to the methods' specialness, so that user methods that happen to be named after a special method don't end up being used by the language.

For the time being, I propose that we do not consider how one can define methods in an interface's namespace only (addressed, for instance, by Michael's suggestion here). Thus, let us only consider ways of marking a type as one that opts in to special methods.

We have three major candidates so far:

Approach 1a: `record implements Interface`

This one mirrors Java's approach of using the implements keyword when defining the type.

record rec implements Hashable {
  proc hash() {
    // ...
  }
}

Approach 1b: `record : Interface {`

This one mirrors C# and Swift's approaches of using :, and also seems in the spirit of what we currently do with class inheritance. One downside might be that implementing an interface and extending a class are quite different, and we don't want to create confusion; we might also want to work out how this approach would work for a class that inherits from a parent and implements an interface.

record rec : Hashable {
  proc hash() {
    // ...
  }
}

Approach 2: `record implements Interface;`

This approach uses the existing standalone implements statements that I believe are part of the interfaces design right now. No other language I've found has a similar feature; we would be doing something new.

record rec {
  proc hash() {
    // ...
  }
}
rec implements Hashable;

While this is probably a bigger discussion than the ad hoc team set out to have, among these, record rec: Hashable is the most appealing to me.

One downside might be that implementing an interface and extending a class are quite different

I can probably be on the other side of this argument. I think they are similar enough that the language should look similar when extending/implementing. So, I see similarity as a plus here. An obvious note, but this is also symmetrical to proc foo(x: Hashable)

If we want to avoid using the same syntax as extending, one alternative I can think of is preceding class/record with the interfaces it implements:

Hashable class MyClass: BaseClass {

}

reads nicely. It may be cluttered if there are too many interfaces that MyClass implements. But probably it can be stylistically addressed like:

Hashable, Serializable, Palatable, Personable
class MyClass: BaseClass {

}

I still find putting everything after : to be the better alternative, though.

Proposed interface names: Serializable, Deserializable, DeserializeInitializable

Sheepish to ask, but: the implication here is that we want to support a type having serialize but no deserialize, is that right? My reflex is certainly based on "serialization" as a data movement concept, but I am a little afraid of seeing this triple way too often together for us to wish for a combined interface.

Another question we might want to discuss within this group (even if we don't officially come to a decision on it on behalf of the Chapel team at large), is the syntax we'd want for marking that a type implements an interface.

Note that I've created issue #22652 specifically to focus on this question.

we might also want to work out how this approach would work for a class that inherits from a parent and implements an interface.

That is a concern for both of the record-declaration forms and I showed an example and proposal for each in #22652.

Proposed interface names: Serializable, Deserializable, DeserializeInitializable

Sheepish to ask, but: the implication here is that we want to support a type having serialize but no deserialize, is that right? My reflex is certainly based on "serialization" as a data movement concept, but I am a little afraid of seeing this triple way too often together for us to wish for a combined interface.

Yes, but we can also add interfaces for the combination of these. For example, Swift has Codable (🐟 !) to mean the combination of Encodable and Decodable.

@benharsh and I did some brainstorming on interface names here and we liked:

for the proc init, InitDeserializable
for the proc deserialize, UpdateDeserializable / RefDeserializable / MutableDeserializable / MutatingDeserializable
Deserializable for the combination of InitDeserializable and UpdateDeserializable

I think it's interesting that Java combines both reading and writing into Serializable. I think it's actually somewhat common for Chapel types to be Serializable but not Deserializable. Of course we could seek to convey such things in a different way (such as throwing an error) but I'd expect we will be better off if we can have the module code react to implementing the Serializable / Deserializable interface (or not).

I think it's super interesting that Rust interfaces don't seem to use the Bla...able style. It would make the names a bit less of a mouthful if we followed Rust in this regard. (Edit: apparently Rust followed Haskell in this regard).

Regardless of the route we take, we want to continue allowing standalone rec implements Interface; declarations.

For example, when using a library type that is not hashable as-is, I may have a need to hash it and a way to define the hash method for it. We want to allow this using tertiary methods and tertiary implements declarations.
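As a rough sketch of that scenario, assuming a hypothetical library module `PointLib` and the interface name `Hashable` proposed in this thread (a released compiler may spell the interface or the statement differently), a client could add hashing from outside the defining module like this:

```chapel
// Hypothetical library module whose author did not opt Point in to hashing.
module PointLib {
  record Point {
    var x, y: int;
  }
}

// Client module: adds hashing without touching PointLib's source.
module Client {
  use PointLib;

  // Tertiary method: declared outside the module that defines Point.
  proc Point.hash(): uint {
    // Illustrative mixing only; not a good hash function.
    return (x: uint) * 31 + (y: uint);
  }

  // Tertiary implements declaration, using the standalone statement form
  // (Approach 2 above). Hashable is the name proposed in this thread.
  Point implements Hashable;
}
```

Neither of the record-declaration forms can express this, since the client cannot edit the declaration of `Point`; that is the case the standalone statement is meant to cover.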

The design subteam has reached consensus on large portions of this topic, though we explicitly leave some things for further discussion.

The main decision is that we will use interfaces to reserve "special" methods. Concretely:

hash will become a part of a Hashable interface.
serialize, deserialize, and init-deserialization will be distributed among four interfaces, whose names are not yet decided:
- A WriteSerializable interface for serialize
- A ReadSerializable interface for deserialize
- An InitSerializable interface for init
- A Serializable interface that subsumes the above three for convenience.
- We leave the names of the interfaces for discussion in a subsequent IO subteam. (possibly driven by @benharsh)
enter and leave (roughly) will become part of a ContextManager interface.
- We agreed that -This should be removed from enterThis and leaveThis because that naming scheme was used to solve the special naming issue as well. However, there is an open question about whether it should be leave or exit.
Note that the above choices set the precedent that interface names may end in -able but are not required to.

Users will need to opt in to the specially-named methods being treated in a special way by having their type implement the respective interface (e.g. a record would need to explicitly implement Hashable for its hash method to be automatically used by the standard library).

We will allow the compiler to automatically generate implementations of certain methods and implement the related interfaces, so that things like writeln-by-default for user-defined types continue to work. (The compiler would continue to automatically generate implementations for hash, serialize, deserialize, the deprecated deserialize init, and also readThis and writeThis. Note that readThis and writeThis are expected to be deprecated but aren't yet).

Transitionally, in 1.31 (and perhaps a few releases after that), the compiler will emit a warning for code defining a specially-named method that doesn't implement the respective interface.
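To make the opt-in rule and the transitional warning concrete, here is a minimal sketch. It assumes the `record rec : Hashable` declaration form, which is only the syntax we are leaning towards, and the record names are hypothetical:

```chapel
// Opts in: hash() is the special method the standard library may call.
record optedIn : Hashable {
  var id: int;
  proc hash(): uint {
    return id: uint;
  }
}

// Does not opt in: hash() here is an ordinary user method.
// Under the transitional rule, the compiler would warn about this
// definition because the record does not implement Hashable.
record notOptedIn {
  var name: string;
  proc hash(): string {
    return "#" + name;
  }
}
```

How the author of `notOptedIn` would silence that warning for an intentionally unrelated `hash` is one of the questions still open.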

Things we didn't decide, but need to for 2.0

We did not make all the necessary related decisions. Some of these are arguably out of scope here, and others we just ran out of time for. So, decisions still need to be made for:

Names for the four serialization interfaces
The proper names for context manager methods (though enter and leave would have inertia based on what we have right now).
How to turn off the "method without implementing interface" warning above for the case where a user wishes to define an unrelated method (e.g. an unrelated proc hash)
How to opt out of the compiler generating e.g. proc hash and implements Hashable for a custom type when no competing proc hash is present.
A syntax for implementing interfaces.

Although we didn't decide, in our discussions, we are tending towards the following names / syntaxes:

For "record implements interface",
```
record rec : Hashable { ... }
```
For serialization interface names:
- ReadSerializable, WriteSerializable, InitSerializable, and Serializable (votes from Daniel, David)
- ReadSerializable, WriteSerializable, InitSerializable, and IoSerializable (votes from Engin, Ben, but inconsistency concerns from Daniel)

Things we decided not to stabilize by 2.0

Because interfaces are a major implementation effort and language feature, our approach has been to minimize the aspects of interfaces that we want to stabilize for 2.0. Therefore, we will not be stabilizing:

The way to call an interface method on a type (beyond a "regular" method call when it's unambiguous). We believe this will only be used within the standard library, and thus will not need to be user-facing.
Whether or not interfaces can have their own "namespaces" to allow non-special methods to share a name with special ones. (e.g. a non-special proc hash as well as a Hashable proc hash on the same type).
Many other aspects of interfaces/constrained generics that are implemented in the prototype (we will only stabilize one way of declaring that a type implements an interface):
- Additional ways to say a type implements an interface
- Interface-constrained-generic routines (e.g. proc foo(arg) where arg implements Fooable)
- Interface declarations themselves

Next steps

Settle the open questions from "Things we didn't decide, but need to for 2.0".
Implement the proposal.

Closing this because the discussion itself has been settled.

chapel-lang / chapel

Demarcating "special methods" like `readThis`, `enterThis`, and `hash` #22618
