chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Proposal: initialization expressions for associated types in interfaces #23883

Open DanilaFe opened 11 months ago

DanilaFe commented 11 months ago

Background

Currently, specifying associated types can be cumbersome, even though in some cases they could be inferred. For instance, in the world of context managers:

interface contextManager {
  /* ... */

  /* roughly the following code: */
  type contextReturnType;
  proc Self.enterContext() ref : contextReturnType;

  /* ... */
}

If this were a normal user-facing interface (i.e., if compiler magic were not involved), to implement this interface, the user would need to explicitly provide a contextReturnType:

record myContextManager : contextManager {
  proc contextReturnType type do return int;
  proc enterContext() do return 42;
}

This is awkward, particularly since the return type (int) can easily be inferred from enterContext(). In the compiler, we provide special-case behavior for contextReturnType: the compiler attempts to resolve enterContext() and determine its return type. This is fine for the purposes of 2.0, but it does nothing for interface ergonomics in general -- it seems plausible that user-defined interfaces will also have varied return types or, more generally, associated types that can be inferred (or attempted to be inferred) from other type methods.
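To make the "can easily be inferred" point concrete, here is a minimal sketch in today's Chapel, with no interface involved. The record mirrors the example above; the names manager and inferred are made up for illustration. It only shows that querying .type on the call expression recovers the return type the user would otherwise have to spell out by hand. It says nothing about how the compiler's special case is actually implemented.

```chapel
// A plain record, deliberately not tied to any interface.
record myContextManager {
  proc enterContext() do return 42;
}

// Hypothetical variable name, used only to have a value to call the method on.
var manager: myContextManager;

// The type of the call expression is the inferred return type of enterContext().
type inferred = manager.enterContext().type;

// Print the type's name to see what was inferred.
writeln(inferred:string);
```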

Proposal: type initialization expressions as defaults

What if you could write the following:

interface contextManager {
  /* ... */

  /* roughly the following code: */
  type contextReturnType = self.enterContext().type;
  proc Self.enterContext() ref : contextReturnType;

  /* ... */
}

Here, I used self as a stand-in for a value of type Self. This isn't strictly necessary: we can easily create a function:

proc valueOfType(type t): t throws {
  // Not meant to be called for its value; only its result type is of interest.
  throw new Error("You wish haha");
}

And then define self to be valueOfType(Self).

Either way, the idea here is to be able to write generic expressions that, when the compiler attempts to implement an interface, can be resolved to figure out a "default" type. These need not resolve, but failure to resolve them would mean the user would have to manually specify a type as shown originally.

proc worksOnlySometimes(type t) type {
  if t == R1 then return int;
  else compilerError("Simulating a type in which the initializer doesn't resolve!");
}

interface myInterface {
  type assocType = worksOnlySometimes(Self);
}

// Works: worksOnlySometimes(R1) resolves to int, so the interface can be satisfied with no extra work
record R1 : myInterface {}

// Doesn't work: worksOnlySometimes doesn't resolve for R2, so the user needs to explicitly specify a type.
record R2 : myInterface {}

// Works: user explicitly provides assocType.
record R2 : myInterface {
  proc assocType type do return int;
}

I believe this meshes well with our understanding of type fields in records (they have a default that can be overridden) and of default functions for interfaces (an interface can declare a method / proc to be used if the implementing type doesn't implement it). It also seems relatively straightforward to implement.

lydia-duncan commented 11 months ago

This proposal certainly feels powerful and like it would make users' lives easier.

I am a little worried it will complicate the implementation and make it more complicated for users to understand, but I wouldn't necessarily stand in the way of moving forward with the proposal for that reason (just making sure I voice the concerns in case they resonate or make other concerns more clear).

interface contextManager {
  /* ... */

  /* roughly the following code: */
  type contextReturnType = self.enterContext().type;
  proc Self.enterContext() ref : contextReturnType;

  /* ... */
}

This example feels like a circular dependency to me, even though the circularity should be broken by the user providing either the type or the function as part of implementing the interface. I would be worried about the compiler's behavior in the case where neither has been defined.

proc worksOnlySometimes(type t) type {
  if t == R1 then return int;
  else compilerError("Simulating a type in which the initializer doesn't resolve!");
}

interface myInterface {
  type assocType = worksOnlySometimes(Self);
}

// Works: worksOnlySometimes(R1) resolves to int, so the interface can be satisfied with no extra work
record R1 : myInterface {}

// Doesn't work: worksOnlySometimes doesn't resolve for R2, so the user needs to explicitly specify a type.
record R2 : myInterface {}

// Works: user explicitly provides assocType.
record R2 : myInterface {
  proc assocType type do return int;
}

Supporting this example will make it more difficult to fully know what it takes to implement an interface without access to its documentation. If the library where myInterface is defined is both not well documented and closed source, the implementer of R2 would have no possible way to write their code correctly the first time if they were trying to mimic R1's use. It's beneficial for the developer of R1 but at the cost of more difficulty for the developer of R2. Arguably that is the fault of the developer of myInterface, but we would have enabled it.

Again, neither of these concerns would cause me to block this proposal. They just give me a bit of pause, so I figured I should voice them.

DanilaFe commented 11 months ago

> This example feels like a circular dependency to me, even though the circularity should be broken by the user providing either the type or the function as part of implementing the interface. I would be worried about the compiler's behavior in the case where neither has been defined.

In my mind, this example isn't circular, but I suppose that highlights a confusing aspect of my proposal. On the right-hand side of =, I envision Self and self to have already been instantiated with whatever type is being used for an implements statement. Thus, if I were writing record R : contextManager, Self would be R and self would be a value of type R. Then, there's no ambiguity as to what self.enterContext() refers to, nor any circularity: we're not relying on Self.enterContext() (the method required by the interface), but on the actual enterContext method on R.
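As a rough illustration of that reading, here is a sketch in today's Chapel of what the initialization expression would evaluate to once Self is fixed to a concrete type. The interface syntax itself is only proposed, so none of it appears here. The records R and S and the names rDefault and sDefault are hypothetical. Each alias is computed from the concrete type's own enterContext method, which is why nothing circular is involved.

```chapel
// Two unrelated records whose enterContext methods return different types.
record R {
  proc enterContext() do return 42;
}

record S {
  proc enterContext() do return "hello";
}

var r: R;
var s: S;

// With Self = R, the proposed default would amount to this expression...
type rDefault = r.enterContext().type;

// ...and with Self = S, the same expression yields a different type.
type sDefault = s.enterContext().type;

writeln(rDefault:string);
writeln(sDefault:string);
```

If a type had no resolvable enterContext method, the corresponding alias would simply fail to resolve. Under the proposal, that is the case where the implementer would have to supply the associated type explicitly.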

> Supporting this example will make it more difficult to fully know what it takes to implement an interface without access to its documentation.

This is true, but in my opinion, the user shouldn't always have to know everything required to implement an interface. For type-theoretic reasons, we have to have an associated contextReturnType as part of the contextManager interface, but we don't want the user to care, because as long as they write an enterContext(), their contextManager implementation should be good.

I see your point that it allows users to write shoot-yourself-in-the-foot interfaces with odd properties, though. This is a valid concern that should be weighed when making decisions on this proposal.

My particular example with R1/R2 was not meant to be a demonstration of a "good" Chapel interface, just the simplest example in which the behavior I wanted to demonstrate would arise.

mppf commented 11 months ago

This proposal seems reasonable to me. My main concern is that it might create new and different situations in the compiler's handling of interfaces that are hard to implement. This is similar to Lydia's concern, but here I don't have any specific cases in mind. In other words, my opinion is: it sounds good enough to try, but we have to see if the implementation paints us into a corner on how the compiler can handle interfaces / constrained generics. Of course, it might be possible to predict this without actually implementing anything.