chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

Proposal: `&` and `|` interfaces #23879

Open DanilaFe opened 11 months ago

DanilaFe commented 11 months ago

Background

We have introduced interfaces in 1.32 as "the way" to mark a type as implementing certain special methods. However, interfaces as they are in Chapel today are not sufficient to properly support the language features that rely on them. There are two such cases:

Serializers / deserializers: For these, we have settled on a "parent interface", serializable, that is composed of three other interfaces, initDeserializable, readDeserializable, and writeSerializable. Currently, interfaces can't be "combined" into a bigger interface in this way.

Context managers: Context managers rely heavily on the return intents of the enterContext functions to decide which one (out of potentially several) should be called. For instance, consider the following code:

record R {
  var x = 42;

  proc enterContext() {
    writeln("In value-based method!");
    return x;
  }
  proc enterContext() ref {
    writeln("In ref-based method!");
    return x;
  }
  proc enterContext() const ref {
    writeln("In const-ref-based method!");
    return x;
  }

  proc exitContext(in err: owned Error?) {}
}

A different enterContext function will be called depending on the form of the manage statement.

var r: R;
manage r as const y {}     // "In value-based method!"
manage r as ref y {}       // "In ref-based method!"
manage r as const ref y {} // "In const-ref-based method!"

The user can define any of the 7 combinations of enterContext functions and still keep their type "suited" for a manage statement (though not necessarily any manage statement -- if a user provides only a const ref overload, they won't be able to use a manage as ref statement).

Defining an interface for this is not currently possible, because interfaces expect a consistent list of functions that each type needs to implement. We would only be able to require a particular combination of overloads (one of the seven possible). However, since each of the functions does have a fixed signature, it seems like what we really want is to define a separate interface for each:

interface valContextManager {
  proc Self.enterContext();
  proc Self.exitContext(in err: owned Error?);
}

interface refContextManager {
  proc Self.enterContext() ref;
  proc Self.exitContext(in err: owned Error?);
}

interface constRefContextManager {
  proc Self.enterContext() const ref;
  proc Self.exitContext(in err: owned Error?);
}

Proposed Solution: & and | interfaces

'&' interfaces

My proposed solution flows naturally out of the need to define serializable as an interface that combines the three possible sub-interfaces into one. One might intuitively attempt to define such an interface as follows:

// NOT what I'm proposing
interface serializable {
    Self implements readDeserializable;
    Self implements initDeserializable;
    Self implements writeSerializable;
}

The above definition of serializable works for method and function signatures: if a type implements serializable, our constraint generic functions already do the work to expose the functionality provided by nested implements statements. However, we run into trouble trying to implement the interface.

// Doesn't work -- we don't implicitly implement interfaces for types.
record myType : serializable { ... }

// Today, we'd need the following:
record myType : serializable, initDeserializable, readDeserializable, writeSerializable { ... }

We could work around this with some compiler assistance, of course, and achieve our results. But then, the solution is a bit unsatisfying: we have a fairly verbose interface definition for serializable, and we have compiler magic to support it, meaning users are not able to implement such "combination interfaces" themselves. I propose extending the language with a '&' interface that serves a similar purpose:

interface serializable = readDeserializable & initDeserializable & writeSerializable;

Notionally, I'd expect writing serializable to be "just like" simply writing the three interfaces that make it up. Thus, the following two lines would be (notionally) equivalent.

record myType : serializable { ... }
record myType : initDeserializable, readDeserializable, writeSerializable { ... }

This has the following advantages:

'|' interfaces

A curious thing to do when faced with something that looks algebraic, like our &, is to consider its dual. The dual of & is |, and it's exactly this sort of interface that will help with the context manager problem. Instead of requiring the user to implement every interface in the list, it requires them to implement at least one:

interface contextManager = valContextManager | refContextManager | constRefContextManager;

// All of the following are valid.

record A : contextManager {
  proc enterContext() { ... }
  proc exitContext(in err: owned Error?) { ... }
}

record B : contextManager {
  proc enterContext() ref { ... }
  proc exitContext(in err: owned Error?) { ... }
}

record C : contextManager  {
  proc enterContext() const ref { ... }
  proc exitContext(in err: owned Error?) { ... }
}

record D : contextManager  {
  proc enterContext() { ... }
  proc enterContext() ref { ... }
  proc enterContext() const ref { ... }
  proc exitContext(in err: owned Error?) { ... }
}

Note the uses of & and | instead of && and || for these interfaces: this is intentional. Though it was less important in the & case, the semantics of both of the new interface types I'm suggesting attempt to satisfy / implement every constraint in their definition: there is no short-circuiting. Thus, above, record D will implement valContextManager, refContextManager, and constRefContextManager. This seems like the most reasonable approach, since the type clearly meets the constraints of all three interfaces, and thus should be usable in constraint generic functions that accept only constRefContextManager or refContextManager types.
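The "no short-circuiting" rule can be modeled outside of Chapel. The following is a minimal Python sketch, not compiler code: it assumes each sub-interface is identified solely by the return intent of its enterContext overload, and the helper names (implemented_interfaces, implements_context_manager) are hypothetical.

```python
# Hypothetical model: map each proposed sub-interface to the return intent
# that its enterContext overload must have.
SUB_INTERFACES = {
    "valContextManager": "value",
    "refContextManager": "ref",
    "constRefContextManager": "const ref",
}


def implemented_interfaces(enter_context_intents, has_exit_context=True):
    """Return every sub-interface the type satisfies.

    All alternatives are checked; the search does not stop at the first match.
    """
    if not has_exit_context:
        return set()
    return {
        name
        for name, intent in SUB_INTERFACES.items()
        if intent in enter_context_intents
    }


def implements_context_manager(enter_context_intents, has_exit_context=True):
    """The disjunction holds when at least one sub-interface is satisfied."""
    return bool(implemented_interfaces(enter_context_intents, has_exit_context))


# A type like record D (all three overloads) and one like record B (ref only).
print(sorted(implemented_interfaces({"value", "ref", "const ref"})))
print(sorted(implemented_interfaces({"ref"})))
```

In this model a type with all three overloads is reported as implementing all three sub-interfaces, while a type with a single overload still satisfies the disjunction.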

It should be noted that the semantics of resolving constraint-generic functions are quite different from those of resolving regular generic functions: a constraint generic function is resolved only once, with interface-provided functions being resolved to their interface definitions (e.g. a call to foo will be resolved to a Self.foo() definition in an interface). Then, when a constraint generic function is instantiated, the calls to Self.foo() are replaced with the witnesses from implementing the given interface. At that point we skip function resolution, as well as return intent overloading.

With that in mind, how can we resolve a function that has A | B as an argument constraint? Since we don't know for sure whether A or B is implemented, I propose we opt for assuming neither. In this way, a formal with no known interfaces and a formal with a disjunction interface will be equivalent.

// Hypothetical example code
interface I = A | B;

proc f(x, y: I) {
  // x and y both have no methods / types / etc. available to them
}

For context managers, that's enough -- compiler support can be used to figure out the rest. However, it would obviously be unsatisfying to introduce a language feature like this only for the sake of context managers, and make it unusable without compiler magic. There are ways to use |-interfaces from a user perspective: they just require additional syntax and implementation work, so I propose leaving them out for an initial prototype. That said, I envision a form of if-style syntax working here.

proc f(x: I) {
  if x.type implements A {
    // compiler knows that x: A, methods from A now available.
  } else if x.type implements B {
    // compiler knows that x: B, methods from B now available.
  }
}

This would complicate the resolution process of constraint-generic functions somewhat (which currently do not require param-resolution after instantiation). However, I think this works quite well with the existing semantics of the "implements expression" x.type implements A, which already occurs in where clauses in constraint generic function signatures:

proc someFunction(a) where a.type implements someInterface &&
                           a.type implements anotherInterface {

}

Finally, I think this interface presents a way to implement a relatively long-standing desire for users to implement their own typeclasses such as 'numeric' using disjunction. It doesn't quite do it out of the box (you can't use int(64) in an interface disjunction), but I can see it being useful in that way.

What it will all look like

If both conjunction (&) and disjunction (|) interfaces are implemented, user code will not have to change: types implementing serializable and contextManager will continue to work as usual. Library support code will look something like this:

// These are currently empty due to implementation weakness in implementing
// complex function resolution signatures.
interface readDeserializable {}
interface initDeserializable {}
interface writeSerializable {}

interface serializable = readDeserializable & initDeserializable & writeSerializable;

interface valContextManager {
  proc Self.enterContext();
  proc Self.exitContext(in err: owned Error?);
}

interface refContextManager {
  proc Self.enterContext() ref;
  proc Self.exitContext(in err: owned Error?);
}

interface constRefContextManager {
  proc Self.enterContext() const ref;
  proc Self.exitContext(in err: owned Error?);
}

interface contextManager = valContextManager | refContextManager | constRefContextManager;

benharsh commented 11 months ago

Would I be able to combine & and | in the same expression? For example, I could imagine wanting to define serializable as writeSerializable & (readDeserializable | initDeserializable). That is, "writeSerializable and at least one way to deserialize".

DanilaFe commented 11 months ago

I didn't think it would be essential, but it's possible. This does place us straight into the "boolean satisfiability problem with interfaces", but that seems relatively benign.

A relatively unrelated diversion: your definition of `serializable`, though, is not in line with our current understanding of `serializable` (which we until this point have treated as "all three").

lydia-duncan commented 11 months ago

This is an interesting proposal! I would like to think more about the | handling, this part:

Though it was less important in the & case, the semantics of both of the new interface types I'm suggesting attempt to satisfy / implement every constraint in their definition: there is no short-circuiting. Thus, above, record D will implement valContextManager, refContextManager, and constRefContextManager.

feels off to me. But I think what you mean is that "if all constraints are fulfilled, then all constraints will be implementable"? So something like:

record E : contextManager  {
  proc enterContext() { ... }
  proc enterContext() const ref { ... }
  proc exitContext(in err: owned Error?) { ... }
}

would be usable with valContextManager and constRefContextManager in addition to contextManager, but not refContextManager? Which seems perfectly reasonable to me, barring any issue that we discover as part of implementing it

DanilaFe commented 11 months ago

I guess what I mean is that if I = A | B | C, and A is satisfied, it doesn't give up trying to satisfy B and C. This way, the type can still be used for B and C, even though in terms of "or" evaluation, only A is needed.

would be usable with valContextManager and constRefContextManager in addition to contextManager, but not refContextManager? Which seems perfectly reasonable to me, barring any issue that we discover as part of implementing it

Yup, that's what I envisioned.

mppf commented 10 months ago

interface serializable = readDeserializable & initDeserializable & writeSerializable;

I'm not so sure that this is the meaning of & that I would expect. I think what you're going for here is that the interface is the union of all of these 3 RHS interfaces. But I would think of & as meaning intersection; as in, only those methods/functions that appear in all 3 of the RHS interfaces would appear in the new interface.
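The two readings of & can be contrasted with plain sets. This Python sketch is only an illustration: the method names are placeholders (the actual interfaces are currently declared empty), and treating an interface as a set of required method names is a simplifying assumption.

```python
# Placeholder requirements for each interface; these names are illustrative
# assumptions, not the real interface contents.
read_deserializable = {"deserialize"}
init_deserializable = {"init"}
write_serializable = {"serialize"}

parts = [read_deserializable, init_deserializable, write_serializable]

# "Union" reading: the combined interface requires everything from all parts.
union_reading = set().union(*parts)

# "Intersection" reading: only requirements common to every part survive.
intersection_reading = set.intersection(*parts)

print(sorted(union_reading))
print(sorted(intersection_reading))
```

Under the intersection reading the combined interface in this model ends up with no requirements at all, since the three parts share none.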

My counter-proposal is to treat it instead as multiple inheritance of interfaces:

interface serializable : readDeserializable, initDeserializable, writeSerializable {
}

As far as I know, multiple "inheritance" of interfaces is relatively understandable. At least, Java allows this sort of thing (but, granted, Java interfaces are very different).

One thing I would like to know is how Swift handles this case.

mppf commented 10 months ago

For the | interfaces and the context manager case, I think we should just treat it as a special case in the compiler (where it knows what record D : contextManager means). Why do I think this?

I think we should drag our feet on creating a way for users to define an interface like contextManager until we have a motivating case.