ponylang / rfcs

RFCs for changes to Pony
https://ponylang.io/

Use Site Variance and Implicit Interfaces for Generic Types. #123

Open jemc opened 6 years ago

jemc commented 6 years ago

Rendered.

plietar commented 6 years ago

Nicely written RFC, thanks.

A couple of points, about the text first of all.

Rather than calling the type ElementsAny I find the terms Sink and Source very intuitive when thinking about variance.

type ElementsSource is Elements[+Data]
type ElementsSink is Elements[-Data]

Additionally, just for completeness, could you add a method to Elements where A is used in both a covariant and a contravariant position?

Finally I think explicitly stating examples of what this means in terms of subtyping would be useful:


About the feature itself, I don't really like the +/- syntax. I only get the two mixed up and I frequently have to look up which one is which. I think keywords such as Elements[out X]/Elements[in X] may be better (or super/extends, like Java does).

jemc commented 6 years ago

I used the name ElementsAny because it's the name I used in my real-world example that I pulled in from the pony-resp library. However, I agree that the RFC would be more clear if I deviated more from the example in the ways you mentioned. I'll revise that soon.

jemc commented 6 years ago

I don't really like the +/- syntax. I only get the two mixed up and I frequently have to look up which one is which. I think keywords such as Elements[out X]/Elements[in X] may be better (or super/extends, like Java does).

I'm hesitant to reserve new keywords for this, especially one as common in code as the word out. Maybe we could get away with Elements[<Data] and Elements[>Data] as implying "in" and "out", respectively? That is, if the arrow points leftward toward the generic type name Elements, then it's an "Elements sink"; if it points rightward, away from the generic type name Elements, then it's an "Elements source".

SeanTAllen commented 6 years ago

Without a really good reason, I would prefer to not add new keywords. I don't see much difference between +/-, </>, super/extends, and in/out, except that a couple of those add new keywords.

I think in/out would be really bad in terms of breaking user code.

plietar commented 6 years ago

Some notes from today's sync.

Is Cell[-A] an interface or not?

A concern I had is that this conflates two distinct features: making an interface from a type, and extracting the co/contra-variant subsets of types.

The first one is not related to generics, and would be useful for all classes. Given the class Foo, we want a way to say "any type which implements the same interface as Foo". Using some strawman syntax, the interface IFoo below would be equivalent to IFooExplicit.

class Foo
  fun do_stuff() => ...
interface IFoo for Foo
interface IFooExplicit
  fun do_stuff()

The second, and the original point of the RFC, is to create a co- or contra-variant version of a type. For example, CellSink[A] is a contra-variant version of Cell[A]. This means Cell[Any] ≤ CellSink[Any] ≤ CellSink[Bar] (where ≤ is subtype), but only the set method can be used with a CellSink, not the get method.

class Cell[A]
  var value: A
  fun get(): A => value
  fun set(value': A) => value = consume value'

type CellSink[A] is Cell[-A]

We could keep these as two distinct features which can be combined. For example, ICellSink[A] is an interface with just the fun set(value': A) method. So AlternativeCell[Any] is a subtype of ICellSink[Bar], but not a subtype of Cell[-Bar].

interface ICellSink[A] for Cell[-A]
class AlternativeCell[A]
  fun get(): A => ...
  fun set(value': A) => ...

However @sylvanc convinced me that there's no legitimate use case for Cell[-Any], and you would always want the interfaced version of it, ICellSink[Any].

So the alternative described in this RFC is to make Cell[-Any] directly be an interface, and AlternativeCell[Any] ≤ ICellSink[Bar]. This sounds quite reasonable.
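For comparison, the explicit-interface version of this relation can be sketched in current Pony, without any of the proposed syntax. This is only an illustration under assumptions: A is constrained to Any val to keep reference capabilities out of the picture, the private fields and constructors are additions that do not appear in the snippets above, and Main exists only to exercise the types.

```pony
// Sketch in current Pony; no +/- or "for" syntax is used.
// Assumption: A is limited to `Any val` so no consume/alias handling is needed.
interface ICellSink[A: Any val]
  fun ref set(value': A)

class Cell[A: Any val]
  var _value: A

  new create(value': A) =>
    _value = value'

  fun get(): A => _value

  fun ref set(value': A) =>
    _value = value'

// An unrelated class with the same shape as Cell.
class AlternativeCell[A: Any val]
  var _last: A

  new create(value': A) =>
    _last = value'

  fun get(): A => _last

  fun ref set(value': A) =>
    _last = value'

actor Main
  new create(env: Env) =>
    // Both classes satisfy the interface structurally.
    let a: ICellSink[String] = Cell[String]("x")
    let b: ICellSink[String] = AlternativeCell[String]("y")
    // A cell of a supertype can also act as a sink for String.
    let c: ICellSink[String] = Cell[Any val]("z")
    a.set("hello")
    b.set("hello")
    c.set("hello")
```

As far as I understand the structural subtyping rules, Cell[Any val] is accepted as an ICellSink[String] because its set method takes a supertype of String, which is the contravariant relation the -A marker is meant to express.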


Unfortunately this means there's no way to do the "make an interface out of a type" without also making the interface co/contra-variant, such as the IFoo interface from Foo, or an invariant ICell[A] interface which has both the get and set method. We would need a special, dedicated syntax for this, which overlaps with the features provided by the +/- syntax.

Syntax

As I mentioned before, I find +/- quite meaningless, and always have a hard time remembering which one is which. I suggested the in/out keywords, which I find a lot easier to understand. Cell[in A] is a Cell in which you can write values (A goes "in" the cell), and Cell[out A] is a Cell from which you can read values (A goes "out" of the Cell).

in and out are obviously very popular names, and we wouldn't want to make them reserved keywords. However, since types must begin with an uppercase letter, in and out are already not allowed in a type parameter position. This allows us to make them "contextual keywords", where they are normal identifiers when used as function/variable names, but keywords when used where a type is expected.

This would be a first for Pony, and a bit of a Pandora's box.

edit: I had screwed up the direction of subtyping (obviously), hopefully fixed now

plietar commented 6 years ago

Here's a suggestion on the "is Cell[-A] an interface or not" question.

We add the syntax ~Foo to designate the interface which has the same methods as Foo. ~ is used because that interface is approximately a Foo. This allows us to use ~Cell[Bar] for the invariant interface with both "get" and "set" methods.

On top of this, we add the +/- (or in/out) syntax, to specify co/contravariant type arguments. These are only allowed on a ~ type. ~Cell[-Any] is the interface with only the set(x: Any) method, and ~Cell[+Any] is the interface with only the get(): Any method. Cell[-Any] and Cell[+Any] (without the ~) are invalid.

These are allowed anywhere a type is allowed, including type aliases.

type IFoo is ~Foo
type ICellSink[A] is ~Cell[-A]
type ICellSource[A] is ~Cell[+A]

plietar commented 6 years ago

Another thing I've been wondering is what happens to methods which have the type argument in both argument and return position.

Let's say the Cell type is defined as:

class Cell[A]
  var value: A
  fun get(): A => value
  fun set(value': A): A^ => value = consume value'

In this case, what does the ~Cell[in A] interface look like? We can't just take the set method with the same signature, but we could instead make it be one of:

  fun set(value': A): Any
  fun set(value': A): None

The first requires Any to be a top type. It pretty much is today, with the exception of nosupertype annotated types (RFC 121). The second is not correct from a strict point of view and probably a bad idea, but sort of makes sense since you can't do anything useful with a None.


This actually applies to all co-variant methods, which are still included in contra-variant interfaces. ~Cell[in A] would actually be equivalent to:

interface ICellSink[A]
  fun get(): Any
  fun set(value': A): Any

The get method can be called, but the return value cannot be used.

Technically we could do the same thing the other way round, so that ~Cell[out A] becomes equivalent to:

interface ICellSource[A]
  fun get(): A
  fun set(value': Bottom): A

But we don't have a Bottom type, and the set method could never be called, so it's useless to include it.

plietar commented 6 years ago

So we discussed this a bit more with @sylvanc and @theodus over the weekend, and we've come to the conclusion that this may not be a feature we want in the end.

First of all, and maybe the biggest red flag, is that automatic extraction of an interface from a class makes it impossible for libraries to add methods while maintaining backwards compatibility.

If classes Foo and Bar both only have methods a and b, then you would infer that Bar ≤ ~Foo, and downstream packages can start relying on that subtype relation. However adding a method c to Foo would break that.
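The contrast with an explicitly declared interface can be sketched in current Pony. The names FooLike and Main, the String return types, and the method bodies are assumptions made for illustration; only Foo, Bar, and the methods a, b, and c come from the discussion.

```pony
// Sketch: an explicit interface is unaffected when Foo gains a method.
// FooLike is a hypothetical name; it lists only what callers need.
interface FooLike
  fun a(): String
  fun b(): String

class Foo
  fun a(): String => "Foo.a"
  fun b(): String => "Foo.b"
  // Imagine this was added in a later release of the library.
  fun c(): String => "Foo.c"

class Bar
  fun a(): String => "Bar.a"
  fun b(): String => "Bar.b"

actor Main
  new create(env: Env) =>
    // Bar still satisfies FooLike, because FooLike never mentioned c.
    let x: FooLike = Foo
    let y: FooLike = Bar
    env.out.print(x.a())
    env.out.print(y.b())
```

With an extracted ~Foo, the same addition of c would silently change the interface, and Bar would no longer satisfy it.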

We don't have any definition of which changes we want to allow as backwards compatible, but adding new methods to classes surely seems uncontroversial.

More generally, and without even taking backwards compatibility into account, classes usually have a much broader surface than interfaces do. If we look at an existing "extracted" interface, ReadSeq[A] actually only includes a small portion of Array's covariant methods. Using ~Array[A] instead of ReadSeq[A] heavily restricts which methods can be used, even though in general you only care about a subset of them. If you need random access, use a ReadSeq[A] (which should maybe be renamed). If you need push/pop define a StackIn[A] or StackOut[A] interface and use that instead.

In other words, rather than imposing the full set of methods, we should encourage people to define exactly what set of methods they require as an interface.
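A minimal sketch of that approach in current Pony follows. The exact shapes of StackIn and StackOut are assumptions (the thread only names them), A is constrained to Any val for simplicity, and StackOps and Main are hypothetical helpers. It relies on the standard library's Array having push and a partial pop.

```pony
// Sketch: narrow, hand-written interfaces instead of an extracted ~Array[A].
// Assumption: A is limited to `Any val` to avoid capability details.
interface StackIn[A: Any val]
  fun ref push(value: A)

interface StackOut[A: Any val]
  fun ref pop(): A ?

primitive StackOps
  // Only needs the ability to push.
  fun fill(s: StackIn[String]) =>
    s.push("a")
    s.push("b")

  // Only needs the ability to pop; partial because the stack may be empty.
  fun drain(s: StackOut[String]): String ? =>
    s.pop()?

actor Main
  new create(env: Env) =>
    // Array satisfies both interfaces structurally.
    let xs = Array[String]
    StackOps.fill(xs)
    try
      env.out.print(StackOps.drain(xs)?)
    end
```

Each function states only the methods it uses, so any type with a matching push or pop can be passed in, and adding methods to Array later does not affect either interface.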

jemc commented 6 years ago

I think the main reason you probably reached that conclusion is the emphasis placed on the "extracting an interface from a type" part of the feature, which as far as I'm concerned is not the main focus.

My original purpose for trying to push this idea forward was always with the intention of having a convenient type for interacting with different reifications of the same generic type. @sylvanc put forth the idea that it could be an open interface, allowing for any viable type to fill it. I have no problem with this approach, but that was never my goal. Every time I've wanted to have something like Cell[+A], I'm always talking about a varying reification of the Cell type specifically - never needing to substitute an "imitation" of Cell.

So I don't think the issue you raised in your last comment is really a problem in that paradigm. If you think it would help prevent confusion about the purpose of the feature, we could consider adding the restriction you were advocating for earlier, where our type system will only allow reifications of the named type rather than acting as an open interface.

This would also remove the need for the ~ syntax addition you were discussing.

sylvanc commented 6 years ago

I think I understand @jemc's use case better now, and I think this is more in the vein of higher-kinded types. Let's think about expanding this RFC to be a "higher-kinded types for Pony" RFC. I propose we all have a think and chat about this some more.