Add keyword that refers to a class' type for use in type parameters

dart-lang / sdk

Add keyword that refers to a class' type for use in type parameters #28477

Open itsjoeconway opened 7 years ago

itsjoeconway commented 7 years ago

Consider a class Foo that has a property whose type is some generic type wrapping instances of itself, and a subclass of Foo, Bar.

abstract class Foo {
   Wrapper<Foo> wrapped;
}
class Bar extends Foo {}

Statically, Bar.wrapped is an instance of Wrapper<Foo>, but truthfully, it is Wrapper<Bar>.

It would be useful to have a keyword that allows the type parameter to reflect the true class, such that:

abstract class Foo {
   Wrapper<Self> wrapped;
}
class Bar extends Foo {}

Where Bar.wrapped is statically typed to Wrapper<Bar>.

MichaelRFairhurst commented 7 years ago

The self type can be implemented as follows:

abstract class Foo<F extends Foo<F>> {
  Wrapper<F> wrapped;
}

class Bar extends Foo<Bar> {}

This is basically the self type, and it also demonstrates why the self type is more complex to add than one might suspect.
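To make the pattern above concrete, here is a minimal runnable sketch, assuming null-safe Dart (2.12 or later). The Wrapper class and the barOnly getter are hypothetical stand-ins that do not come from the thread; they are only there to show that the static type of wrapped follows the subclass.

```dart
// Hypothetical stand-in for the Wrapper type used in the thread.
class Wrapper<T> {
  final T value;
  Wrapper(this.value);
}

// F-bounded base class: F stands for the concrete subclass.
abstract class Foo<F extends Foo<F>> {
  Wrapper<F>? wrapped;
}

class Bar extends Foo<Bar> {
  // A member that exists only on Bar (name invented for this example).
  String get barOnly => 'only on Bar';
}

void main() {
  final bar = Bar();
  bar.wrapped = Wrapper<Bar>(Bar());
  // bar.wrapped has static type Wrapper<Bar>?, so no cast is needed
  // to reach a member that only Bar declares.
  print(bar.wrapped!.value.barOnly);
}
```

The cost is visible in the declaration of Bar: every subclass has to repeat its own name as the type argument, which is the boilerplate a Self keyword would remove.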

eernstg commented 7 years ago

The main difficulty with self types is that the type of self at any given source code location is not a compile-time constant; it is actually a bounded existential type. Inside your Foo class, the Self type could denote Foo in one situation at run time and a subtype SomeClassThatExtendsFoo in another situation; we don't know which type it is, we only know that it is a subtype of Foo ("there exists a type T which is a subtype of Foo, and then Self == T holds"). Java wildcards are another example of existential types.

Existential types are somewhat difficult to handle in a sound typing context (especially if the type is not reified at run time, in which case we cannot check dynamically whether any given value has that type), but a useful intuition is that we can read values of an existentially-typed expression, but we cannot mutate an existentially typed variable (because we can never know for sure whether the new value is type correct).

The interesting twist with Dart is that we can be much more permissive. We already have a lot of implicit downcasts, and this is just one more variant thereof. In other words, if we are happy about covariant generics and assignability then we can certainly introduce a Self type, reify it at runtime (just like type arguments), and check it dynamically. Of course, we still have the same level of type safety in all the situations where existential types can be used in more strict languages, but now we can also use them in the "opposite" cases, which basically means for mutation.
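The kind of dynamic check described here can already be observed with covariant generics in current Dart. The following is a small sketch, not a Self type: it only shows that an assignment accepted statically through covariance is still verified against the reified type argument at run time. The function name addIsRejected is invented for the example.

```dart
// Returns true when the run-time check rejects the insertion.
bool addIsRejected() {
  // Accepted statically because List<int> is a subtype of List<Object>.
  final List<Object> objects = <int>[1, 2];
  try {
    // The reified element type is int, so this is checked dynamically.
    objects.add('three');
    return false;
  } on TypeError {
    return true;
  }
}

void main() {
  print(addIsRejected()); // prints "true"
}
```

A reified Self type would presumably be checked in the same way: reads are statically safe, and writes are verified when they happen.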

Developers can choose to use a strictly typed style (if tools point out all downcasts, e.g., based on a config setting), and then is and as expressions could be used to verify at run time that the ad-hoc reasoning which would have to be used in order to justify a downcast to an existential type is actually justified, and branch off to some other code in the cases where it's not. Strong mode goes pretty far in this direction, and settings like "implicit-downcasts: false" allow developers to go that extra step.

The question is whether it would be a sufficiently useful feature, but I don't think it would be particularly hard.

Cool, isn't it? ;-)

Here's an important body of research for a detailed study of self types: Kim Bruce's PolyTOIL, journal version, original conference paper. Unfortunately I couldn't find any variant of this paper that is available to the public, but I'm sure there must be a pre-print PDF somewhere.

MichaelRFairhurst commented 7 years ago

To provide an example for Eric's point about where maintaining soundness of the self type is hard:

class Super {
  foo(self x) {
    ...
  }
}

class Sub extends Super {
  int bar;
  foo(self x) {
    x.bar++;
  }
}

(new Sub() as Super).foo(new Super());
// will compile, but will throw on the first line of Sub.foo(): Super has no member `bar`

Parameters are contravariant, so you cannot lower them in an override. But the self type is lowered upon extension.

There are a couple of "easy" solutions, such as disallowing self from being used in a contravariant way, or overriding methods which use self in a contravariant way.

But... this actually looks almost exactly like Dart's covariant arguments.

Would it be possible to implement contravariant self usage as sugar for a covariant static type?

class Super {
  foo(covariant Super x) {
    ...
  }
}
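For reference, this is roughly how the existing covariant modifier behaves in current Dart. It is a sketch assuming null-safe Dart, and it is not a Self type; the helper name callIsRejected is invented. The override may tighten the parameter type, and the price is a run-time check on every call.

```dart
class Super {
  // `covariant` lets subclasses tighten the parameter type.
  void foo(covariant Super x) {}
}

class Sub extends Super {
  int bar = 0;

  @override
  void foo(Sub x) {
    x.bar++;
  }
}

// Returns true when passing a plain Super to Sub.foo is rejected at run time.
bool callIsRejected() {
  final Super receiver = Sub();
  try {
    receiver.foo(Super()); // compiles, but the argument is checked dynamically
    return false;
  } on TypeError {
    return true;
  }
}

void main() {
  print(callIsRejected()); // prints "true"
}
```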

itsjoeconway commented 7 years ago

Appreciate the feedback.

With regards to usage, I've found Self (not to be confused with self) in Swift to be very useful to compose objects through multiple protocols.

An example of a similar pattern in Dart that already exists is ListMixin - where the implementation of a few core methods gives you a broader suite of methods to work with because they wrap the core methods. This is all okay because the type parameter is explicitly declared, but the idea would be to extend the same behavior to subclasses without having to specify the type parameter.

For that reason, I do see this as just sugar because it is all possible with either casting or @checked currently - the only difference being that the property does have to be redeclared in the subclass with a tighter type.

MichaelRFairhurst commented 7 years ago

Self is a cool type, for sure. ListMixin is a good example use case.

But Self couldn't be sugar everywhere. Consider:

class Foo {
  Self copy() {
    return new Foo();
  }
}

class Bar extends Foo {
  Whiz bang;
}

new Bar().copy().bang;

Bar will inherit copy from Foo, which means Bar.copy() will return a Foo. In the sugar-based option, the type of Bar.copy is Foo, in which case the code will fail to compile at bang since it's not a member of Foo. But usually that's not what people expect from the self type.

The correct compile time error is at return new Foo(). new Foo() types to the exact type Foo, and Foo is not a subtype of Self, even though Self is a subtype of Foo.

  Self copy() {
    return this; // this is OK
  }

  Self copy(Self s) {
    return s; // this is also OK
  }

  Self copy() {
    return otherMethodThatReturnsSelf(); // also OK
  }

It gets weirder still if you try to make clone somehow work.

// you can try to make the actual cloning step into a separate abstract method
abstract class Base {
  int x;
  Self _constructNewSelf();
  Self clone() {
    // and use it here. This will typecheck
    Self copy = _constructNewSelf();
    copy.x = x;
    return copy;
  }
}

// but all you do is kick the can down the road.
class Sub extends Base {
  Self _constructNewSelf() {
    // because this still won't pass typechecks.
    return new Sub();
  }
}

// you can also try moving _constructNewSelf into a function
class Base {
  Self func() constructSelf;
  ...
}

// but now you realize that this must be a type error somewhere.....
Sub sub = new Sub();
Base base = sub;
// because this sets up garbage
base.constructSelf = () => new Base();
// sub.clone() will now return a Base not a Sub!

This type error goes for all members which reference type Self. Once again, the concrete type Base is not a subtype of Self even though you know it's a Base in that moment. If you ever wanted to assign a concretely typed value to constructSelf, you'd need special rules for constructors (new Sub() is the only place where we know for a fact that Self is a Sub).

But not all is lost! This is where the Foo<F extends Foo<F>> construct works really well.

abstract class Base<S extends Base<S>> {
  S func() constructCopy;
  S copy() {
    // notice, `return this` would not fail to typecheck.
    // but `return new Base()` will fail
    // and returning constructCopy() works whether it's a member or abstract method
    return constructCopy();
  }
}

class Sub extends Base<Sub> {
  @override
  Sub func() constructCopy = () => new Sub();

  ...
}

The typechecking we wanted from Self is respected here. You can define constructCopy because S becomes the concrete type Sub. And you can't cast down to Base without tracking the original S, so assignments to variables referencing S will still work even on the base class.
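A runnable variant of the same idea, written as a sketch for null-safe Dart, uses an abstract method rather than a function-typed field. The names constructNew and label are invented for the example.

```dart
abstract class Base<S extends Base<S>> {
  int x = 0;

  // Each concrete subclass says how to build a fresh instance of itself.
  S constructNew();

  S clone() {
    final copy = constructNew();
    copy.x = x; // allowed because S is bounded by Base<S>
    return copy;
  }
}

class Sub extends Base<Sub> {
  String label = 'sub';

  @override
  Sub constructNew() => Sub();
}

void main() {
  final original = Sub()..x = 42;
  final copy = original.clone(); // static type is Sub, not Base
  print(copy.x); // prints "42"
  print(copy.label); // prints "sub"
}
```

Here clone is written once in the base class, and the only per-subclass obligation is the one-line constructNew override.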

Self is a very fascinating type. I'm not trying to say that it can't be done, just that it has a lot of unexpected complexity. Looks like Eric is willing to add it despite that complexity if there are good use cases.

I wonder about any other examples where it would be useful. The usual example for the self type is fluent APIs:

class FluentBase {
  FluentBase doOneThing() {...}
}

class FluentThing extends FluentBase {
  FluentThing doSomethingElse() { ... }
}

new FluentThing() // of type FluentThing
  .doOneThing() // now of type FluentBase
  .doSomethingElse(); // error! FluentBase has no method doSomethingElse!

However, this isn't an issue in Dart because you don't write these kinds of APIs, you just use the double dot. The double dot operator does not complain about this code.
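As a small illustration of the cascade approach (a sketch; the log field is invented so that the example has something to print), the methods can return void and the cascade keeps the receiver's static type:

```dart
class FluentBase {
  final List<String> log = <String>[];

  void doOneThing() => log.add('one');
}

class FluentThing extends FluentBase {
  void doSomethingElse() => log.add('else');
}

void main() {
  // Each `..` call is made on the original FluentThing,
  // so the subclass method stays reachable.
  final thing = FluentThing()
    ..doOneThing()
    ..doSomethingElse();
  print(thing.log); // prints "[one, else]"
}
```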

You might think that equals(Self other) would be another good example, but this code is not sound because Self is used contravariantly. That means it either must be a compile time error or a runtime error (in ("" as Object).equals(0 as Object) an int goes into String.equals(self)). The idea of equals is not for it to throw an exception when the types don't match, but to return false.
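For comparison, this is how equality is conventionally written in current Dart, where the parameter stays Object and a type test replaces the Self parameter. It is a sketch assuming Dart 2.14 or later for Object.hash; the Point class is invented for the example.

```dart
class Point {
  final int x;
  final int y;
  const Point(this.x, this.y);

  // The parameter is Object, so a mismatched type yields false
  // instead of a compile-time or run-time error.
  @override
  bool operator ==(Object other) =>
      other is Point && other.x == x && other.y == y;

  @override
  int get hashCode => Object.hash(x, y);
}

void main() {
  final Object unrelated = 'not a point';
  print(Point(1, 2) == Point(1, 2)); // prints "true"
  print(Point(1, 2) == unrelated); // prints "false"
}
```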

Maybe you could elaborate on how you use Self in Swift? I know there are good use cases out there and it'd be a great idea to document them.

itsjoeconway commented 7 years ago

Sure, and then I'll go over the use case I'm thinking of in Dart and perhaps there is a better solution.

In our Swift model <-> web service binding library, there is a protocol that model objects implement to translate between JSON, Swift instances and Core Data (SQLite ORM). Since the same object will get fetched from a web service more than once, we don't want to persist duplicates; therefore, this protocol implements a method with the following signature:

static func insertOrFindInstanceInContext(_ context: NSManagedObjectContext, jsonObject: JSONObject) throws -> Self

This method returns an instance of the type we want - it may already exist in the database, or an empty instance was created. An existing object is found by checking some unique value against the corresponding value in the JSON. For example, the property name is "uniqueID" and the key in the JSON is "id".

This identifying value will be different depending on the object in question, and so another method from that protocol must be overridden to provide that info. This allows model objects to be written as so:

class User: JSONInstantiable {
  var uniqueID: String?
  static func matchKeys() -> (managedKey: String, jsonKey: String) {
    return ("uniqueID", "id")
  }
}

And this core method gives insertOrFindInstanceInContext all it needs to carry out its task.

Specific to Dart, the current problem I'm trying to solve is really an abuse of the language to be fair. And also, if the following construct existed, it'd really remove the need for all of this:

class X<T> implements T {}

But I understand that having a class take on the interface of its type argument also opens up an entirely different set of problems.

So, here is the specific problem I'm trying to solve, and I apologize that this will be lengthy.

Model objects in Aqueduct's ORM are subclasses of ManagedObject<T>, where T is a plain Dart type that represents a database table, and each property is a column in that table. ManagedObject<T> implements the dynamic storage of those properties and implements T, too.

class User extends ManagedObject<_User> implements _User {}
class _User {
  @managedPrimaryKey
  int id;
 ...
}

In code, you work with instances of User. Properties inherited from _User are stored in a Map<String, dynamic> that ManagedObject<T> manages:

var u = new User();
u.id = 1; // 1 is stored in the 'backing map' as {"id" : 1}

This dynamic storage is useful for a number of reasons, and one of those reasons is query building. A Query<T extends ManagedObject> represents a database operation. Query<T extends ManagedObject> exposes a property named where of type T. So, queries are built as such:

var q = new Query<User>()
  ..where.id = whereGreaterThan(10);

The analyzer and runtime can now verify I'm working with the appropriate columns, code completion kicks in, etc. The trick is that the 'backing map' of where - which is an instance of User - is no longer storing values, but instead storing expressions. When the query gets executed, its where clause is built from this map of expressions.

This is all dandy until I get to queries with joins. Ideally, I'd like to be able to split off a 'subquery' to represent the join. For example,

var q = new Query<User>();
Query<Item> itemQuery = q.joinOn("items"); // items is a relationship of User.

And this would be great, except that "items" is an error-prone String and itemQuery really doesn't have a type parameter because the type isn't known until runtime. And so I've gone through several different solutions, none of which are great, and the one I thought was closest to implementable was something like:

var q = new Query<User>();
var itemQuery = q.joinOn.items.query;

Here, query is a property of ManagedObject<T> that returns Query<Self> as I've envisioned it.

I also thought about something like the following:

var q = new Query<User>();
var itemQuery = new Query.join(q.joinOn.items);

And this very well might be the solution to go with for now, but now a model object has to keep a back reference to a query and I'm not sure if that creates a deferred problem. Anyway, I really appreciate the time and interest and am very much willing to accept there is another approach at a more fundamental level that I'm missing.

MichaelRFairhurst commented 7 years ago

Interesting. Static selfs are a different beast.

class Base {
  static foo() {
    self.bar();
  }
  static bar() {...}
}

class Sub extends Base {
  static bar() {...}
}
Sub.foo(); // results in Sub.bar

I'm pretty sure this is a much simpler add. Though it essentially boils down to not using static methods. In my experience statics should be avoided for these reasons.

class Base {
  foo() {
    this.bar();
  }
  bar();
}

class Sub extends Base {
  bar() {...}
}
new Sub().foo(); // results in Sub.bar

Here you can make a variety of Repository classes and a BaseRepository.

class BaseRepository<T> {
  bool matchKeys(...);
  T insertOrFindInInstanceContext(...) {...}
}

class UserRepository extends BaseRepository<User> {
  bool matchKeys(...) {...}
}

It's also worth remembering that this type of polymorphism is accomplished by passing in a this pointer behind the scenes. Static self is the same scenario, except with passing in a self pointer. In this case, if you still want to do statics but want dynamic dispatch you can pass in an object or a function into the static method.

Repository.insertOrFindInInstanceContext(User.matchKeys)

You can even leverage generic methods here:

  static T insertOrFindInInstanceContext<T extends ManagedObject>(QueryOperations<T> ops) {...}

class UserQueryOperations extends QueryOperations<User> {
  User buildFromJson(...) {...}
  bool matchKeys(...) {...}
}
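Filled in as a runnable sketch, the generic-method version could look roughly like this. Everything here is hypothetical: the in-memory Map standing in for the database context, the jsonKey getter, and the method name insertOrFind are assumptions for illustration and not Aqueduct or Core Data APIs.

```dart
// Per-type operations, passed in explicitly instead of relying on static Self.
abstract class QueryOperations<T> {
  String get jsonKey;
  T buildFromJson(Map<String, dynamic> json);
}

class User {
  final String uniqueID;
  User(this.uniqueID);
}

class UserQueryOperations extends QueryOperations<User> {
  @override
  String get jsonKey => 'id';

  @override
  User buildFromJson(Map<String, dynamic> json) => User(json['id'] as String);
}

class Repository {
  // T is inferred from the arguments, so callers get a User back, not dynamic.
  static T insertOrFind<T>(Map<Object?, T> store, QueryOperations<T> ops,
      Map<String, dynamic> json) {
    final Object? key = json[ops.jsonKey];
    return store.putIfAbsent(key, () => ops.buildFromJson(json));
  }
}

void main() {
  final store = <Object?, User>{};
  final ops = UserQueryOperations();
  final first = Repository.insertOrFind(store, ops, {'id': 'u1'});
  final second = Repository.insertOrFind(store, ops, {'id': 'u1'});
  print(identical(first, second)); // prints "true": no duplicate was created
}
```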

In terms of your ORM example, I've seen:

class UserQuery extends Query<T> {
  Query<Item> joinItems = ...;
  ...

itsjoeconway commented 7 years ago

Yes, the class UserQuery extends Query<T> was previously what we were going with, but the problem is that it's not very discoverable/enforceable. Unless you've read deeply into documentation, it's a hard thing to explain via an API reference to someone getting started. It definitely steepens the learning curve if you are solving a problem with that type of construct. It's something that can more or less be worked around, but my thoughts are that if something can be provided via some boilerplate, a tool/the language itself is likely able to provide it - and do so without error.

MichaelRFairhurst commented 7 years ago

Definitely, language features can make the same thing look different and that can be a real impact when designing an API.

If your Swift API looks like User.insertOrFindInInstanceContext and you're happy with that discoverability, you could do User.newQuery() which is pretty similar.

Thinking about it more, adding static method calls off of a dynamic "Self" would not solve your problem entirely, because you still want the static method to return the "Self" type. Certainly adding both is a bigger request than just one or the other (granted, supporting the Self type in static contexts only would likely be much simpler than supporting the self type everywhere).

Might be worth splitting this issue into: Supporting self type, and, supporting a self pointer to perform dynamic dispatch on static calls. Though they are certainly related.

itsjoeconway commented 7 years ago

Sorry, let me clarify. The insertOrFindInstanceInContext method was just the first example I grabbed as a usage of Self in Swift. That it was static was just happenstance.

For User.newQuery to work, User would have to override newQuery to provide tighter type information that the caller can use. It can only be declared in the base class as:

Query<ManagedObject<T>> newQuery();

When invoking this on a User, the type parameter is still ManagedObject<T> - and not User. At runtime, it is User, and that's good, but the problem is the analyzer won't know that. So in the following, subquery's static type is Query<dynamic>, and therefore subquery loses the value of where being typed to T:

var q = new Query<Parent>();
var subquery = q.where.child.newQuery();
subquery.where.somePropertyOfChild = whereEqualTo(..); // The analyzer won't see that where has a property named somePropertyOfChild because where is dynamic

MichaelRFairhurst commented 7 years ago

Return types of methods are covariant. You should be able to override newQuery on User to return a UserQuery.

This code compiles and prints "username"

class Query {
  static Query newQuery() {
    return new Query();
  }
}

class UserQuery extends Query {
  static UserQuery newQuery() {
    return new UserQuery();
  }

  String username = "username";
}

main() {
  print(UserQuery.newQuery().username);
}
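The example above uses static methods, which are not overrides in the strict sense. The same tightening also works for an overriding instance method, which may declare a more specific return type than the method it overrides. A minimal sketch in current Dart:

```dart
class Query {
  Query newQuery() => Query();
}

class UserQuery extends Query {
  // An override may return a subtype of the inherited return type.
  @override
  UserQuery newQuery() => UserQuery();

  String username = 'username';
}

void main() {
  // newQuery() is statically typed as UserQuery here.
  print(UserQuery().newQuery().username); // prints "username"
}
```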

itsjoeconway commented 7 years ago

Yes for sure, but Self removes the onus from the developer to have to override it. From your example, the following would suffice:

class Query {
 Self newQuery() { 
   return new Self();
 }
}

class UserQuery extends Query { }

new UserQuery().newQuery().username;

And I see the challenges you've mentioned throughout the thread, but this is the core of what I'm trying to get to - alleviating the burden of tightening types from the developer.

MichaelRFairhurst commented 7 years ago

Sorry, not trying to make it sound like this isn't a useful feature, just trying to help since this isn't a quick addition to the language, and I didn't want to leave you hanging while we try to figure out all the edge cases in why we could/couldn't do this, and how exactly it would work.

And for what it's worth, I don't design them or really have any say, I'm just working on a static analyzer for Angular and enjoy these types of discussions.

I think we now have three features that we could make to ease these APIs:

supporting the Self type
supporting static invocations off of Self dynamically
supporting construction of Self

It's also worth noting that it's not clear how it would work to construct the Self type like this in Dart. It would require validating that subclasses have compatible constructors either in all occasions or detecting this type of code specially.

This has been a great discussion, I mean we have three items on the docket to investigate as an option to improve Dart! I think that's fantastic.

itsjoeconway commented 7 years ago

Agreed, and it was an extremely worthwhile discussion because it led me to a solution that I think is even better and I appreciate the feedback a lot. I do think something like Self would be valuable, just as another tool, but it also may not fit with some of the other stuff.

FWIW, the solution was to add the following method to Query<T>:

Query<T> joinOn<T extends ManagedObject>(T m(InstanceType x))

And now building join queries is done with the following:

var parentQuery = new Query<Parent>();
Query<Child> joinedChildQuery = parentQuery.joinOn((p) => p.child);
// Query<Child> is inferred, but added for clarity.
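A self-contained sketch of how such a method can let inference carry the joined type is shown below. The class bodies are invented for illustration and are not Aqueduct's implementation: the entity field and the way the closure is applied are assumptions, and only the shape of the joinOn signature follows the one above.

```dart
class ManagedObject {}

class Child extends ManagedObject {}

class Parent extends ManagedObject {
  final Child child = Child();
}

class Query<InstanceType extends ManagedObject> {
  // Hypothetical: a sample instance the closure can navigate.
  final InstanceType entity;
  Query(this.entity);

  // T is inferred from the return type of the closure.
  Query<T> joinOn<T extends ManagedObject>(T Function(InstanceType x) m) =>
      Query<T>(m(entity));
}

void main() {
  final parentQuery = Query<Parent>(Parent());
  final joined = parentQuery.joinOn((p) => p.child);
  // This assignment compiles only because joined is statically Query<Child>.
  final Query<Child> typed = joined;
  print(typed.entity is Child); // prints "true"
}
```

The closure replaces the error-prone string key, and the relationship name is checked by the analyzer because p is typed as Parent.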

eernstg commented 7 years ago

Lots of stuff here, I'll just add a few comments. ;-)

@MichaelRFairhurst, it is certainly true that a genuine Self type cannot be expressed as syntactic sugar. For instance:

class C {
  bool hasMyType(other) => other is Self;
}

For an exact match, we could use other.runtimeType == this.runtimeType, but instances of Type do not support subtype tests (unless we start using reflection), and we cannot achieve the correct result for other is Self if we replace Self by any compile-time constant type. Here's another case:

class D {
  List<Self> buddies;
}
class D2 extends D {}

Here, D d = new D2(); d.buddies = <D>[]; must fail at runtime because the buddies must be a List<D2> in an instance of D2. Again, there is no compile-time constant type which will give us the correct semantics, and there is also no boolean expression using runtimeType which will do it.

Swift allows for abstracting over classes. For instance, http://en.swifter.tips/use-self/ shows a required init() method, which forces all subclasses to have an init with no arguments, such that we can safely assume that they can all create a new object with the syntax self.dynamicType(). This makes dynamic values representing classes behave similarly to metaobjects in dynamic languages, because they can be used to access the "static interface" of a class using regular method invocation.

For Dart, we have a proposal about adding metaclass objects (which essentially amounts to adding methods to instances of Type corresponding to the static methods and constructors of the corresponding classes); this would enable something very similar (self.dynamicType() could be something like this.dynamicType.new()). The reason why this hasn't happened yet is mainly that we do not have a good way to organize the types of the metaclass objects, which is again because there is no relationship between the static methods of classes (even if G is a subclass of F, knowledge about the static methods of F doesn't say anything about the static methods of G). A mechanism like required constructors or C# style where clauses could be used to enable some safe polymorphic usages, but these mechanisms are quite ad-hoc in nature, compared to the general notion of subtyping.

It would be quite easy to do another thing, though: We could allow methods to return types involving Self (or other covariant types) by annotating them (say, with @redefineInSubclass). The idea is that the method is type checked for conformance to the standard type rules in the class where it is declared (as opposed to the normal approach where a method must be type correct both where it is declared and in all potential classes where it could be inherited). In all subclasses where that method is inherited it is rechecked relative to that subclass, and if it fails to be type correct then it is a compile time warning (in Dart 2.0 it would be an error, following the usual policy adjustment for 2.0), and the developer would then have to redefine the method such that it does type check. This mechanism is more powerful than the metaclass approach in some ways, and less powerful in others, but both could be used to write such things as the clone method.

(Edited Jan 31 2017: Fixing some typos and clarifying the part about metaclasses.)

alexmarkley commented 4 years ago

Are there any updates on this use case? I'm modeling an object hierarchy in both Dart and TypeScript, and TypeScript's polymorphic 'this' type is much cleaner than what I'm currently doing for the Dart implementation:

abstract class Base<X extends Base<X>> {
  X requiredMethodForBase();
}

abstract class BaseWidget<X extends BaseWidget<X>> extends Base<X> {
  X requiredMethodForWidget();
}

abstract class BaseList<X extends BaseList<X, Y>, Y extends BaseWidget<Y>> extends Base<X> {
  List<Y> listOfWidgets = [];
  X requiredMethodForList();
  X requiredMethodForBase() {
    // has a default implementation
    return this;
  }
}

So then implementing a BaseWidget or BaseList requires more of the same. Something like:

class SomeConcreteList extends BaseList<SomeConcreteList, SomeConcreteWidget> {
  SomeConcreteList requiredMethodForList() {
    // do something with this.listOfWidgets
    return this;
  }
}

As you can imagine, the type declarations are much easier on the eyes in the TypeScript implementation of the same hierarchy. Would love to have something similar in Dart.

eernstg commented 4 years ago

There are no updates on adding a This type to Dart and implementing the feature, but I still think that it could be a useful feature to have.

We'd want to do it in a statically safe manner, though. It should be noted that the section on TypeScript's polymorphic 'this' type refers to the Wikipedia page on Fluent interfaces, and that page lists violations of static type safety among the problems with such interfaces. Basically, This is an existential type (it is known to be a subtype of the enclosing class, and two occurrences of This are known to be the same type given that they denote the dynamic type of the same object), but it is not known statically which subtype of the enclosing class it is. It takes a smart type system to keep it safe. Scala uses path dependent types, an approach which is also known as family polymorphism or virtual classes. In the TypeScript example it would be sufficient to require that methods returning this are overridden in every subtype of BasicCalculator, but the ability to require that is not a standard feature of any common OO type system.