ceylon / ceylon-spec

DEPRECATED
Apache License 2.0
108 stars 34 forks source link

higher order generics #754

Open gavinking opened 11 years ago

gavinking commented 11 years ago

We've often discussed this feature, type constructor parameterization, or kinds, or whatever you want to call it, which boils down to, by analogy with first-class/higher order functions at the value level, adding:

That is, I can define generic types and functions which are parameterized not only by types, but also by type constructors. And I can pass a reference to a type constructor as an argument to such a generic type or function.

I now have an implementation of higher order generics in the typeconstructors branch of the typechecker project. An outline of my approach:

There's one big limitation remaining in the current implementation: I don't have type constructor arg inference, so I have to write shit like this:

List<String>(List<Object>) listToString = toString<@List>(listFunctor);

Instead of just:

List<String>(List<Object>) listToString = toString(listFunctor);

I assume that this problem is tractable.

Otherwise, the implementation was surprisingly easy and straightforward though surely there are still a couple of holes.

Now, the only reason I've even bothered playing around with this shit at this stage is that we have an actual use for it in the metamodel API. We would like to be able to have typesafe model objects for generic declarations. That is, it would be nice to be able to write:

GenericClass<@List> listGenClass = `@List`;
Class<List<String>> stringListClass = listGenClass(`String`);

And have that be all totally typesafe.

UPDATE: Note, we no longer propose to use this stuff for the metamodel!

What I'm not proposing at all is to go and make Iterable and its subtypes into Functors. It would be possible to do this someday, if we decide it's truly useful, but I definitely don't want to do it in Ceylon 1.x.

So, we need to decide if we're going to make this a part of the language. There are strong arguments on both sides, which I'll let others advocate, because I simply can't seem to make up my mind.

gavinking commented 11 years ago

Now, the only reason I've even bothered playing around with this shit at this stage is that we have an actual use for it in the metamodel API. We would like to be able to have typesafe model objects for generic declarations.

Now for the bad news: unfortunately, with the machinery that is there now, I can't yet write a totally generic GenericClass interface. I would need Scala-style GenericClass1, GenericClass2, GenericClass3 up to the blessed GenericClass22, beyond which point are monsters. I think you all know how I feel about that approach ;)

So we would need a way to abstract over type constructors with diff-length type parameter lists. (Analogously to how Callable uses a tuple type to abstract over diff-length value parameter lists.) I don't know how to do this yet.

gavinking commented 11 years ago

I don't know how to do this yet.

So I think there's two related solutions that might work here. The first would be to introduce a type-level spread operator * that spreads a tuple type over the type parameters of a type constructor, just like how our existing spread operator spreads a tuple over value parameters:

shared interface GenericClass<@TheClass, Bounds> 
        given Bounds satisfies Anything[] 
        given TheClass<*Bounds> {
    shared formal Class<TheClass<*Args>> apply<Args>() 
            given Args satisfies Bounds;
}

An alternative approach would be to define that @Map is a type constructor in "cartesian product" form, i.e. a function that accepts a tuple type like [String,Object] and produces the corresponding instantiation, so @Map<[String,Object]> is the type Map<String,Object>. That is, we suck the definition of spreading inside the definition of this type constructor:

shared interface GenericClass<@TheClass, Bounds> 
        given Bounds satisfies Anything[] 
        given @TheClass<Bounds> {
    shared formal Class<@TheClass<Args>> apply<Args>() 
            given Args satisfies Bounds;
}

If we went down that path, it might be interesting to define the same thing at the value level, i.e. that @plus<Float> is a function of type Float([Float,Float]). The @ syntax would become some kind of dual of the spread operator *.

Of the two approaches the second is probably slightly simpler to define/implement, but I suppose it's also a whole new syntactic thing that we don't currently have at the value level. So I guess the type-level spread operator is the smallest change we could make to the language that would let us solve the problem of defining GenericClass.

P.S. Note that there is a new kind of type constraint in the above code examples:

given TheClass<*Bounds>

or:

given @TheClass<Bounds>

says that the types in the tuple type Bounds must be assignable to the type parameters of TheClass. So

GenericClass<@Map,[String,Object]>>

is a well-formed type expression, but these type expressions are not:

GenericClass<@Map,[Object]>>            //error: wrong number of type args to Map
GenericClass<@Map,[String,Anything]>>   //error: type arg Anything not assignable to upper bound on Map.Item
FroMage commented 11 years ago

Syntactically and intuitively I'd say I prefer to be able to spread type arguments over the second alternative, but I fear the consequences that could have…

gavinking commented 11 years ago

Note: the current implementation of higher-order generics has the following "hole":

T<String,Anything> bad<@T>() given T<X,Y> => nothing;

value val = bad<@Map>(); //val has illegal type Map<String,Anything>

However, in light of #535, I guess this is not actually a real problem, or at least it won't be if/when we address #535.

simonthum commented 11 years ago

At the possible expense of looking stupid: What does this whole thing buy you over a ConstructingCallable< ...> extends Callable<...> which every type class implements and maps to its initalizer?

gavinking commented 11 years ago

@simonthum without higher-order generics, it's impossible to define a GenericClass type that acts as the metamodel object for parameterized classes, with an apply() operation that accepts a list of type arguments and produces the Class metamodel object for the instantiation of the generic type.

gavinking commented 11 years ago

Hrm, with the machinery developed so far, I guess it's still impossible to represent generic functions at the meta level. To solve this problem we would also need something like "anonymous" type constructors.

For example, the metatype of the function plus is:

GenericFunction<<T>=>T,<T>=>[T,T],[Summable<Anything>]>

Where GenericFunction is defined like this:

shared interface GenericFunction<@Type,@Arguments,Bounds> 
        given Bounds satisfies Anything[]
        given Type<*Bounds>
        given Arguments<*Bounds> {
    shared Function<Type<*Args>,Arguments<*Args>> apply<Args>() 
            given Args satisfies Bounds;
}

Interestingly, if we solved this problem, we might also be able to assign a denoteable type to generic function references. That is, a function reference like sum, which is currently illegal (we force you to write sum<Float>) could have the type GenericCallable<<T>=>T,<T>=>[T,T]>. I'm not sure if that quite qualifies as higher rank polymorphism, but it's definitely close, if I've got my definitions straight.

All this is technically very tractable, but I'm increasingly inclined to think it simply doesn't belong in this language.

gavinking commented 11 years ago

OTOH, since arbitrary "anonymous" type constructors are going to let you write down undecidable things really easily, the other alternative would be to have the type constructors be derived from the signature of the method via a special syntax, for example:

GenericFunction<@plus,plus(@),[Summable<Anything>]>

Where @plus is the return type constructor, i.e. <Value>=>Value here in this case, and plus(@) is the argument types constructor, i.e. <Value>=>[Value,Value] in this case.

Or whatever. It's very difficult to imagine a syntax for this which would be anything other than totally baffling to most programmers.

gavinking commented 11 years ago

After extensive thinking on this topic, I think I've decided that there us no real value to the typesafe apply() method. If you have a GenericClass<@T> and want to apply it to the types X and Y, you can just as easily write:

 `T<X,Y>`

To get the Class<T<X,Y>> you need. The same is true for generic functions. So I think the whole metamodel rationale for introducing higher order generics is a distraction, and we won't need all that crazy stuff I've been speculating about above.

Of course, the question remains: do we need "basic" type constructor polymorphism, as already implemented today, for some other reason?

gavinking commented 10 years ago

So if we ever do decide to add this stuff, we will have to decide what to do with type aliases. Given:

alias Labeled<Value> => String->Value;
alias Numbered<Value> => Integer->Value;
alias E<Key,Value> => Key->Value;

We would need to either:

  1. prohibit type constructor references for aliases like @Labeled, @Numbered, and @E, or
  2. be able to reason about their equivalence, and determine that @E==@Entry, but @Labeled!=@Numbered and @Labeled!=@Entry.

Surely the easy approach 1 would be reasonable, but it would rule out some genuinely useful things.

But what's interesting about approach 2 is that if we went down that path, it's simply not a big stretch at all to "anonymous" type constructors like this:

<Value> => String->Value

This is, quite clearly, the same type constructor as @Labeled above.

Now this is clearly verging on rank-2 polymorphism! You could write:

shared interface GenericCallable2<@Type, @Args> 
        given Type<X,Y> given Args<X,Y> {
    shared Callable<Type<X,Y>,Args<X,Y>> apply<X,Y>();
}

to represent functions with two unbounded type parameters. And then the type of a reference to this function:

List<X|Y> fun<X,Y>(List<X> xs, List<Y> ys) 

would be:

GenericCallable2<<X,Y>=>List<X|Y>,<X,Y>=>[List<X>,List<Y>]>

Of course there's still a couple of things missing here:

  1. the ability to abstract over the arity of the type parameter list, and
  2. the ability to capture bounds.

So I'm not actually seriously proposing to try and tackle rank-2 polymorphism, just trying to get some insight by showing how it fits in.

RossTate commented 10 years ago

An update: with shapes, we figured out we can do subtyping and joins even with higher-order generics with type lambdas. The question becomes, then, do you want higher-order type arguments to be inferred? I don't know how to solve that problem, unfortunately.

gavinking commented 10 years ago

higher-order generics with type lambdas

A "type lambda" is my "anonymous type constructor", right?

do you want higher-order type arguments to be inferred?

I'm inclined to think that higher-order generics is not very useful without some sort of type constructor inference. I don't believe that people want to write code with lots of type constructor arguments in it. Hell, one of the major things that our type system tries to move away from is the need to write type arguments all over the place...

gavinking commented 10 years ago

P.S. Ross please note that while I do think that it's very interesting to speculate about stuff like:

 GenericCallable2<<X,Y>=>List<X|Y>,<X,Y>=>[List<X>,List<Y>]>

I believe that there is almost nobody out there who actually wants to see something like this in real code. If I were to go about designing a language with the deliberate goal of scaring off newbies, I would make damn sure to include the above in it ;-)

RossTate commented 10 years ago

Heh, I ain't disagreeing. Just updating you with the facts (and yes, anonymous type constructors = type lambdas). As for inference, the only way I can see that working is if we disallow type lambdas and somehow declare what generic classes/interfaces can be used as higher-order generics and with respect to which parameters.

gavinking commented 10 years ago

@RossTate Hrm, yeah, 5 mins thinking about the problem makes me realize that, since the union/intersection tricks aren't applicable here, you can pretty easily get ambiguities. I guess this just doesn't arise with ML-like languages because they don't have subtyping. How the hell does Scala handle this problem?

RossTate commented 10 years ago

Admittedly, I'm not sure how Scala handles the problem cuz it doesn't seem to be well spec'd. I have seen examples with type lambdas (though sometimes indirectly encoded since they seem to be a feature in progress), but I haven't seen anything with higher-kinded-type-argument inference. Regardless, Scala has explicitly stated that it doesn't worry about decidability, so I doubt that their approach to this would really fit with your design philosophy.

gavinking commented 10 years ago

Is it covered by the Scala spec?

RossTate commented 10 years ago

Not from what I could tell.

gavinking commented 10 years ago

Hrmph. Now that's definitely not a good sign.

pthariensflame commented 10 years ago

FWIW, there is a reason even full GHC-extended Haskell doesn't have (unrestricted) type lambdas; they guarantee a Turing-complete type system, even just on their own, unless they are somehow kind-checked in a more sophisticated than usual way (see, e.g., Agda for an example of this latter type of technique).

gavinking commented 10 years ago

Hrm so does that mean parameterization by type alias constructors would also tip you over the edge into turing-completeness? It would, right?

pthariensflame commented 10 years ago

Yup. Type aliases as fully-fledged type constructors (as opposed to something like GHC's LiberalTypeSynonyms system) are exactly equivalent to (in fact, in essence, they are) unrestricted type lambdas.

gavinking commented 10 years ago

@pthariensflame So I had understood @RossTate's slightly cryptic comment above:

An update: with shapes, we figured out we can do subtyping and joins even with higher-order generics with type lambdas.

To be suggesting that with his shapes vs types system, type lambdas turn out to be decidable.

Ross, could you clarify?

pthariensflame commented 10 years ago

I can't find anything on these "shapes" by searching (probably because it's such a generic name :) ). I'll wait for @RossTate to clarify, as well. In the meantime, it's worth noting that there is a very simple restriction on type lambdas that allows decidability (I referred to it earlier as "GHC's LiberalTypeSynonyms system"): A type alias may only be used without parameters if it is being passed to another type alias. The formal way to state this restriction is that an injective higher-order type constructor may only take other injective type constructors as its higher-order parameters (in Ceylon, this is especially convenient to interpret if you mentally replace "injective" with "reified"; they're not really the same underlying concept at all, but, within Ceylon specifically, all reified type constructors are guaranteed, by definition, to be injective, and the only potentially injective non-reified type constructors in Ceylon are, I think, undecidably so).

RossTate commented 10 years ago

The shapes concept (currently in submission but not yet published, though informal discussion can be found in #816) significantly restricts recursion in inheritance. This effectively restricts the computational power of type lambdas enough to ensure that they always terminate.

pthariensflame commented 10 years ago

@RossTate So it seems like a "shape" is just the OO-ish incarnation of a Constraint (á la modern GHC-extended Haskell, or my own (WIP) language Spellcode). Is that an accurate assessment?

RossTate commented 10 years ago

Technical differences aside, the OO-ish incarnation of a Haskell's constraints is bounded polymorphism (which is actually the much older concept). They both restrict a type parameter and with that restriction provide additional abilities to that type parameter. What shapes are is the recognition that some classes/interfaces are only used for constraints, and furthermore essentially all uses of recursive inheritance are only for the sake of constraints.

pthariensflame commented 10 years ago

Ok, I think I get it, then. In that case, why implement it as complex restriction at all? Why not have a separate declaration type with an appropriate initial keyword (shape or bound or something) that would handle the whole pattern for you automatically, and then just lock all the other declaration types away from any type-level recursion at all?

RossTate commented 10 years ago

That can be done as well. It's just easier to retrofit a restriction onto existing code than impose a change in syntax.

gavinking commented 9 years ago

I recently updated this work to current master, it is available in the branch typeconstructors2. To recap the limitations of the current implementation:

  1. No type constructor inference (we have basically no clue how that should work).
  2. No upper bounds on type constructors.
  3. No upper bounds on type constructor type parameters.

For 2/3, what I should be able to do is write stuff like:

void fun<@X>(X<String> strings) 
        given X<T> satisfies Iterable<T> 
        given T satisfies Object {
    for (s in strings) {}
}

And those type constraints should be checked at the call side.

gavinking commented 9 years ago

I have made huge progress on this today. Problems 1, 2, and 3 above are solved, along with several other issues that came up.

shared interface Functor<out Element,out @Container> 
        given Container<Element> {
    shared formal 
    Container<Result> fmap<Result>
            (Result(Element) fun);
}

shared class ListFunctor<Element>(List<Element> list)
        satisfies Functor<Element,@List> {
    shared actual 
    List<Result> fmap<Result>
            (Result(Element) fun)
            => list.collect(fun);
}

Container<String> toString<Element,@Container>
        (Functor<Element,@Container> functor)
        given Container<Element>
        => functor.fmap((e)=> e?.string else "<null>");

void test() {
    List<String> strings = 
            toString(ListFunctor([1, 2, 3])); //@Container is inferred to be @List!

I even solved the problem of forming principal instantiations of types like Functor<Integer,@Sequence> | Functor<Integer,@Singleton>, by introducing variance for higher-order generic types, along with unions and intersections of type constructors. For example, Functor<Integer,@Sequence|@Singleton>.

Thus, | and & introduce a limited form of subtyping for type constructors. It's quite limited, since @Sequence is not considered a subtype of @List, and I'm not sure we would want that, which means that Functor<Integer,@Sequence|@List> can't be canonicalized to Functor<Integer,@List>, but at least it is powerful enough to let us write things like this:

class F1<T>(T t) 
        satisfies Functor<T, @F1> {
    shared actual default 
    F1<Result> fmap<Result>(Result(T) fun)
            given Result satisfies Object 
            => F1(fun(t));
    string => "[``...``]";
}

class F2<T>(T t) 
        extends F1<T>(t) 
        satisfies Functor<T, @F2> {
    shared actual 
    F2<Result> fmap<Result>(Result(T) fun)
            given Result satisfies Object 
            => F2(fun(t));
}

F2<Integer> sizes = F2("hello").fmap(String.size);

Which is extremely useful! Note that here, F2 is a Functor<T, @F1&@F2>, according to the usual rules of principal instantiation inheritance in Ceylon. Nice, huh?

gavinking commented 9 years ago

Oh, and note that all this stuff actually compiles and runs on the JavaScript backend! Try it out!

quintesse commented 9 years ago

Dammit, and here I was hoping that with Ceylon I'd never have to learn (and worse, understand) about Functors, Monoids and Monads ^^

gavinking commented 9 years ago

@quintesse To be clear, no-one is proposing to have for our container types implement Functor. We don't need it. We're talking about Functor because it's the simplest, most well-known example of a higher-order generic type.

Also, monoids have nothing to do with higher-order generics, AFAIK. To represent Monoid you need type classes. This is something we do need, so that we can correctly write down the signature of the sum() and product() functions. But that's #899, not this issue.

gavinking commented 9 years ago

Also, @quintesse, when I explain something like Functor or Monoid to you, you will realize it's something totally simple and trivial and you will wonder what all the fuss is about and why the fuck 1000 Haskell and Scala bloggers were unable to explain such a totally trivial and obvious thing without making it look scary and complicated.

gavinking commented 9 years ago

Well, this is pretty much done now. My major remaining question relates to syntax. Would it be better to write:

The advantage of this syntax would be that it's a bit more elegant for union/intersection type constructors, and that it would extend very naturally to more complex type constructors, for example:

<T> => Iterable<T,Nothing>
<T> => Map<String,T>
<T> => {T+}

Which is stuff we can't currently handle. Also it avoids the use of a cryptic @ character.

The disadvantage is it's more verbose, and requires more work in the typechecker to validate that the type constructor expression is "sane".

If we went down that path, we would get code like:

shared interface Functor<out Element,
            <T> => Container<T>> {
    shared formal 
    Container<Result> fmap<Result>
            (Result(Element) fun);
    shared void accept(Container<Element> cont) {}
}

shared class ListFunctor<out Element>
        (List<Element> list)
    satisfies Functor<Element,
        <T> => List<T>> {
    shared actual 
    List<Result> fmap<Result>
            (Result(Element) fun)
            => list.collect(fun);
}
gavinking commented 9 years ago

Oh, there's another advantage. We would not need the given clause for a type constructor parameter, except when it has constraints—just like for ordinary type parameters.

gavinking commented 9 years ago

Scratch that. If we do #791, then we can use the most natural syntax; the one I started out with, i.e.

shared interface Functor<out Element, out Container> 
        given Container<Elem> {
    shared formal 
    Container<Result> fmap<Result>
            (Result(Element) fun);
    shared void accept(Container<Element> cont) {}
}

shared class ListFunctor<out Element>
        (List<Element> list)
    satisfies Functor<Element, List> {
    shared actual 
    List<Result> fmap<Result>
            (Result(Element) fun)
            => list.collect(fun);
}

And, y'know what: it's less explicit, but I think it makes higher-order generics look much less scary, and much more ceylonic.

FroMage commented 9 years ago

So all that's missing is the ability to reuse Tuple to abstract over higher-kind types of variable arity, right? ;)

gavinking commented 9 years ago

Hahaha. You're funny :-)

FroMage commented 9 years ago

What? I thought this was a pissing contest? ;)

FroMage commented 9 years ago

Isn't it the case that all we'd need is a predefined type to describe what Foo<X,Y...> expands to, like Foo(X,Y...) expands to Callable<Foo,[X,Y]>?

FroMage commented 9 years ago

Then, if Generic is that predefined type, given Container<Elem> means given Container satisfies Generic<[Elem]> and you could even say given Container satisfies Generic<Elems> given Elems satisfies Anytyhing[] ;)

quintesse commented 9 years ago

I must admit I've lost track of how this all works. How is the last example equivalent to what came before?

FroMage commented 9 years ago

It's the given Container<Elem> which declares that Container is a higher-kind type taking one type param.

gavinking commented 9 years ago

@FroMage I don't think it's that simple, no.

quintesse commented 9 years ago

Ok, but that appeared in the first example as well:

shared interface Functor<out Element,out @Container> 
        given Container<Element> {
FroMage commented 9 years ago

Sure, but the @ is superfluous and can be inferred from the given constraint.

FroMage commented 9 years ago

@FroMage I don't think it's that simple, no.

That is more or less what you suggested in the first comments.

quintesse commented 9 years ago

Well it sure makes the syntax nicer to look at. For me perhaps it makes it more difficult to recognize there's any kind of higher-order generics going on at all.