eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0
397 stars 62 forks source link

type classes #4005

Open CeylonMigrationBot opened 10 years ago

CeylonMigrationBot commented 10 years ago

[@gavinking] Scope

The purpose of this proposal is to allow the definition of operations which:

The terminology is not intended to imply that this feature is exactly the same as Haskell's type classes. In particular, this is not a proposal for introductions. A type class in this proposal is specified when the type is defined, not where it is used.

Concept

I would like to be able define a generic sum() function that accepts {Element*}, rather than {Element+} as it does at present. This means we would need to make Summable into a true monoid rather than just a semigroup. We would like to be able to write:

Value sum<Value>({Value*} values) 
        given Value satisfies Summable<Value> {
    variable value sum = Value.zero;
    for (val in values) {
        sum+=val;
    }
    return sum;
}

Ceylon is a perfect language for implementing this kind of thing, since we already have reified generics, so we already pass an object to the type parameter Value when we call sum().

So, conceptually, zero is a member of the metatype of Value, i.e. of the object Value. A very strawman syntax for defining such a setup would be:

interface Summable<Other> of Other
        is Monoid<Other>
        given Other satisfies Summable<Other> {
    shared formal Other plus(Other other);
}
interface Monoid<Other> {
    shared formal Other zero;
}

Now, the constraint given Value satisfies Summable<Value> would additionally imply that Value is Monoid<Value>.

Alternatively, we could move the type class constraint is Monoid<Other> from the definition of Summable to the definition of sum(), i.e.

Value sum<Value>({Value*} values) 
        given Value is Monoid<Value> satisfies Summable<Value> { ... }

but that seems a bit less convenient and would result in more repetition. Whatever.

Anyway, so for a concrete class that actually implements Summable/Monoid, we would wind up with, approximately:

class String() 
        satisfies Summable<String> {
    .... 
}
object String is Monoid<String> {
    zero => "";
}

Not sure what to do to clean up the syntax here. It could also, potentially be a separate object declaration.

[Migrated from ceylon/ceylon-spec#899]

CeylonMigrationBot commented 10 years ago

[@RossTate] Sounds like you want

interface Monoid<Other> {
    shared static formal Other zero;
    shared formal Other plus(Other other);
}

class String satisfies Monoid<String> {
    shared static actual String zero = "";
    shared actual String plus(String string) {...}
}

I know you wanted to get rid of statics, but that's what zero is. We've just made it useful by allowing it to be virtual. Note that there are subtleties here, but I'm offering the basics first.

CeylonMigrationBot commented 10 years ago

[@gavinking] I don't want static because the body of a class is an executable block of code. I do not want to open up the incredible can of worms of why can't I write this:

class String satisfies Monoid<String> {
    shared static actual String zero;
    zero = "";
    shared actual String plus(String string) {...}
}

Remember: static doesn't come on its own. Opening that door lets in the whole nastiness of static initializers.

No, much better to keep your "static"s in a toplevel object which has a separate welldefined initializer.

CeylonMigrationBot commented 10 years ago

[@RossTate] Ah, I hadn't thought about initialization.

CeylonMigrationBot commented 10 years ago

[@RossTate] Another problem is that you may want to change the monoid being used from the default one. My student is working on this via our shapes stuff. So with that (and feel free to change the syntax):

shape Monoid<Other> {
    shared static formal Other zero;
    shared formal Other sum(Other other);
}

class Rational(Integer above, Integer below) satisfies Monoid<Rational> {
    ...
} fills Monoid<Rational> {
    shared static actual Rational zero = Rational(0, 1);
    shared actual Rational plus(Rational other) = Rational(above*other.below + other.above*below, below*other.below);
}

object multiple satisfies Rational.Monoid<Rational> {
    shared Rational zero = Rational(1, 1);
    shared actual Rational this.sum(Rational other) = Rational(above*other.above, below*other.below);
}
CeylonMigrationBot commented 10 years ago

[@gavinking] Interestingly, the following point of view perhaps makes more sense:

interface Monoid<Value> {
    shared formal Value zero;
    shared formal Value add(Value x, Value y);
}
interface Summable<Other> of Other
        given Other is Monoid<Other> satisfies Summable<Other> {
    shared formal Other plus(Other other) => Other.add(this,other);
}

According to this syntax, you can apply the constraint when you define Summable, or when you define sum(). It seems a little more flexible. It would not stop us from having given satisfies Summable<Value> imply Value is Monoid<Value> in the body of sum(), so I think it's more natural.

I'm perfectly happy with this syntax. The bit that has me stumped is how to define the implementation of Monoid for String. Does this feel natural to you guys:

class String() satisfies Summable<String> {
    ...
}

class String is Monoid<String> {
    zero => "";
    add(String x, String y) => String(x.chain(y));
}

The idea is we're kinda splitting the definition of String into two chunks a bit like what we do with get/set pairs.

CeylonMigrationBot commented 10 years ago

[@gavinking]

Another problem is that you may want to change the monoid being used from the default one

I know but I would prefer to not bite that one off, at least not now. Which is why I clarified from the start that this was not a proposal for introductions. Yesyes, I know perfectly well that there are actually two very important monoids for Integer and Float but I have never actually seen people in practice writing code that abstracts over multiplication vs addition.

However, upon reflection, I'm inclined to go there, at least for the purposes of this discussion, since it might help us clarify what the syntax should be. So let's contemplate that we might possibly want to introduce this at some future point.

Well, my proposed syntax which separates the definition of the type class from the definition of the type is actually very easily capable of representing that, and this is another argument in favor of my approach. We could easily add something like this in future:

class Float() satisfies Numeric<Float> {
    ...
}

class Float is Monoid<Float> {
    zero => 0.0;
    sum(String x, String y) => x+y;
}
class FloatMultiplication for Float is Monoid<Float> {
    zero => 1.0;
    sum(String x, String y) => x*y;
}

value sum = sum(floats);
value product = sum<FloatMultiplication>(floats);

To be clear, I'm definitely not advocating that we introduce this now. I'm just trying to use it as a strawman to help motivate the syntax.

CeylonMigrationBot commented 10 years ago

[@RossTate] Yep, makes sense. The one thing I don't like about this syntax is having two separate declarations for Float. Maybe midde-ground that with mine:

class Float() satisfies Numeric<Float> {
    ...
} is Monoid<Float> {
    zero => 0.0;
    sum(String x, String y) => x+y;
}
object multiplication for Float is Monoid<Float> {
    zero => 1.0;
    sum(String x, String y) => x*y;
}

value sum = sum(floats);
value product = sum<Float using multiplication>(floats);
CeylonMigrationBot commented 10 years ago

[@gavinking] This did help me a little.

By analogy with:

interface Foo 
        of foo 
        satisfies Singleton {
    ...
}

Where we have the pattern:

interface <a type> 
        of <a value of that type> 
        satisfies <a supertype of the type>

This, we would have:

interface FloatMonoid 
        of Float 
        satisfies Monoid<Float> {
    ...
}

This to me looks like a reasonable starting point for arguing over the syntax. WDYT?

CeylonMigrationBot commented 10 years ago

[@gavinking]

Yep, makes sense. The one thing I don't like about this syntax is having two separate declarations for Float.

I would have the same concern, if it weren't for the fact that we have already established that pattern with getter/setter declarations:

Foo foo { return ... ; }
assign foo { ... ; }

And, hell, you could even argue that the syntax for enumerated types is similar:

class Boolean() of true | false {}
object true extends Boolean() {}
object false extends Boolean() {}

So I think of this as just an extension of that pattern.

CeylonMigrationBot commented 10 years ago

[@pthariensflame] If Ceylon gets extension methods, then "introductions" will never actually be needed.

CeylonMigrationBot commented 10 years ago

[@gavinking] OK, after a couple of iterations I arrived at this:

    // a type class:
    interface Monoid<Value> {
        shared formal Value zero;
        shared formal Value add(Value x, Value y);
    }

    // an interface constrained to types 
    // that satisfy the type class 
    interface Summable<Other> of Other
            is Monoid<Other>
            given Other satisfies Summable<Other> {
        shared formal Other plus(Other other) 
                => Other.add(this,other);
    }

    // a concrete class that satisfies both 
    // the type class and the interface
    class String() 
            satisfies Summable<String>
            is StringMonoid {
        ...
    }

    //an implementation of the type class
    //for this concrete class
    interface StringMonoid
            satisfies Monoid<String> {
        zero => "";
        add(String x, String y) => String(x.chain(y));
    }

I'm actually pretty happy with this! It preserves the natural meanings of the keywords and avoids introducing any new kinds of declaration. The only really new syntax is the is clause in class/interface declarations.

CeylonMigrationBot commented 10 years ago

[@RossTate] @pthariensflame: Extension methods and introductions or type classes are very different from each other.

@gavinking: Ok, I see you're reasoning about the separation. Then the one thing I don't like is having the overridden implementation be a type. If it's a type, it can be used in all sorts of ways you don't expect. So I think it's better to view it as an object. Conceptually, Monoid<Float> for Float is an interface, and the object implements that interface. Note that your new solution still seems to be mixing up types and objects in a confusing manner.

CeylonMigrationBot commented 10 years ago

[@gavinking]

If Ceylon gets extension methods, then "introductions" will never actually be needed.

I don't think that's right. Extension methods don't let you introduce a new type, they just let you call a function with a diffierent syntax.

CeylonMigrationBot commented 10 years ago

[@pthariensflame] @RossTate I must be misunderstanding what "introductions" are, then. Haskell's type classes (exempting extensions) are exactly covered by this proposal. I thought an introduction was an injection of a method into a type by a type class, which doesn't exist in Haskell (because methods don't exist, at least not in that sense). Am I wrong here? If not, wouldn't an extension method trivially delegating to a type-classed function produce exactly the same effect?

CeylonMigrationBot commented 10 years ago

[@gavinking]

Then the one thing I don't like is having the overridden implementation be a type. If it's a type, it can be used in all sorts of ways you don't expect.

This is certainly the least-motivated thing in what I have right now. The thing that was pushing me in the direction of interfaces here is that if we have a type like String which satisfies multiple type classes, then it has a metatype which is conceptually an intersection of type class types. We already know how to deal with mixing together interfaces, and have well-defined rules surrounding what is meaningful and what is not. Mixing together objects is a whole new thing that I kinda don't want to grapple with.

Does that make sense?

CeylonMigrationBot commented 10 years ago

[@pthariensflame] @gavinking You're not "mixing together objects", you just happen to have more than one instance of a type, which you definitely do already know how to deal with.

CeylonMigrationBot commented 10 years ago

[@gavinking]

I though an introduction was an injection of a method into a type by a type class, which doesn't exist in Haskell (because methods don't exist, at least not in that sense). Am I wrong here? If not, wouldn't an extension method trivially delegating to a type-classed function produce exactly the same effect?

An introduction (terminology I first learned in the context of AOP, but it might be older) means the ability to retrospectively add a whole supertype to a type. It sounds superficially like the sort of thing you can do with typeclasses in Haskell but actually it's broken in various ways.

CeylonMigrationBot commented 10 years ago

[@gavinking]

You're not "mixing together objects", you just happen to have more than one instance of a type, which you definitely do already know how to deal with.

I mean if I have:

class Float() is Monoid<Float> & Ring<Float> & Eq<Float> { ... }

Then the metatype, that is, the type of the expression Float is:

Float() & Monoid<Float> & Ring<Float> & Eq<Float>

I want to, from an implementation point of view, produce an actual value that implements this type, somewhere under the covers.

CeylonMigrationBot commented 10 years ago

[@pthariensflame] Then you want (roughly) what this Haskell code gives you:

{-# LANGUAGE ConstraintKinds, KindSignatures #-}
import Data.Constraint

type DecidableMonoidalRing (a :: *) = ((Monoid a, Ring a, Eq a) :: Constraint)

evidence :: Dict (DecidableMonoidalRing Float)
evidence = Dict

This is, in the proposed Ceylon syntax, just:

alias DecidableMonoidalRing<A> = Monoid<A> & Ring<A> & Eq<A>;

value DecidableMonoidalRing<Float> evidence = Float;

That is, you're just letting the metatype implement all the (meta)interfaces at once. This is just an instance (heh) of the existing confusion with the intersection-like syntax in satisfies clauses. There's no actual problem.

CeylonMigrationBot commented 10 years ago

[@RossTate] @pthariensflame: Extension methods are purely syntactic sugar for calling top-level/static functions/methods.

@gavinking: Okay, I think I see what you're doing. You're taking advantage of the fact that your StringMonoid interface has no undefined methods, so it can be implemented by the String meta-object without any additional specification and it can be mixed with other interfaces being implemented by that String meta-object. If StringMonoid were a class, that wouldn't be possible. It's an interesting exploitation of the existing features. Did I understand you correctly?

CeylonMigrationBot commented 10 years ago

[@pthariensflame] @RossTate Yes, but there's nothing stopping an extension method from delegating to a type class method on the metatype. For example:

public interface Monoid<Value> {
    Value zero;
    Value plus(Value x, Value y);
}

Value plusExtension<Value>(Value this, Value other) given Value is Monoid<Value>
    => Value.plus(this, other);

This gets you pretty much all of the (additional-to-type-classes) benefits of introductions, without basically any of the faults.

CeylonMigrationBot commented 10 years ago

[@RossTate] Huh? You seem to be assuming a way to make a type's metaobject implement an interface, which is what we're trying to figure out here.

CeylonMigrationBot commented 10 years ago

[@gavinking] Please keep in mind that I have the following ulterior goals here:

This needs to be a bit more than syntax sugar, but I want it to be not much more.

CeylonMigrationBot commented 10 years ago

[@pthariensflame] @RossTate So you just declare that that's what happens. You can take a (very minor) leaf from universe-polymorphic type theory here: The metatype of a type is simply a type-level code for a kind. That is, it is a reified kind, in the same sense in which Ceylon already features reified types. Subkinding becomes involved, but that is both conceptually and implementationally no different to the subtyping that Ceylon already knows how to handle. An is clause, of the form @gavinking is proposing, is simply a "lifting" of a satisfies clause to the kind level.

CeylonMigrationBot commented 10 years ago

[@gavinking] Rust, one of the up'n'coming languages for systems programming, has a nice system of type classes (called "traits").

http://pcwalton.github.io/blog/2012/08/08/a-gentle-introduction-to-traits-in-rust/

CeylonMigrationBot commented 10 years ago

[@RossTate] From what's in that article, traits are just interfaces. If David Herman's at POPL next week, I'll double-check with him.

CeylonMigrationBot commented 10 years ago

[@gavinking]

From what's in that article, traits are just interfaces.

Ah ... I'm assuming that the reason they have &self as an explicit parameter to methods is that you can have a method without a &self parameter. I could be wrong.

CeylonMigrationBot commented 10 years ago

[@gavinking] Oh. Looks like they plan it but don't have it yet. Well, it will fit really nicely in with the rest of what they have, when they add it.

CeylonMigrationBot commented 9 years ago

[@Zambonifofex] @GavinKing Sorry, but I'm slightly confused by your example... What does given Other Summable<Other> mean? I'm imagining it's a typo?... And, on StringMonoid, why is zero a Float? Shouldn't it be a String? I really like the is syntax, by the way. I feel like it is pretty clean and fits nicely. I'd say it's creatively thought out, if you asked me... :-)

CeylonMigrationBot commented 9 years ago

[@gavinking]

I'm imagining it's a typo?

Yes, thanks, I fixed it.

Shouldn't it be a String?

Yes of course. Also fixed.

I'd say it's creatively thought out, if you asked me... :-)

Excellent :-)

CeylonMigrationBot commented 9 years ago

[@gavinking] OK, after reflecting on this issue this morning, finally taking into account the changes to the language that resulted from the addition of constructors, along with what is proposed in #4235, I have a completely different proposal. The code would look like this:

shared interface Summable<T> of T
        given T satisfies Summable<T> {

    shared formal static T zero;
    shared formal T plus(T t);

}

shared class Num
        satisfies Summable<T> {

    shared actual static Num zero => Num(0);

    Integer int;
    shared new (Integer int) {
        this.int = int;
    }

    shared actual Num plus(Num num) 
            => Num(int+num.int);

}

shared T sum<T>({T*} ts) 
        given T satisfies Summable<T> {
    value result = T.zero;
    for (t in ts) {
        result += t;
    }
    return result;
}

Some notes on this:

  1. static might not be quite the right word for this, but let's go with it for now. It will work fine, unless we think of something much better.
  2. Only classes with constructors may have static members.
  3. static members if a class must come right at the top of the class before constructors and initialization logic!
  4. static members of interfaces can occur anywhere in the body of the interface.
  5. The usual refinement rules apply to static members!
  6. When a static member is called on an instance (num.zero) or on a type parameter (T.zero), the invocation is polymorphic! When a static member is called on a class or interface (Num.zero) it is resolved statically at compile time.
  7. As a caveat to the last point, I'm not completely sure how we should treat calls like num.zero with respect to polymorphism. It would probably be better to say, at least initially, that implementations of static members in classes may not be formal nor default, and that static members of interfaces may not be directly called, except on a type parameter. That would make this more natural to implement for the JVM, and I guess remove an ambiguity where t.zero and T.zero can be different values in the body of sum(). I need to think this through a little further.
  8. Another—bigger—issue I need to think through is how to control that T isn't a union type at runtime. If Summable were covariant above, then you could get Integer|Float as the runtime value of T. In which case T.zero is ambiguous. So there I guess has to be some kind of constraint that sum() can only be called with a "sufficiently concrete" T. In fact this is closely related to the previous issue 8.

What's great about this proposal is:

  1. It doesn't break the block structure of the language. The rules about what is referenceable within the definition of a static member are exactly the same as for any other member. This was the consideration that constrained the design of constructors, and the solution here is essentially the same. (Control the order.)
  2. It recovers all the beloved (by some) functionality of static in Java, C++, C#, etc.
  3. It solves the problem of abstracting over monoids, which has bothered me for years, simply and cleanly. The code above is easy to read and understand and I didn't have to mention the word "monoid" anywhere. That's perfect.
  4. Much of the machinery for this already exists, since we already had to build it in order to interop with Java statics. The only bit that needs a little bit of creativity is how to call T.zero on a type parameter. But that should be straightforward, since the TypeDescriptor already carries a reference to the class.

I think this would be great, and really put the icing on the cake of our type system.

CeylonMigrationBot commented 9 years ago

[@gavinking] Yeah, so I think there's several ways that this could potentially work:

  1. static members of classes must be "final" (i.e. non-formal non-default) and we will need a new sort of type constraint for type parameters that accept only classes as arguments, i.e. given class T satisfies Summable<T>.
  2. Alternatively, we could use the member itself as a constraint: given T satisfies Summable<T> abstracts zero.
  3. Alternatively, we could prevent types with non-"final" static members from being used as type arguments. So Integer|Float would not be a legal argument to given T satisfies SomeCovariantSelfType<T> since Integer|Float doesn't have a concrete implementation of SomeCovariantSelfType as a supertype. That sounds a bit complicated to explain though.
  4. Finally, one option would be to say that only invariant or contravariant self type interfaces can declare formal statics. Thus, Summable above would be OK, but you couldn't have a covariant type WithZero<out T>. That sounds limiting, but perhaps it is actually a correct model of the problem. I need to dig deeper into this option.

@RossTate any thoughts?

CeylonMigrationBot commented 9 years ago

[@Zambonifofex] @gavinking Hrm. I'm not sure if I like that so much... Personally, I still prefer using the is clause. Your static members proposal makes it so that classes cannot be instances of other classes, or interfaces. Also, the fact that static members need to be at the top of the class to actually make sense feels weird to me. The is syntax feels more intuitive and good looking in my opinion...

CeylonMigrationBot commented 9 years ago

[@RossTate] Once you add static methods, it's not clear to me that Summable makes sense to be a type, which hints at material-shape separation. How does subtyping interact with static methods?

In fact, treating this differently from subtyping solves the constraint problem you're talking about. Suppose Pointed<out T> of T has a static T point method, and String and Integer are each Pointed. With principal instantiation inheritance, we would say that String&Integer is a subtype of Pointed<String&Integer>. The reason this is sound with non-static methods is that we know any instance of String&Integer would have to be an instance of Pointed<String&Integer> due to our inheritance requirements. All the methods are invoked on instances, so we don't have to worry about the situation where there is no instance of String&Integer. But with static methods, that justification no longer applies, which screws up subtyping.

So the more I think about this (I hadn't actually thought of the problems you mentioned before), the more I think we should have a separate construct for this pattern.

CeylonMigrationBot commented 9 years ago

[@tkaitchuck] I don't think the keyword 'static' is a very good one. It is being used quite differently than Java's static. This is bad because it will lead to confusion, and people wanting it to be more like Java's static, which would be a really bad idea. (Java style statics mess up initialization, serialization, IOC, threading, and makes implementing a garbage collector massively harder.)

Perhaps 'Instance' might be a better word? - This has the advantage that it subtly implies points 3 and 4 above.