Kotlin / KEEP

Kotlin Evolution and Enhancement Process
Apache License 2.0

Compile-time Extension Interfaces #87

Closed raulraja closed 3 years ago

raulraja commented 6 years ago

Proposal

The goal of this proposal is to enable compile-time dependency resolution through extension syntax. Overall, we want to enable "contract" interfaces to be declared as program constraints in function or class constructor arguments, and to enable the compiler to automatically resolve and inject those instances, for which evidence must be provided in one of a given set of scopes. The resolution scopes in which to look for that evidence are predefined and ordered; more details are in the proposal.

In the case where no evidence of a required interface (program constraint) is found, the compiler would fail and provide proper error messages, both via IntelliJ IDEA inspections and compile-time errors.

This would bring first-class named extension families to Kotlin. Extension families allow us to guarantee that a given data type (class, interface, etc.) satisfies behaviors (a group of functions or properties) that are decoupled from the type's inheritance hierarchy.

Unlike the traditional subtype style composition where users are forced to extend and implement classes and interfaces, extension families favor horizontal composition based on compile-time resolution between types and their extensions.
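For reference, this style of extension family can only be approximated in current Kotlin with a plain interface plus explicitly passed instance objects (a sketch; Show, IntShow, and render are illustrative names, not part of the proposal):

```kotlin
// A sketch of an extension family in current Kotlin: the behavior lives in an
// interface, and an instance object attaches it to an existing type without
// touching that type's inheritance hierarchy. All names are illustrative.
interface Show<T> {
    fun T.show(): String
}

// Int never declares that it implements Show; the capability is external to it.
object IntShow : Show<Int> {
    override fun Int.show(): String = "Int($this)"
}

// Without compile-time resolution, the instance must be passed explicitly.
fun <T> render(value: T, show: Show<T>): String = with(show) { value.show() }

fun main() {
    println(render(42, IntShow)) // prints Int(42)
}
```

The proposal is precisely about letting the compiler resolve and pass `IntShow` implicitly, instead of the explicit parameter shown here.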

Current code status

Thanks to @truizlop and @jorgecastilloprz, we already have a working POC covering a big part of KEEP-87. There are still some additional behaviors we'd need to implement that have been clarified in the KEEP, though, and we'd love to get JetBrains' help iterating on those. Instructions on how to test it are provided in the "How to try?" section of the KEEP, where we have also provided a code sample.

Rewording

We have revamped the initial proposal, given that type classes are not really our aim here, but rather compile-time dependency resolution for Kotlin. The nature of the proposal has not changed a lot; it has just undergone a major refactor so that it is more in line with what a generic language feature like this one would be, and so that it fits the compiler implementation feedback we've received from some JetBrains devs over the last few months.

We believe the compiler could support compile-time dependency resolution out of the box, as other modern languages already do. We believe some infrastructure patterns would fit better as built-in language features, so users get a canonical solution already provided and can focus their work on the actual domain requirements and business logic of their programs. We've seen similar things in other languages that provide built-in support for concurrency and other concerns.

How to contribute

Want to help bring compile-time extension resolution to Kotlin? A fork is being provisioned where a reference implementation based on this proposal will take place at https://github.com/arrow-kt/kotlin

elizarov commented 6 years ago

The proposed syntax for companion constraints looks a bit confusing. Take a look at this example:

typeclass Monoid {
    fun combine(b: Self): Self 
    fun Self.Companion.empty(): Self
}

It is not perfectly clear from this declaration what the receiver types of the combine and empty functions are. They are, in fact, Self for combine and Self.Companion for empty, but the syntax is not regular enough to make that clear.

I was thinking a bit on how to fit companion-extension syntax in a nicer way and came up with the following alternative syntactic approach.

Let's eschew the idea that extension declaration is a way to group multiple extension functions together, but instead leave the regular extension function syntax intact inside both typeclass and extension scopes like this:

typeclass Monoid {
    fun Self.combine(b: Self): Self
    fun Companion.empty(): Self
}

Here, in the context of typeclass declaration (and only there) we treat Self and Companion as context-sensitive keywords that refer, respectively, to the actual class that this typeclass applies to and its companion (regardless of how its companion is in fact named). Now, when declaring the actual extension we'll have:

extension Int : Monoid {
    fun Int.combine(b: Int): Int = this + b
    fun Int.Companion.empty(): Int = 0
}

In this approach the extension block serves just as a wrapper and an additional safety net. It does not bring any scope with it. All the functions declared inside extension are treated normally (as if they were at the top level), but the compiler will complain if they do not match the signatures of the corresponding type class.

Unfortunately, this proposal also opens the whole can of worms of the companion-vs-static debate. One of the issues is the fact that not all classes in Kotlin (and even in the Kotlin stdlib) have companions, and there is currently no way to declare a companion outside of the class. Moreover, this is hardly possible to fix or work around due to the dynamic nature of the JVM.

One solution to this conundrum is to stop piggybacking onto companion objects at all. Let us, instead, declare functions on the typeclass itself, like this:

typeclass Monoid {
    fun Self.combine(b: Self): Self
    fun empty(): Self
}

In this approach typeclass does declare a scope in a similar way to interface declaration. We interpret combine as a function with two receivers (an instance of the typeclass and Self), while empty has only one receiver (an instance of typeclass). In fact, it meshes perfectly with how the typeclasses are going to be represented on JVM anyway. Now the extension declaration is going to look like this:

extension Int : Monoid {
    fun Int.combine(b: Int): Int = this + b
    fun empty(): Int = 0
}

The interesting thought direction here is that this kind of extension mechanism can work as a replacement for companion objects altogether. We might just deprecate companions, because any declaration of the form:

class Foo {
    companion object {
       /** body **/
    }
}

can be replaced, for the same effect, with

class Foo {
    extension Foo {
       /** body **/
    }
}

or, if access to the private members of Foo is not needed, it can be moved outside (which is not possible with companion objects):

class Foo {
}

extension Foo {
    /** body **/
}

It is cleaner and more modular, since you are no longer limited to one companion object, but can define as many as you want.

However, with this alternative-to-companion approach to typeclasses we have to work out how typeclass instances are going to be named, and what the syntactic approach is to explicitly specify this name (this is needed, at least, for JVM interop). We can also try to somehow reuse the companion modifier for extension blocks by imbuing it with additional syntax. Lots of things to think about.

raulraja commented 6 years ago

@elizarov makes perfect sense. I'll update the proposal to the new proposed syntax

mikehearn commented 6 years ago

I realise all this stuff comes from Haskell and people who love category theory, but I strongly suggest that to gain better acceptance you drop the conventional FP names for things and use more appropriate names that are less jargony and more Java-like:

Monoid -> Combinable
Functor -> Mappable

etc

raulraja commented 6 years ago

@mikehearn those are just examples and not part of the proposal. Anyone is free to name their own typeclasses whatever they want; this proposal does not address what names users will choose, nor does it propose new types for the stdlib. Also, the examples don't come from Haskell. The names and type classes I used as examples are by now part of many libraries and languages, including libraries in Kotlin.

dnpetrov commented 6 years ago

How are type class implementations (extensions) resolved? What happens when two implementations of a type class are in conflict?

Elaborating on the example with ProdConfig and TestConfig, suppose I have

package monoid

typeclass Monoid {
    fun Self.combine(b: Self): Self
    fun empty(): Self
}

and, for example,

package org.acme

extension Int : Monoid {
    fun Int.combine(b: Int): Int = this + b
    fun empty(): Int = 0
}

and

package org.fizzbuzz

extension Int : Monoid {
    fun Int.combine(b: Int): Int = this * b
    fun empty(): Int = 1
}

Now, in package myapp, I want to use one particular implementation of Monoid. What should I import?

raulraja commented 6 years ago

@dnpetrov an import matching the contained instances would bring them into scope:

import org.fizzbuzz.*

In the case where you have two conflicting instances, the compiler should bail out because they are ambiguous, and the user needs to disambiguate them.

import org.fizzbuzz.*
import org.acme.*

val x: Int = 1.combine(2) // Ambiguous instances for `Monoid: Int`. Please redefine `extension Monoid: Int` in the current scope or use tagged instances if they provide different behavior.

A natural way to disambiguate may be redefining the instance in the current scope, which would take precedence over the imported one. If the compiler generated synthetic names for the instances, or supported users naming them, you could just import the FQN.

import org.fizzbuzz.monoidIntAdd 

Another approach would be to place further constraints on the instances, as is done today with the <A : Any> trick, which is used to ensure non-nullable values are always passed as arguments to functions.

typeclass Add : Int
typeclass Multiply : Int

extension Int : Monoid : Add {
    fun Int.combine(b: Int): Int = this + b
    fun empty(): Int = 0
}

extension Int : Monoid : Multiply {
    fun Int.combine(b: Int): Int = this * b
    fun empty(): Int = 1
}

fun <Tag, A: Monoid: Tag> combine(a: A, b: A): A = a.combine(b)

combine<Add, Int>(1, 1) // 2
combine<Multiply, Int>(1, 1) // 1
combine(1, 1) // Ambiguous instances for `Monoid: Int`. Please redefine `extension Monoid: Int` in the current scope or use tagged instances if they provide different behavior.
elizarov commented 6 years ago

Let's assume the example by @dnpetrov

package org.acme

extension Int : Monoid {
    fun Int.combine(b: Int): Int = this + b
    fun empty(): Int = 0
}

It is pretty straightforward with star imports:

import org.acme.*

To disambiguate extensions in the case of a star import from the other package, I suggest using the simple name of the class the extension is being defined for, like this:

import org.acme.Int

This will bring into scope all Int extensions defined in this package.

This approach has a nice and quite well-intended side effect: if a class and its extension are both defined in the same package, then a qualified import of the class automatically brings into scope all the extensions defined for that class in the same package. We can even go as far as to forbid redefining class extensions in other packages if they are defined in the original package. This would improve predictability and is quite consistent with class member vs extension resolution in Kotlin (a class member, being defined by the class author, always takes precedence).

However, if there is no extension in the class's original package, then different clients can define and/or import different extension instances for this class with the same typeclass.

Also note that a generic method with typeclass upper bounds should have an implicit instance of the typeclass in scope behind the scenes (actually passed to it by the invoker), and this instance should always take precedence.

elizarov commented 6 years ago

The interesting side-effect of this proposal is that we can almost deprecate reified modifier and reimplement it with type-classes instead (not really):

typeclass Reified {
    val selfClass: KClass<Self>
}

Now a function that was doing something like:

fun <reified T> foo() { .... T::class ... }

can be replaced with:

fun <T : Reified> foo() { .... T.selfClass ... }

This analogy might help in understanding how typeclasses shall behave in various scenarios. In particular, generic classes with typeclass upper bounds shall be supported if and only if we roll out support for generic classes with reified type parameters.

raulraja commented 6 years ago

@elizarov that will simplify most of our code a ton, since we are currently relying on inline <reified A>... all over the place, and we frequently find places inside generic classes where we can't use those methods with the class's type args.
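The limitation mentioned above can be demonstrated in current Kotlin (a sketch; className and Box are made-up names for illustration):

```kotlin
// Reified type parameters only work on inline functions, so the type argument
// is known at each call site.
inline fun <reified T : Any> className(): String = T::class.simpleName ?: "?"

class Box<A : Any> {
    // The line below does not compile: A is a class type parameter and cannot
    // be reified, so reified helpers are unusable with the class's type args.
    // fun describe(): String = className<A>()
}

fun main() {
    println(className<String>()) // prints String
}
```

A `Reified`-style typeclass constraint, as sketched by @elizarov, would carry the `KClass` evidence as an instance instead, which is why it composes with generic classes where `reified` cannot.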

raulraja commented 6 years ago

Updated the proposal with a section on overcoming some of the limitations of inline reified with typeclass constraint evidence, as demonstrated in https://github.com/Kotlin/KEEP/pull/87#issuecomment-333915196

dnpetrov commented 6 years ago

What if I have multiple type class implementations for the same type in the same package? Like

package org.acme.bigint

class BigInt { ... }

extension BigInt : Monoid { ... }
extension BigInt : Show { ... }

I'd expect extensions to have names or something. Implicit imports look somewhat fragile. You don't know what else you bring into the scope.

Maybe even just follow familiar class inheritance syntax:

typeclass Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

extension AdditiveInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

Then typeclass becomes just a special kind of interface, and extension - a special kind of object, with its members implicitly imported in file scope, and all regular resolution rules just work as they should. Note that you also get multi-parameter type classes for free.

dnpetrov commented 6 years ago

With parametric type classes, type class dependent functions might need some different syntax, like:

fun <T> combineWith(others: Iterable<T>): T where Monoid<T> { ... }
raulraja commented 6 years ago

@dnpetrov As for the syntax, that makes sense. It is similar to what I originally proposed, since it is also necessary to support the fact that certain type classes don't target a single type but target a combination of many. For example, a natural transformation to go from one type to another:

typeclass FunctionK<F<_>, G<_>> {
  fun <A> invoke(fa: F<A>): G<A>
}

extension Option2List : FunctionK<Option, List> {
  fun <A> invoke(fa: Option<A>): List<A> =
    fa.fold({ emptyList() }, { listOf(it) })
}

In the case above the instance targets neither Option nor List, but those are part of the type args, and folks should be able to summon the instance somehow if they can prove that such an instance exists for both types. I'm not sure how that case could be expressed with extension syntax.

@dnpetrov In your example above an alternative way to express combineWith could be:

fun <T> combineWith(others: Iterable<T>, instance MT : Monoid<T>): T = ...

That would allow call sites to replace the instance by explicitly providing one if needed so the method could be invoked in a similar way as:

combineWith(others) //compiles if Monoid<T> instance is found and fails if not.
combineWith(others, customInstance) // always compiles

That is also an alternative way to handle ambiguity when collisions exist, besides using concrete imports.

@elizarov @dnpetrov thoughts? Thanks!

dnpetrov commented 6 years ago

@raulraja Looks good. Need some time to reshuffle my thoughts and probably show this to the owner of Resolution & Inference. @erokhins pls take a look...

elizarov commented 6 years ago

The latest syntactic proposal by @dnpetrov for typeclass declaration is indeed more general and is easier to adapt to the case of typeclasses binding multiple types.

However, I don't like the need to name typeclass instances, and even more so the need to declare and name typeclass instances in function signatures. It is just boilerplate. IMHO, it defeats the whole purpose of having type classes as a language feature, since you can already do all that.

At this point I would recommend to (re)read Cedric's blog post on ad-hoc polymorphism. In our example, we can already write just:

interface Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

then write

object AdditiveInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

then declare functions with an explicit type class instance:

fun <T> combineWith(others: Iterable<T>, MT : Monoid<T>): T = ...

It is even shorter than the latest syntactic proposal for typeclasses, since I don't have to add the instance modifier before MT. So, what is the value of the type classes proposal then?

IMHO, the only valuable benefit of having type classes directly supported in the language is making them easier to grasp for beginners by capitalizing on the mental model of interfaces on steroids (aka Swift protocols), and eschewing as much boilerplate as we can in the process. Having to name your type class instance is just boilerplate; having to pass your type-class instances explicitly is just boilerplate. Neither should be required.

cbeust commented 6 years ago

A nitpick, but in the examples I think you should be more specific about the names of the Monoid types, e.g. MonoidAddition, MonoidMultiplication, etc.

raulraja commented 6 years ago

@elizarov Kategory uses interfaces for typeclasses now and does it like that. The issue is that:

fun <T> combineWith(others: Iterable<T>, MT : Monoid<T>): T = ...

Requires explicitly passing the instance manually, which defeats the purpose of type classes. Having a keyword to require the compiler to verify that call sites such as combineWith(listOf(1)) compile because AdditiveInt is in scope is the real feature, because we get compile-time verification of constraints.

I agree, though, that the variant based on extensions is much easier to grasp for beginners. We could perhaps encode all the complex cases where typeclasses require multiple types if typeclasses allowed type args:

typeclass FunctionK<G<_>> {
  fun <A> Self<A>.convert(): G<A>
}

extension Option : FunctionK<List> {
  fun <A> Option<A>.convert(): List<A> =
    this.fold({ emptyList() }, { listOf(it) })
}

val x: List<Int> = Option(1).convert() // [1] 
val y: List<Int> = None.convert() // [ ] 

And in a polymorphic context:

fun <G<_>, F<_>: FunctionK<G>, A> convert(fa: F<A>): G<A> = fa.convert()
val l: List<Int> = convert(Option(1))
val s: Set<Int> = convert(Option(1)) // No `Option: FunctionK<Set>` extension found in scope.
elizarov commented 6 years ago

Having played more with @dnpetrov's syntactic proposal, I am starting to like certain parts of it. First of all, the generic-like declaration for the interface indeed looks more concise than my original proposal with the Self keyword. Being able to eschew the "magic" nature of Self is also good.

Moreover, instead of having to reserve a new typeclass keyword, we can just be explicit about the fact that it is just an interface, but tag this interface with an extension modifier, as is customary in Kotlin's syntactic tradition:

extension interface Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

This syntax also scales to multiple types (and, maybe HKTs, but I will not cover them here). For example, you can have:

extension interface Bijection<A, B> {
    fun A.toB(): B
    fun B.toA(): A
}

However, as soon as you allow applying this syntax to multiple types, the story with "standalone" (static? but not really) functions like empty becomes complicated. We do want those functions to extend "the type", and we want them to be applicable on the type itself (like companion object functions for the type). You should be able to use the Int.empty() expression in the context of Monoid<Int>, but with multiple types it becomes unclear which type is being extended this way.

In order to overcome this difficulty, we can use some fresh syntactic construct to bind the type to its typeclass. Look at how the dot (.) can work for this purpose and how it is consistent with Kotlin's approach to extension functions:

extension interface T.Monoid {
    fun T.combine(b: T): T
    fun empty(): T
}

In this context, T will be called "an extension receiver type" (that is, the type that receives the extension).

Note that we can allow adding additional types in <...> after the extension interface name, and allow "receiver-less" extensions like Bijection, too.

Similarly, the declaration of typeclass instances can readily follow the model of Kotlin's companion object. The name can be optional, too, so that an instance without a name can be written like this:

extension object : Int.Monoid {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

Since we have indicated that Int is an "extension receiver type", it becomes clear and syntactically transparent why one can write Int.empty() in the context of this extension object. It is also clear how to provide an explicit name, if needed.

Alas, this notation makes it less obvious how to specify typeclass constraints for functions. The interface-inspired syntax of T : Monoid does not look consistent with the declaration of Monoid anymore, and breaks altogether with receiver-less type classes like Bijection<A, B>. I don't think that reusing a where clause works either, but we can introduce a new clause for this purpose, for example:

fun <T> combineWith(others: Iterable<T>): T given T.Monoid = ...
dnpetrov commented 6 years ago

@elizarov

However, as soon as you allow applying this syntax to multiple types, the story with "standalone" (static? but not really) functions like empty becomes complicated. We do want those functions to extend "the type" and we want them being applicable on the type itself (like companion object functions for the type). You should be able to use Int.empty() expression in the context of Monoid<Int>, but with multiple types it becomes unclear which type is being extended this way.

This approach has some issues with overloading. E.g., what if I have several different type classes (extension interfaces, whatever) for T with the method empty()? I suppose it should rather be more explicit, as in https://github.com/Kotlin/KEEP/pull/87#issuecomment-334096622, where the type class implementation is passed as a named parameter which has a special default value resolution strategy.

I'd expect something like:

extension interface Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

extension object AdditiveInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

extension object StringConcatenation : Monoid<String> {
    override fun String.combine(b: String): String = this + b
    override fun empty(): String = ""
}

fun <T> T.timesN(n: Int, extension monoid: Monoid<T>): T {
    var result = empty() // 'monoid' is an implicit dispatch receiver
        // or, if you need to disambiguate, 'monoid.empty()'
    for (i in 0 until n) {
        result = result.combine(this) // 'monoid' is an implicit dispatch receiver, 'combine' is a member extension
          // or 'monoid.run { result.combine(this) }'
    }
    return result
}

fun test() {
    assertEquals("abcabcabcabc", "abc".timesN(4))
}
dnpetrov commented 6 years ago

Now, when I look at my example above, it starts to remind me of "extension providers" from Xtend (https://www.eclipse.org/xtend/documentation/202_xtend_classes_members.html#extension-methods). Probably we'll also need 'extension val's.

@raulraja @elizarov I think we need a more practical motivating example. Yeah, all those Monoids and BiFunctions and HKTs thrown in here and there are nice. Can we think of a concise, meaningful piece of code to demonstrate why type classes (extension interfaces) make life better? Like HTML builders for extension lambdas, for example.

d3xter commented 6 years ago

One possible motivation could be ad-hoc polymorphism as described by Cedric. An example is limiting the possible values put into a JSON document.

For example, in klaxon, to create a new JSON object you have the following method: fun obj(vararg args: Pair<String, *>): JsonObject

This basically allows anything to be put inside. Other JSON libraries like json-simple have many methods called putString/putInt and so on.

With typeclasses we can define a JSONValue:

typeclass T.JSONValue {
    fun T.toJSON(): String
    fun fromJSON(s: String): T
}

extension object : Int.JSONValue {
    fun Int.toJSON() = this.toString()
    fun fromJSON(a: String) = a.toInt()
}

Then all you need inside your JSON library is a method like this:

fun <T> JsonObject.put(key: String, value: T) given T.JSONValue { ... }
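In current Kotlin, the same idea can only be approximated by passing the instance explicitly (a sketch; JsonValue, IntJson, and put are illustrative names, and put returns a String fragment here just to keep the example self-contained):

```kotlin
// A sketch of the JSONValue typeclass in current Kotlin: a plain interface
// plus an explicit instance object. All names are illustrative.
interface JsonValue<T> {
    fun T.toJson(): String
    fun fromJson(s: String): T
}

object IntJson : JsonValue<Int> {
    override fun Int.toJson(): String = toString()
    override fun fromJson(s: String): Int = s.toInt()
}

// The library method, with the instance as an explicit parameter instead of
// the proposed `given T.JSONValue` clause.
fun <T> put(key: String, value: T, jv: JsonValue<T>): String =
    with(jv) { "\"$key\": ${value.toJson()}" }

fun main() {
    println(put("age", 91, IntJson)) // prints "age": 91
}
```

With the proposed feature, `put("age", 91)` would compile only when a JSONValue instance for the value's type is in scope, which is exactly the "limiting what can go into a JSON" guarantee.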

raulraja commented 6 years ago

@dnpetrov here are a few:

Type classes cover many use cases, here are a few interesting ones syntax aside:

Compile-Time Dependency Injection

package common

extension interface Dependencies<T> {
    fun config(): Config
}

package prod

extension object ProdDeps : Dependencies<Service> {
    fun config(): Config = ProdConfig
}

package test

extension object TestDeps : Dependencies<Service> {
    fun config(): Config = TestConfig
}

import test.*

Service.config() // TestConfig

import prod.*

Service.config() // ProdConfig

In the example above we used config, but you can apply a similar pattern to the Android Activity, whose management is one of the biggest issues for developers.
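For comparison, achieving the same effect in current Kotlin requires threading the dependency explicitly (a sketch; Config, Dependencies, ProdDeps, TestDeps, and startService are illustrative names):

```kotlin
// Current-Kotlin sketch: the "instance in scope" becomes an explicit
// parameter, so swapping prod/test behavior means passing a different object.
data class Config(val name: String)

interface Dependencies {
    fun config(): Config
}

object ProdDeps : Dependencies {
    override fun config(): Config = Config("prod")
}

object TestDeps : Dependencies {
    override fun config(): Config = Config("test")
}

// Every entry point must accept and forward the dependency by hand.
fun startService(deps: Dependencies): String = deps.config().name

fun main() {
    println(startService(TestDeps)) // prints test
    println(startService(ProdDeps)) // prints prod
}
```

The proposal replaces the manual threading with resolution by import scope, while keeping the same compile-time guarantee that some instance exists.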

Type restrictions for safer operations

extension interface Numeric<T> {
    fun T.plus(y: T): T
    fun T.minus(y: T): T
}

extension interface Fractional<T> : Numeric<T> {
    fun T.div(y: T): T
}

extension object : Numeric<Int> {
    override fun Int.plus(y: Int): Int = this + y
    override fun Int.minus(y: Int): Int = this - y
}

extension object : Fractional<Double> {
    override fun Double.plus(y: Double): Double = this + y
    override fun Double.minus(y: Double): Double = this - y
    override fun Double.div(y: Double): Double = this / y
}

fun <T> add(x: T, y: T, extension N: Numeric<T>): T = x.plus(y)
fun <T> div(x: T, y: T, extension F: Fractional<T>): T = x.div(y)

add(1, 1) // 2
add(1.0, 1.0) // 2.0
div(3.0, 5.0) // 0.6
div(3, 5) // No `Fractional<Int>` instance found in scope.
// In Kotlin today, 3 / 5 == 0, losing precision.
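The same example, encoded in current Kotlin with explicit instance parameters (a sketch; the operation names are changed from plus/minus/div to add/sub/divide, because Int and Double already have operator members with the original names that would take resolution precedence):

```kotlin
// Current-Kotlin sketch of the Numeric/Fractional hierarchy with explicit
// instances. Names are illustrative.
interface Numeric<T> {
    fun T.add(y: T): T
    fun T.sub(y: T): T
}

interface Fractional<T> : Numeric<T> {
    fun T.divide(y: T): T
}

object IntNumeric : Numeric<Int> {
    override fun Int.add(y: Int): Int = this + y
    override fun Int.sub(y: Int): Int = this - y
}

object DoubleFractional : Fractional<Double> {
    override fun Double.add(y: Double): Double = this + y
    override fun Double.sub(y: Double): Double = this - y
    override fun Double.divide(y: Double): Double = this / y
}

fun <T> add(x: T, y: T, n: Numeric<T>): T = with(n) { x.add(y) }
fun <T> div(x: T, y: T, f: Fractional<T>): T = with(f) { x.divide(y) }

fun main() {
    println(add(1, 1, IntNumeric))           // prints 2
    println(div(3.0, 5.0, DoubleFractional)) // prints 0.6
    // There is no Fractional<Int> instance to pass, so div(3, 5, ...) cannot
    // be written: the restriction holds today too, just with manual plumbing.
}
```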

Type-safe composable encoders / decoders and reflectionless introspection

Expanding on what @d3xter mentioned in https://github.com/Kotlin/KEEP/pull/87#issuecomment-334422411

sealed class JValue {
    object JNull: JValue()
    class JString(s: String): JValue()
    class JDouble(num: Double): JValue()
    class JDecimal(num: BigDecimal): JValue()
    class JInt(num: Int): JValue()
    class JBool(value: Boolean): JValue()
    class JObject(obj: List<JField>): JValue()
    class JArray(arr: List<JValue>): JValue()
}

typealias JField = Pair<JValue.JString, JValue>

extension interface Writes<T> {
  fun writes(o: T): JValue
}

extension interface Reads<T> {
  fun reads(json: JValue): T
}

extension interface Format<T> : Reads<T>, Writes<T>

//Primitive to JValue instances omitted for simplicity

fun <T> T.toJson(instance tjs: Writes<T>): JValue = tjs.writes(this)

fun <T> JValue.to(instance fjs: Reads<T>): T = fjs.reads(this)

data class Person(val name: String, val age: Int)

extension object : Writes<Person> {
    fun writes(o: Person): JValue = JObject(listOf(
        "name".toJson() to o.name.toJson(), // fails to compile if no `Writes<String>` is defined
        "age".toJson() to o.age.toJson() // fails to compile if no `Writes<Int>` is defined
    ))
}

Person("William Alvin Howard", 91).toJson() // if both `Writes<String>` and `Writes<Int>` are defined: { "name": "William Alvin Howard", "age": 91 }

The most important use case of them all is that users and library authors can write code with typeclasses and default instances providing complete solutions to their users. At the same time users have the power to override library exposed typeclasses behavior at any time by providing other instances in scope. In the example above if you need keys serialized with Pascal Case style the user can just override the instance and leave the rest of the library intact:

Defined in user code (not in the json library), this implementation capitalizes the keys.

extension object : Writes<Person> {
    fun writes(o: Person): JValue = JObject(listOf(
        "Name".toJson() to o.name.toJson(), // fails to compile if no `Writes<String>` is defined
        "Age".toJson() to o.age.toJson() // fails to compile if no `Writes<Int>` is defined
    ))
}

Person("William Alvin Howard", 91).toJson() // { "Name": "William Alvin Howard", "Age": 91 }

The library author did not have to add specific support for Pascal Case, because overriding a tiny bit of functionality is easy enough. Entire systems and libraries can be composed with typeclass evidence, deferring dependencies to users while providing reasonable defaults that are verified at compile time, therefore making programs more secure and making it easier to avoid runtime reflection, costly initialization of dependency graphs, etc.

I'll be happy to provide more use cases if needed, but I believe the controlled and safe power type classes would bring to Kotlin is a great step forward in terms of compile-time verification, type safety, and performance for many common patterns like the ones demonstrated above.

There is also the use case of supporting typed FP with all the abstractions we used in the previous examples, but I will not bring those examples here, because type classes are much more than FP. They are all about polymorphism and evidence-based compile-time verification, opening the door to many other use cases, most of them in trivial tasks users face in daily programming.

elizarov commented 6 years ago

@dnpetrov I would suggest that "type extensions" like empty in Monoid are added to the scope of the corresponding type, so the examples below should just work:

fun <T> T.timesN(n: Int): T given T.Monoid {
    var result = T.empty() // because T is a Monoid.
    for (i in 0 until n) {
        result = result.combine(this) // again, because we are given that T is Monoid
    }
    return result
}
raulraja commented 6 years ago

@elizarov is :T given T.Monoid<T> needed or would it be :T given Monoid<T> ?

elizarov commented 6 years ago

@raulraja That was a typo. Fixed, thank you. You can use either :T given Monoid<T> or :T given T.Monoid depending on the definition of Monoid -- either as extension interface Monoid<T> or extension interface T.Monoid.

The problem with the former, as was discussed before, is that it is not clear how to disambiguate functions like empty without Scala-like implicit parameters, which I particularly despise, because they wreck the symmetry between declaration and use of a function.

With the latter declaration, given T.Monoid, we can use T.empty() to disambiguate. The open questions, which I cannot answer without a larger design discussion and analysis of use cases, are whether we should support unqualified empty() at all, and whether a declaration of Monoid<T> (without a receiver type) with a receiver-less empty() function should even be allowed.

elizarov commented 6 years ago

There are more open questions in this whole design, though. They mostly pertain to various corner cases, ambiguities, and the ways to disambiguate them. We can ponder them abstractly, but in reality we need a prototype to play with to figure it out.

For example, consider the case of multiple typeclass implementations for the same type. As was pointed out by @dnpetrov, we can have both an additive and a multiplicative monoid on integers. In this case, we should clearly give them distinct names. Here I'm using the syntactic tradition established by named/unnamed companion objects in Kotlin:

extension object AdditiveInt : Int.Monoid {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

extension object MultiplicativeInt : Int.Monoid {
    override fun Int.combine(b: Int): Int = this * b
    override fun empty(): Int = 1
}

How do we bring those into scope and use them? First of all, it clearly makes sense to require their explicit import, although this somewhat conflicts with the notion that named companion objects still appear in the scope of their parent classes implicitly.

So, importing them into global scope can be accomplished via import AdditiveInt.*. Importing them into local scope can be accomplished using with(AdditiveInt) { ... }. Here, we'll have to strengthen the notion that extension objects are objects just like any other Kotlin object: you can refer to their instance by name and bring them into scope via extension lambdas.
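The with(...) part of this story already works in today's Kotlin, since an object's member extension functions become callable when the object is a receiver. A minimal sketch using the names above (Monoid as a plain interface; sumOf and productOf are hypothetical helpers, not from the proposal):

```kotlin
// A plain interface standing in for the proposed extension interface.
interface Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

// Two named instances for Int, mirroring the proposal's examples.
object AdditiveInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

object MultiplicativeInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this * b
    override fun empty(): Int = 1
}

// with(...) brings one instance into local scope, selecting the behavior;
// inside the lambda, empty() and combine resolve against that receiver.
fun sumOf(xs: List<Int>): Int = with(AdditiveInt) {
    xs.fold(empty()) { acc, x -> acc.combine(x) }
}

fun productOf(xs: List<Int>): Int = with(MultiplicativeInt) {
    xs.fold(empty()) { acc, x -> acc.combine(x) }
}
```

Only the proposed compile-time resolution is missing here: the instance must be named explicitly at every use site.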

Now, this sheds a completely new light on the given XXX syntax. Take a closer look at the signature of this function's declaration:

fun <T> T.timesN(n: Int): T given T.Monoid { ... }

This is a function with multiple receivers! It has an extension receiver of type T and a secondary receiver of extension interface type T.Monoid. Now is a good moment to note that the need to support multi-receiver functions in Kotlin has been popping up now and then. There were even some concrete proposals and (I think) even prototypes, but they never materialized, partly due to the lack of a clear (non-ugly) syntax to declare such multi-receiver functions.

Here we have a chance to unify it all under a common and clear syntax. So far, I see no reason why the given clause should be limited to extension interfaces (typeclasses). There are all the reasons to allow arbitrary types in the given clause.

Note that this kind of support for the given clause (that is, support for functions with multiple receivers) is quite a powerful feature in itself that can solve many of the use cases that were outlined as "typeclass use cases", but in reality are just use cases for functions with multiple receivers.
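For comparison, today's closest approximation of such a multi-receiver function puts the would-be given receiver in the ordinary extension-receiver slot, demoting the value to a plain parameter. A sketch (timesN mirrors the signature above; the names are hypothetical, not from the KEEP):

```kotlin
interface Monoid<T> {
    fun T.combine(b: T): T
    fun empty(): T
}

object AdditiveInt : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

// Approximation of `fun <T> T.timesN(n: Int): T given T.Monoid`:
// the typeclass is the extension receiver, the value a parameter.
fun <T> Monoid<T>.timesN(value: T, n: Int): T {
    var acc = empty()
    repeat(n) { acc = acc.combine(value) }
    return acc
}
```

The proposal's contribution is letting the value stay the receiver (value.timesN(n)) while the Monoid receiver is captured implicitly at the call site.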

elizarov commented 6 years ago

Now, it becomes apparent that the notion of a type class or extension interface is superfluous for many real-life use cases. For example, with support for the given clause at hand, Bijection can be declared as a regular generic interface. There is nothing special about it.

interface Bijection<A, B> {
    fun A.toB(): B
    fun B.toA(): A
}

However, a Monoid<T> interface, as noted above, presents a unique challenge, since it has a receiver-less empty() function that makes it hard to disambiguate an empty() invocation in the presence of multiple receivers of Monoid interface type. According to the usual Kotlin rule, the empty() invocation will get resolved to the innermost receiver. However, that does not really help. Kotlin's resolution rules in the scope of multiple receivers are already quite user-unfriendly and are a source of various puzzlers, so we should be careful with the given clause, as it will only make matters worse.

In today's "plain" Kotlin we have the "qualified this" expression (this@xxx) that lets you work around the issue by allowing a qualified this to disambiguate an invocation in the presence of multiple this receivers in the context. It seems that we can use the same approach to disambiguate receivers that are brought into scope by the given clause. We can optionally support labelled given clauses, like this:

fun <T> T.timesN(n: Int): T given monoid@ Monoid<T> { ... }

Now, this seems to be a funny syntax at first sight, until you recognize how consistent it is with other ways to bring named receivers into scope in Kotlin. Inside the body of the function you can use this@monoid.empty() if an empty invocation would otherwise be ambiguous or would resolve to the wrong function. We have to be careful in the compiler frontend, though, to make sure that a collection of unnamed given clauses is unordered and that the compiler reports an ambiguity when faced with an ambiguous unqualified invocation in this case.

All of the above lets us completely eschew the notion of extension interface or typeclass from this proposal and narrow it down to support for the given clause. Occam's razor in action.

d3xter commented 6 years ago

Correct me if I'm wrong, but would that mean that given Monoid<T> would make sure that an object or extension object implementing Monoid<T> is available for the type T?

If yes, how can we disambiguate if a class contains multiple objects implementing Monoid<T>?

class MyInt(val i: Int) {
    object AdditiveInt: Monoid<MyInt> {
    }

    object MultiplicativeInt: Monoid<MyInt> {
    }
}
dnpetrov commented 6 years ago

@elizarov

fun <T> T.timesN(n: Int): T given T.Monoid { ... }

I'm not sure I like it; this way you can't refer to a type class implementation by name if you need to disambiguate.

raulraja commented 6 years ago

@elizarov reducing it all to given looks good if we can express the same semantics. @elizarov @dnpetrov should I go ahead and update the proposal to reflect this last proposed syntax?

elizarov commented 6 years ago

@d3xter

how can we disambiguate if a class contains multiple objects implementing Monoid?

class MyInt(val i: Int) {
    object AdditiveInt: Monoid<MyInt> { ... }
    object MultiplicativeInt: Monoid<MyInt> { ... }
}

You disambiguate this by explicitly bringing one or another object in scope. You can either do with(AdditiveInt) { ... } or with(MultiplicativeInt) { ... }.

elizarov commented 6 years ago

@raulraja I'm still not entirely happy with the latest reductionist solution. It works perfectly when you have named objects that implement the corresponding interfaces and when you explicitly bring them into scope using either import AdditiveInt.* or with(AdditiveInt) { ... }.

However, I am not satisfied with how it works when there is only one way to implement the typeclass (extension interface) for a given class. Consider this deliberately trivial example:

interface Dumpable<T> {
    fun T.dump()
}

First of all, notice how the concept of a typeclass (extension interface) became a convention rather than a language feature. Basically, we agree to create special interfaces that are designed to be implemented outside of their base class, so by convention all their functions have a receiver of generic parameter type. This is good. It works well when you do indeed declare the extension implementation outside of the class:

object DumpableInt : Dumpable<Int> {
    override fun Int.dump() = println("Int($this)")
}

However, consider me writing my own class where I, as this class author, would like to provide the implementation of Dumpable typeclass:

class MyData(private val x: Int) {
   fun dump() = println("MyData($x)")
}

But it does not work. I cannot write MyData : Dumpable<MyData> as it makes no sense. I will have to write the following boilerplate using a companion object:

class MyData(private val x: Int) {
   companion object : Dumpable<MyData> {
       override fun MyData.dump() = println("MyData($x)")
   }
}

This is ugly and verbose. A lot of repetition. Worse, it does not feel natural. We need to figure out this part of the story.

raulraja commented 6 years ago

@elizarov For what it's worth, type classes are never implemented in any language with an inheritance relationship. The purpose of type classes is to decouple behavior from data and provide that behavior to arbitrary datatypes, especially those the user did not write.

If you couple a type class to a datatype with an inheritance relationship, then nobody can override that behavior. MyData : Dumpable<MyData> wouldn't be valid for instance resolution either, since it's a regular class, not an extension.

The keyword extension is what would make a type class instance a suitable candidate for the compiler to verify at call sites. For example:

fun <A> A.`(╯°□°)╯︵ ┻━┻) `() : Unit given A.Dumpable = this.dump()

val md = MyData(1)
md.`(╯°□°)╯︵ ┻━┻) `() // No Dumpable extension found for `MyData`

In the example above the compiler can verify that no extension is provided. If the user were able to define the extension in the same class it targets, nobody could override it. Being able to do such a thing resembles structural typing, where you can request any object that has a dump() method on it, but IMO that is outside the scope of type classes.

I think the issue here is trying to fit type classes or extension groups into regular interfaces. Using a custom keyword such as typeclass, or a prefix to interface, would make them a special kind of interface. For example: type interface MyTypeclass<A> would make the distinction clear.

Another issue with making them just interface is that all typeclasses need to have at least one generic type arg. There can't be a type class that is not parametric in at least one type.

elizarov commented 6 years ago

@raulraja I tend to agree (now) that we should not (ab)use inheritance relations. Having toyed with it, I like fun <T> foo() given TC<T> better than fun <T : TC> foo(), since the former clearly distinguishes inheritance relation from extensions. There is no confusion nor ambiguity.

Should we mark extension interfaces somehow? I do not see any need to. There is nothing a compiler can check with respect to them, besides the fact that they should have at least one generic parameter. We can live with the convention that type classes ought to be such. The resolution rules underpinning support for the given clause do not really care.

elizarov commented 6 years ago

We can also put aside, for now, my concerns about verbosity of typeclass instance (object) declaration. It looks like support for the given clause is quite powerful by itself for many practical use-cases.

We will not get ease-of-use and conciseness comparable to Swift protocols, but if we find this to be a big problem, then we can always add it later on. For example, we can introduce some kind of additional syntactic sugar that effectively expands this declaration:

class MyData(private val x: Int) deriving Dumpable<MyData> {
   fun dump() = println("MyData($x)")
}

Into this one:

class MyData(private val x: Int) {
   fun dump() = println("MyData($x)")
   companion object : Dumpable<MyData> {
       fun MyData.dump() = ... // bridges to dump in MyData
   }
}

Note, that this does not prevent a user-specific override of this extension. One can still define:

object CustomDumpableMyData : Dumpable<MyData> {
   override fun MyData.dump() = println("Custom!!!")
}

and then use with(CustomDumpableMyData) { ... } to bring it into scope.
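This override-by-scoping story can be exercised in today's Kotlin with a hand-written companion instance standing in for the derived one. A sketch (dumped() returns a String instead of printing, purely so the behavior is observable; defaultDump and customDump are hypothetical helpers):

```kotlin
interface Dumpable<T> {
    fun T.dumped(): String
}

class MyData(val x: Int) {
    // The "derived" default instance, written out by hand here.
    companion object : Dumpable<MyData> {
        override fun MyData.dumped(): String = "MyData($x)"
    }
}

// A user-specific override of the extension.
object CustomDumpableMyData : Dumpable<MyData> {
    override fun MyData.dumped(): String = "Custom!!!"
}

// Referring to the companion by the class name brings the default in scope;
// with(CustomDumpableMyData) swaps in the override instead.
fun defaultDump(d: MyData): String = with(MyData) { d.dumped() }
fun customDump(d: MyData): String = with(CustomDumpableMyData) { d.dumped() }
```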

raulraja commented 6 years ago

@elizarov sounds good, I'll work on updating the proposal to have the revised syntax and we can go from there. Regarding derivation, that is indeed a great way to get automatic instances from datatypes, especially those that adhere structurally to the typeclass requirements. We do something similar in Kategory https://github.com/kategory/kategory/blob/master/kategory-core/src/main/kotlin/kategory/data/Option.kt#L9-L18.

Provided that a given datatype has certain instance members and companion static declarations we can automatically derive typeclass instances for those datatypes.

As for your comment regarding overriding: it makes sense, and it is similar to how Scala automatically resolves implicits for datatypes whose companions provide them, at a lower resolution priority than, say, local scopes. My concern originally was about the instance extending the type class and being its own instance, but what you showed there makes sense.

Thanks @elizarov and @dnpetrov for considering this proposal. I'll update it with the new syntax and await direction on what to do next to see where we can take this. I and many others following this conversation are beyond excited about the possibility of having compile-time verifiable type class instances in Kotlin.

raulraja commented 6 years ago

@elizarov @dnpetrov I updated the proposal to reflect the new proposed syntax with plain interfaces and using given to denote evidence where required. Please let me know if you'd like me to further expand in any area with more examples or clarifications. We are ready for next steps. Cheers :beers:

elizarov commented 6 years ago

It looks like extension class in the KEEP shall be replaced with object, since it does not seem to have any difference in semantics now.

Code samples need to be a little more explicit about how things are brought into scope, e.g. this example:

object IntMonoid : Monoid<Int> {
    fun Int.combine(b: Int): Int = this + b
    fun empty(): Int = 0
}

1.combine(2) // 3
Int.empty() // 0

shall be updated with either import IntMonoid.* or with(IntMonoid) { ... } to bring the functions defined inside IntMonoid into scope.

The tricky part of this story is that the example right after that would require with(IntMonoid) { ... } to work and will not work with import IntMonoid.* unless we figure out something specific about it.

fun <A> add(a: A, b: A): A given Monoid<A> = a.combine(b)
add(1, 1) // compiles
add("a", "b") // does not compile: No `String: Monoid` instance defined in scope

Is it OK if you will basically have to write with(instance) { ... } the first time you need to bring something into scope (inside a function with given it will not have any problems), or do we need to work through a consistent syntax to import it?
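One way to pin down the semantics under discussion is to desugar given into an explicit parameter captured at the call site. A sketch in today's Kotlin, under the assumption that this models the proposal faithfully (the evidence parameter is the hypothetical argument the compiler would supply):

```kotlin
interface Monoid<A> {
    fun A.combine(b: A): A
    fun empty(): A
}

object IntMonoid : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

// `fun <A> add(a: A, b: A): A given Monoid<A>` desugared into an explicit
// parameter; in the proposal the compiler fills this argument in from scope.
fun <A> add(a: A, b: A, evidence: Monoid<A>): A =
    with(evidence) { a.combine(b) }
```

Under this reading, add("a", "b") fails to compile precisely because no value of type Monoid<String> can be found to pass as evidence.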

cbeust commented 6 years ago

FWIW, I really like the idea that bringing the symbols into scope can be done either with import or with with, which provides a very fine grained import mechanism.

elizarov commented 6 years ago

Import is tricky. As of now, import IntMonoid.* will bring all the corresponding extensions into scope. We can also use the same import statement to provide the scope for a given function, but that raises a lot of nasty questions. What if my code style does not allow star imports, and my import gets expanded into import IntMonoid.combine (assume that was the only method I was using in this file)? Now, how is it going to get into the scope of the add function with the given Monoid<T> clause? Will it require import IntMonoid.*? (We can make sure the import optimizer keeps it this way if you are using functions with given clauses.)

Do we, maybe, need some new variant of the import statement for that purpose, or maybe we should use import IntMonoid (no star), which would be somewhat inconsistent. I don't know a good answer.

d3xter commented 6 years ago

I'd favour the import IntMonoid (no star) way, because in order to call fun <A> add(a: A, b: A): A given Monoid<A> you have to have IntMonoid available in the current scope.

Looking at import IntMonoid.*, I'd think that it imports all of the methods from IntMonoid and not the object itself; ergo, there is no IntMonoid in scope.

Ordinary objects behave that way too:

import MyInt.AdditiveMonoid //Brings *AdditiveMonoid* into current scope

fun main() {
    AdditiveMonoid.empty() //Possible because of the import statement
}

class MyInt {
    object AdditiveMonoid {
        fun combine() { }
        fun empty() = 0
    }
}
raulraja commented 6 years ago

@elizarov Added imports and replaced extension class in the proposal with extension object where possible, except in those cases where the extension itself takes type args, as in:

extension class OptionMonoid<A> : Monoid<Option<A>> given Monoid<A>

Another question is whether we plan on supporting instances declared as functions too. These would effectively become instance providers if imported into scope. For example:

package optionext
extension fun optionMonoid<A> : Monoid<Option<A>> given Monoid<A> = TODO()
import optionext.optionMonoid
import intext.intMonoid

Option(1).combine(Option(2)) //compiles
Option("a").combine(Option("a")) //does not compile because there is no evidence of `Monoid<String>` in scope

As for the imports issue, I tend to agree with @d3xter, since an explicit import will also help disambiguate instances and would make explicit in the file which extensions are getting applied.

import intextensions.IntAdditionMonoid

vs

import intextensions.*

Which can give you an ambiguous-instances compilation error if more than one extension is defined for Monoid<Int>

elizarov commented 6 years ago

We need to work out in a bit more detail the semantics of all the proposed language changes, that is go beyond examples into a more formal realm.

I do have a clear mental model on the semantics of the given clause (it simply brings additional receivers into scope, capturing them on the call site and bringing them back inside the function), but it all needs to be spelled out. We also need to have at least a sketch/draft of resolution rules and priorities in the presence of multiple imports, regular receivers, and additional receivers from various contexts.

However, I am still struggling to grasp the meaning of extension modifier. What does it really do? What different treatment extension object (or extension class) receives from compiler that a regular object/class does not?

elizarov commented 6 years ago

Btw, as for extensions with generic parameters like this:

extension[?] class OptionMonoid<A> : Monoid<Option<A>> given Monoid<A>

I think we can start with support for an explicit with(OptionMonoid<SpecificType>()) { ... } only. You can always declare a fully typed instance explicitly:

extension[?] object SpecificTypeOptionMonoid : OptionMonoid<SpecificType>()

and then enjoy the benefits of being able to import it.

The same consideration applies to instances declared as functions. I think it is simply too much implicitness (for Kotlin's taste) to have the compiler invoke a constructor or another function behind the scenes without you having written explicit code to do it.

raulraja commented 6 years ago

@elizarov I'll add a section to the docs that lists each language change and the proposed resolution rules for instances. Regarding:

However, I am still struggling to grasp the meaning of extension modifier. What does it really do? What different treatment extension object (or extension class) receives from compiler that a regular object/class does not?

If you don't have something like extension, then the compiler can't verify that you are declaring that as an extension or tell it apart from a regular subtype. Users may accidentally create extensions when they actually intend to just subtype an interface. For example:

interface List<A>
class ArrayList<A> : List<A>

fun <A> doStuff(): Unit given List<A>

Would identifying ArrayList as a valid candidate instance for List<A> be OK? I think in that case ArrayList<A> is not a type class instance but a subtype of List<A>, and should not be considered a candidate for instance resolution. It may also create confusion that this compiles at all if we have no way to distinguish between using subtypes of interfaces and type classes with their instances, which have different semantics. If you don't think this differentiation matters, then it's fine to take out any keywords, and the only keyword introduced in the lang would be given.

Regarding:

I think we can start with support for an explicit with(OptionMonoid()) { ... } only. You can always declare a fully typed instance explicitly:

how would that look in a polymorphic function like the one below where A is also a type arg?

fun <A> combine(a: Option<A>, b: Option<A>): Option<A> = 
    with (OptionMonoid<A>()) { //A is not concrete here
        a.combine(b)
    }

thanks!

elizarov commented 6 years ago

@raulraja Let me clarify my struggle with extension modifier and why I don't understand what it does. Consider the following hypothetical typeclass:

extension[?] interface Creatable<T> {
    fun create(): T
}

Now consider two ways to write a function that works in the scope of this typeclass:

fun <T> Creatable<T>.foo()       { ... /* body1 */ ... }
fun <T> foo() given Creatable<T> { ... /* body2 */ ... }

What is the difference between what you can do inside "body1" and "body2"? That is what I'm trying to understand (and it shall be clearly specified in KEEP).

I don't see much of a difference. In both cases you can use the create() function inside the body. The only difference I see is that inside "body1" I can also write it as this.create(), which I cannot do inside "body2" unless I label this typeclass and use a "qualified this" expression. Does it somehow require marking the Creatable interface declaration with some special modifier? I don't see why.
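The "body1" form already compiles in today's Kotlin, which makes the comparison concrete. A minimal sketch ("body2" still needs the proposed given clause; the pair function and CreatableString object are hypothetical names):

```kotlin
interface Creatable<T> {
    fun create(): T
}

// "body1" from the comment above: the typeclass as an ordinary extension
// receiver. Inside, both create() and this.create() resolve to it.
fun <T> Creatable<T>.pair(): Pair<T, T> = create() to this.create()

// An instance for String, so the function can be exercised.
object CreatableString : Creatable<String> {
    override fun create(): String = ""
}
```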

Now, let's get to your case with List<A>:

fun <A> doStuff(): Unit given List<A>

I do agree that writing a function like this does not make much sense (you'd be better off using a normal extension receiver), but I don't see why declaring such a function doStuff shall be forbidden.

Now, let's turn our attention to the call site. What if, despite its uselessness, I did declare this doStuff() function. What happens if I'm trying to invoke it? Will the following code work?

val a = ArrayList<Int>().apply { doStuff() }

I don't see why it should not work. We clearly have an instance of List in scope, and it clearly has the highest priority among other lists we might have in our scope, so it should be captured for the invocation of doStuff(). Of course, if I simply write it like this:

doStuff()

then it will not work, because, by default, I don't have any object implementing List<A> in scope (in the standard library). So, again, I don't see why we should be marking either interfaces or objects with an extension modifier. What kind of different treatment shall they be getting from the compiler?

elizarov commented 6 years ago

@raulraja Regarding the polymorphic example:

fun <A> combine(a: Option<A>, b: Option<A>): Option<A> = 
    with (OptionMonoid<A>()) { //A is not concrete here
        a.combine(b)
    }

I frankly don't see what the problem is with the OptionMonoid<A>() expression here. Let's dissect it step by step:

So, now we have an instance of Monoid<A> in scope, and thus we can invoke the combine function with the given Monoid<A> clause. It does not make a difference whether A is concrete or generic.

raulraja commented 6 years ago

@elizarov Regarding the extension keyword, my concern was more around usage and potential abuse of a pattern whose main purpose is to decouple type declarations from behaviors. Perhaps I'm biased, because I have been using type classes for a while now. I have always been very concerned about them being improperly used in langs like Scala, which also does not make them first class: there they are part of the trait and inheritance system, as we are doing here with interfaces. I agree that, in the spirit of not introducing new keywords or special treatment, if opening them up to this style of usage is not a concern, then it's fine to remove the extension keyword. I'll remove it from the KEEP proposal :+1:

Regarding the issue with the polymorphic function, I understood from your previous comment:

I think we can start with support for an explicit with(OptionMonoid()) { ... } only. You can always declare a fully typed instance explicitly:

that in this case A had to be concrete, but if it works generically, that's fine.

Additionally, it should not be necessary for the user to provide a concrete instance either, because this would still be valid:

fun <A> combine(a: Option<A>, b: Option<A>): Option<A> given Monoid<Option<A>> = 
   a.combine(b)

Here we are not using OptionMonoid explicitly but the actual shape of the type class required, and that is enough information for the typechecker and compiler to verify that call sites of combine need a fully constructed instance in scope. For example:

This works because there is a Monoid<Int> and a Monoid<Option<Int>> in scope:

import intextensions.IntMonoid
import intextensions.OptionMonoid

combine(Option(1), Option(2)) 

Here it can't construct a Monoid<Option<Int>> via OptionMonoid because there is no Monoid<Int> in scope:

//import intextensions.IntMonoid
import intextensions.OptionMonoid

combine(Option(1), Option(2)) 

Here it can't construct a Monoid<Option<Int>> because there is no evidence of it in scope:

import intextensions.IntMonoid
//import intextensions.OptionMonoid

combine(Option(1), Option(2)) 
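The conditional instance in these examples can be emulated today by passing the required evidence as an explicit constructor argument, which is exactly the argument the proposal would have the compiler derive from scope. A sketch with a minimal hand-rolled Option (all names hypothetical):

```kotlin
interface Monoid<A> {
    fun A.combine(b: A): A
    fun empty(): A
}

object IntMonoid : Monoid<Int> {
    override fun Int.combine(b: Int): Int = this + b
    override fun empty(): Int = 0
}

// A minimal Option stand-in for the example.
sealed class Option<out A>
object None : Option<Nothing>()
data class Some<A>(val value: A) : Option<A>()

// The conditional instance `OptionMonoid<A> : Monoid<Option<A>> given Monoid<A>`,
// with the evidence passed explicitly as a constructor argument for now.
class OptionMonoid<A>(private val inner: Monoid<A>) : Monoid<Option<A>> {
    override fun empty(): Option<A> = None
    override fun Option<A>.combine(b: Option<A>): Option<A> = when {
        this is Some && b is Some -> with(inner) { Some(value.combine(b.value)) }
        this is Some -> this
        else -> b
    }
}
```

Usage today reads with(OptionMonoid(IntMonoid)) { Some(1).combine(Some(2)) }; removing the IntMonoid import corresponds to deleting that constructor argument, which is why resolution fails.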
raulraja commented 6 years ago

@elizarov @dnpetrov et al. I updated the KEEP with a section on language changes (which at the moment is just introducing the given keyword) and another section with the resolution order the compiler may consider when resolving instances at a call site.

elizarov commented 6 years ago

@raulraja Your "abuse" concern is an important and valid one. Now that explains the purpose of the extension modifier, which is what I was missing before. We can consider (optionally) having a separate extension modifier to prevent abuse.

It would make sense, in this case, to mark interface, class, and object with this modifier and to enforce it in the following ways. These are the strictest rules I can think of:

My own feeling about those restrictions is mixed. On one hand, they will definitely and radically prevent abuse. They (abusers) will not be able to simply use the given clause as a way to pass additional implicit parameters of an arbitrary type into functions. They will have to mark the designated interfaces with extension, thus alerting people who read their code that those interfaces may get used as such.

Is this kind of abuse a big deal? I don't know. Would those restrictions be too restrictive? I don't know. I don't have enough Scala experience to judge, so I'd suggest leaving the extension modifier and the corresponding restrictions in a separate part of the proposal as an optional feature that might be added, clearly spelling out why we should consider adding it. It does make sense when the goal is spelled out.