the typeclass model offers superior extensibility

microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

https://www.typescriptlang.org

Apache License 2.0

99.27k stars 12.31k forks source link

the typeclass model offers superior extensibility #10844

Closed shelby3 closed 7 years ago

shelby3 commented 7 years ago

Subclassing inheritance is an anti-pattern (see also this and this and this).

For those who aren't familiar with nominal typeclasses in for example Rust and Haskell, they conceptually (without getting into details and caveats yet) are a way of adding implementation to existing nominal types without modifying the source for those existing nominal types. For example, if you have an instance of existing nominal type A and your want to invoke a function that expects a nominal type B input, typeclasses conceptually enable you to declare the implementation of type B for type A and invoke the said function with the said instance of type A as input. The compiler is smart enough to automatically provide (the properties dictionary data structure or in ECMAScript the prototype chain) to the function the type B properties on the type A instance.

The has efficiency, SPOT, and DNRY advantages over the alternative of manually wrapping the instance of A in a new instance which has B type and delegate to the properties of A. Scala has implicit conversion to automate, but this doesn't eliminate all the boilerplate and solve the efficiency (^{_{and tangentially note Scala also can implement a typeclass design pattern employing implicits}}). This disadvantage of wrapping with new instances compared to typeclasses is especially significant when the instance is a collection (or even collection of collections) of instances (potentially heterogeneous with a union type), all of which have to be individually wrapped. Whereas, with typeclasses only the implementations for each necessary pair of (target, instance) types need to be declared, regardless of how many instances of the instance type(s) are impacted.

And afaics this typeclass model is what the prototype chain in EMCAScript provides.

When we construct a new instance, the A.prototype of the constructor function A is assigned to the instance's protoype, thus providing the initial implementation for the instance (of type A) of the properties of type A. If we want to add an implementation of constructor function B (i.e. of type B) for all instances of type A, then we can set A.prototype.prototype = B.prototype. Obviously we'd like to type check that A is already implemented for B, so we don't attempt to implement B for A thus setting B.prototype.prototype = A.prototype creating a cycle in the prototype chain.

That example was a bit stingy thus contrived, because actually we'd want a different implementation of type B for each type we apply it to. And afaics this is exactly what typeclasses model.

I am very sleepy at the moment. When I awake, I will go into more detail on this proposal and try to justify its importance, relevance to TypeScript's goals, and important problems it solves.

Note I had recently discovered in my discussions on the Rust forum in May what I believe to be a complete solution to Wadler's Expression Problem of extensibility (the O in SOLID), which requires typeclasses and first class unions (disjunctions) and intersections (conjunctions).

There are 500+ detailed comments of mine (and @keean) over there (~335 of which are private) I need to reread and condense into what I want to say in this issue proposal. And it was to some extent and unfinished analysis that I had put on the back burner. I have elevated this priority seeing that TypeScript has the first-class unions and intersections and seeing that upcoming 2.1 is supposed to look into the nominal typing issue for #202.

I have mentioned typeclasses in a few of my recent comments on TypeScript issues.

shelby3 commented 7 years ago

@shelby3 wrote:

And do note that even on the untyped (uni-typed) ECMAScript, the prototype chain is form of heritage of objects since it is global to all constructed instances which didn't override the prototype as constructed (and even retroactively so which is the point I want to explore modeling with typeclasses for potentially amazing benefits in productivity and extensibility).

@shelby3 wrote:

@isiahmeadows wrote:

I'd recommend not trying to force nominal types into TypeScript in a way that requires frequent explicit casting in the meantime, though. If you focus on the data instead of the interface, you'd be surprised how far duck typing actually gets you.

I intend to convince[1] that subclassing is a non-extensible anti-pattern and typeclasses is the superior paradigm for nominal typing.

[1] See the THE COVARIANCE OF GENERICS section and the link back to a TypeScript issue.

SimonMeskens commented 7 years ago

I literally have no idea what you're talking about here, simple code sample? It sounds like you want to advocate changing an object's prototype. This is unfortunately discouraged in modern Javascript engines, as it disables a lot of optimizations. Hopefully this can be remedied in the future.

Do you want something like this?

class A <T> extends <T> {}

You can already type it, though it's somewhat clunky, like this:

class A {
    someProperty: string
}

class B {
    otherProperty: string
}

interface IC {
    newProperty: string
}

class C extends A {
    newProperty: string
}

let c: A = new C();

Object.setPrototypeOf(c, new B());

let newC: B & IC = <B & IC>c;

I agree that typescript could use a few new ways to correctly type mutating prototypes. I think my first code sample makes the most sense in that regard.

svieira commented 7 years ago

If I understand you correctly, you want to use compile-time prototype chain injection to provide the syntactical equivalent of Scala / Rust's typeclasses? In a project using the equivalent of scalaz / cats wouldn't that generate wildly deep prototype chains?

Also, wouldn't it make interop with JavaScript as painful syntactically as working with Scala is from Java? The compiler would have to re-write all of the names of all of the methods in the prototype chain to avoid collisions, which would mean JavaScript consumers would get to call things like someObject.$typeLookup(SomeNominalNéePrototype).someMethod() at best and at worst would have to do something like anotherUniverse.$x1z724_q3() // SomeNominalPrototype.someMethod as of yesterday.

gcnew commented 7 years ago

@SimonMeskens @svieira

In hindsight the idea of Type Clasess is a very easy/obvious one, but it was not when they were first conceived. The basic problem they are focused at is extending existing types with additional functionality. As they were targeted to solve problems in a functional language (Haskell) there were no classes or objects in the sense of Java, but only pure non-hierarchical types and values. None the less, people using this language wanted a principled way to extend a type with additional functionality. They came up with the idea of specifying an interface (called a Type Class) and implementing this very interface. The difference with other, more mainstream languages is that the instance (implementation) is not carried in an implicit class, but is actually a separate dictionary (to some extent parallels can be made with Extension methods in C#). Now, wherever something is declared to depend on that dictionary (e.g. a function that uses a type of specific Type Class), the compiler rewrites both the (function) definition and the call sites. The function definition is extended to accept the dictionary as a parameter while the call-sites are rewritten using type-directed emit to pass the appropriate dictionary.

For instance, lets say we want to be able to stringify arbitrary objects. We may come up with the following Type Class:

class Stringify a where
    stringify :: a -> String

Which in typescript terms corresponds to

type Stringify<a> = { stringify: (x: a) => string }

As an example we can implement this Type Class for string and number

instance Stringify Int where
    stringify = show

instance Stringify String where
    stringify = id

-- should be read as, a function accepting a value of generic type `a`
-- (which is also Stringify-able) that returns a string
appendX :: (Stringify a) => a -> String 
appendX x = (stringify x) ++ "X"

example = appendX (4::Int) ++ appendX "2"

const StringifyNumber: Stringify<number> = { stringify: x => String(x) }
const StringifyString: Stringify<string> = { stringify: x => x }

function appendX<a>(dict: Stringify<a>, x: a) {
    return dict.stringify(x) + "X";
}

const example = appendX(StringifyNumber, 4) + appendX(StringifyString, '2')

PS: see https://en.wikipedia.org/wiki/Type_class

SimonMeskens commented 7 years ago

@gcnew We are well aware of the concept of Type Classes, we are just confused as to why editing the prototype chain would in any way facilitate writing them in Javascript. Javascript, unlike Self seems to have advocated embedding over changing delegation from the very start and modern engines like V8 have only pushed this further by focusing optimizations on embedding. If the new setPrototypeOf gets significant use, I assume this will change.

I'm not against this suggestion, as I don't quite understand what's being suggested. The theory is not the problem, I understand how type classes are useful, especially for higher kinded polymorphism. I would just like to see some examples of real Javascript code that can't be typed today and how editing the prototype chain would make this possible.

gcnew commented 7 years ago

@SimonMeskens Well, in this case, I want to join you as well, because I'm also confused :). What's more, I don't see how Type Classes can be implemented in TS without breaking the stated Goals/Non-goals.

SimonMeskens commented 7 years ago

@gcnew I also think, with Javascript being more expressive than Haskell, type classes might not be the perfect fit, like they are for Haskell. It seems that certain concepts that can only be typed as higher kinds in Haskell can be flattened and typed more directly using prototypes; on the other hand, more complex behavior can be written in Javascript, that can not be typed in Haskell (I think, my Haskell knowledge is far from deep, even though I love the language). There's a few important gaps in Typescript at the moment that are less than perfect for doing things like currying (which enables a lot of these higher kinds to become useful) though.

The concept is intriguing for sure, I just don't get the suggestion made here :smile:

shelby3 commented 7 years ago

I will soon come back to this issue to elaborate. I am catching up on other discussion (and sleep). Thank you for the curiosity and discussion.

@shelby3 wrote:

@aluanhaddad wrote:

I would definitely be curious to see your formal type class proposal. ... Walking the prototype chain and doing reference comparisons is far from reliable.

I think that as you have yourself stated, a different language, one targeting a type aware VM may well be the only way to achieve your goals.

To the extent that TypeScript can embrace unreliability of expected structural (interface and subclassed) types at runtime (even possibly with optional runtime nominal and structural checks that exception to a default or error case), I think perhaps the similar level of typing assurances can be attained with typeclasses. Typeclasses would only make sense nominally, as otherwise they are same as the structural interfaces we have already.

And I believe typeclasses are much more flexible for extension, compared to subclassing. I intend to attempt to explain that soon in the issue thread I created for that.

If we are going to add some nominal capabilities that are compatible with JavaScript's existing paradigms, such as instanceof, then typeclasses would give us the flexibility of extension we get with structural types. Note that instanceof may become much more reliable.

P.S. you are referring to the comment I made about Google's SoundScript and that if they succeed to accomplish runtime soundness, I believe it will essentially be a much different language.

@shelby3 wrote:

@spion wrote:

I was not referring to subclassing. That was only the mechanism via which I illustrated the problem with nominal types.

Afaics, the problem you are concerned with only applies to subclassing.
Typeclasses only solve the initial step where you have to convince the owner to implement your nominal interface; you no longer need to do that - you can write your own instance. However, you still have to convince the first author as well as existing library writers to adopt your typeclass i.e. write their code as follows:
function f(x:TheNominalInterface) { ... }
rather then write their functions against the original nominal type written by the first author:
function f(x: TheNominalOriginalPromise) { ... }
Anyone can implement the typeclass for any preexisting data type. There is no owner any more with typeclasses. The data type can still encapsulate (via modules and/or class) any facets it wishes to, so there is an owner in that respect, but anyone can implement a typeclass interface employing the public interface of any type. It is all about composability and we build new interfaces on top of existing ones. Afaics, the main difference from subclassing is that we are encouraged to invert the hierarchy of inheritance, such that we don't implement a fixed set of interfaces conflated together in one class, but rather implement an unbounded future number of interfaces separately for any class. We change the way we design and think about our interfaces.

And this allows us many benefits, including being able to add an interface to collection of instances and feed it to a function without having to manually rebuild the collection wrapping each instance in a new delegate wrapper shell instance (or in the case of a dynamic runtime, add properties manually to each instance...messing with the prototype chain is more analogous to typeclasses afaics).

shelby3 commented 7 years ago

@shelby3 wrote:

@spion wrote:

I don't deny that typeclasses are better and more flexible than subclassing-based interfaces. But the problem doesn't just apply to subclassing.

Here is an instance of the problem in Haskell where despite having typeclasses, the community still haven't been able to fix the fragmentation that resulted from String, ByteString, ByteString.Lazy, Textand Text.Lazy. Libraries simply aren't adopting one common nominal interface: most still just use Text/String/ByteString directly, others implement their own personal and incompatible typeclass, and so on.

With structural types, you can implement the entire interface of an existing type and make all libraries that utilise the exiting type compatible with the new implementation. Without those libraries doing any changes whatsoever.

Per @SimonMeskens's suggestion, please continue this discussion in the issue thread I started for discussing typeclasses. I copied this reply over there. Thanks.

The problem you refer to (as well as Haskell's inability to do first-class unions) is because Haskell has global type inference. So this means that if we implement a data type more than one way on the same typeclass target, then the inference engine can't decide which one to apply. That problem is not with typeclasses, but with Haskell's choice of a globally coherent inference. Haskell's global inference has some benefits, but also has those drawbacks.

To enable multiple implementations (each for a specific typeclass) of the same data type on the same typeclass, we can support declaring typeclasses that extend other typeclasses. TypeScript's existing interfaces can serve this dual role I think. Then at the use site, we can provide a mechanism for selecting which implementation (i.e. sub-interface) is to be used. We can get more into the details in the future.

Suffice it to say that they are just as flexible as structural typing in terms of extensibility. The differences are they provide the ability to group structure nominal at any granularity we choose as an API designer. And to distinguish between equivalent structure that has a different name (nominal type).

spion commented 7 years ago

Then at the use site, we can provide a mechanism for selecting which implementation (i.e. sub-interface) is to be used.

Please do get into more details now. If the author of the Q promise library has written a function like so:

function doSomethingWithPromises(p:NominalTypeImplementedByQPromise) { ... }

How are you going to make this function work with a different promise library (say, Bluebird) without altering it, and without introducing a dependency between Bluebird and Q?

shelby3 commented 7 years ago

So I had the thought that one facet which distinguishes typeclasses from subclasses is that the object of member implementations (aka properties) for the instance this is either orthogonal (dynamic at use site) or bound (static at declaration site) to the data respectively. With subclasses, the aforementioned implementations object dictionary is static for the life of the instance and with typeclasses that object can change depending on use site need.

For JavaScript we can think of subclassing's static interfaces as the constructor function's default prototype chain.

For JavaScript we can place such implementations of the prototype chain at will, adding or removing implementations for all existent instances of the constructor function. For a statically compiled language, the compiler can manage this prototype chain to make sure the required implementations are presented for the need, e.g. for an instance supplied as an argument to a function which requires a specific typeclass interface. If any implementations are ever removed then employing a global prototype chain for all instances is not thread re-entrant nor event queue (e.g. asynchronous programming) compatible. A compiler would prefer to have a separate dictionary pointer supplied on use site need orthogonal to the instance pointer, which could also be simulated with JavaScript but perhaps isn't idiomatic. But JavaScript's per-instance prototype chain can I think suffice assuming implementations are only added and never removed.

I don't know to what extent this pattern of altering the prototype of properties post-construction has been used in the JavaScript ecosystem?

Another related issue is that type parametrized collections/containers (e.g. List or Array) whose element type is a heterogeneous collection of subclasses have been handled historically by subsuming to the greatest lower bound common type (which I also mentioned upthread) and relying on virtual inheritance so that all types in the collection are subsumed to a supertype of common methods and calling those methods using dynamic dispatch to call the implementation for the method of the concrete subtype. This pattern is extensible in one axis of Wadler's Expression Problem because we can add new subtypes to our collection without recompiling the source code for the supertype (interface) and without reconstructing a copy of pre-existing instance of the collection and its existing contained instances. However, it is not extensible in the other axis of the Expression Problem because we can't add a new method to the base class (supertype inteface) without recompiling the source code and reconstructing instances.

By employing first-class disjunctions a.k.a. unions (and anonymous unions which are a feature of TypeScript, eliminates a lot of boiler plate) for the aforementioned container class's type parameter, the concrete types of the heterogeneous collection are not subsumed (not lost); and that heterogeneous union type is applicable to any typeclass interface need (e.g. a function argument) if there exists an implementation of that typeclass interface for every concrete type in that said union. In this way we can have extension in both axes of Wadler's Expression Problem without recompiling and reconstructing the existing instances and their sources. Solving Wadler's fundamental Expression Problem maximizes extensibility.

However, there remains a problem in that any pre-existing instance pointer for the said heterogeneous collection will be invalid if a new concrete type is added to the container's type parameter, because a reference to number | string is not compatible with a concrete instance of type of number | string | Array<number|string>. Afaics, this problem can solved in two possible ways in lieu of cloning (but not deeply) the container. First, the container can be designed so that when a new concrete type is added, it returns a new reference to the new container type and this new reference is isolated from any external pre-existing references to container. Meaning that somehow the container's code has an internal state which tracks which issued references refer to which subsets of contained concrete types. The other solution I envision is the the compiler is smart enough to do some form of lifetimes tracking such as in Rust, and will insure that the pre-existing external references are invalidated when the new concrete type is added.

SimonMeskens commented 7 years ago

I feel like you need to write up some code samples. While I love talking type theory shop, without some real code samples, this discussion is too abstract for me. Could you create a toy sample?

shelby3 commented 7 years ago

@svieira wrote:

Also, wouldn't it make interop with JavaScript as painful syntactically as working with Scala is from Java? The compiler would have to re-write all of the names of all of the methods in the prototype chain to avoid collisions, which would mean JavaScript consumers would get to call things like someObject.$typeLookup(SomeNominalNéePrototype).someMethod() at best and at worst would have to do something like anotherUniverse.$x1z724_q3()

If the compiler could manage removing implementations after each use site need, then without multi-threading and asynchronous programming, we wouldn't have the problem you lament. But we need asynchronous programming, so we can't be sure the instance won't be referenced in multiple contexts simultaneously (unless we resorted to something like Rust lifetime's tracking which IMO is major pita). It seems to do this properly we need a separate implementations object dictionary for each use site (i.e. it's methods would consume an input for the instance and the instance's prototype chain wouldn't be used), then we could use the prototype chain of that object. Afaics, this would avoid your stated problem via locality and the compiler's knowledge about no conflicting names at the use site.

shelby3 commented 7 years ago

@SimonMeskens wrote:

It seems that certain concepts that can only be typed as higher kinds in Haskell

I'll need to review the discussion I had at the Rust forum as we discussed where the HKT and HRT (higher-ranked types) apply. Note TypeScript's new this self typing is a feature that emulates one feature HKT could do.

shelby3 commented 7 years ago

So again per my reply to @svieira, if we've implemented a hypothetical typeclass interface for a specific class, we can't attach that implementation to the prototype chain of the instance of the class, because there may exist other references to that instance in other asynchronous code paths which will require a different set of typeclass implementations simultaneously, where simultaneous is multiple asynchronous events and their code paths that are interleaved in order of execution.

So the methods of our implementations will not refer to the this on the object dictionary for the implementations and instead will input an function argument for reference to the instance. We can create the object for each typeclass interface implementation separately and then if the use site requires an intersection (aka conjunction) of typeclass interfaces, we can handle this orthogonally. It turns out that we don't use the prototype chain (because it would force us to set the prototype on each implementation object so we couldn't reuse these objects).

So let's write some hypothetical code presuming that we will reuse TypeScript's class for declaring data types and add trait for declaring typeclasses interfaces. To implement a class for a typeclass trait we won't employ an implements clause bound to the class declaration. Instead we'll need a new syntax for declaring the implementation of a pre-existing class.

trait Foo {
 foo(_: number): string
}

trait Bar {
 bar(_: string): number
}

class C {
  constructor(public value: number) {}
}

implements C for Foo, Bar {
  foo(x: number) { return new String(x + this.value) }
  bar(x:string) { return new Number(x) + this.value }
}

function foo(x: Foo) { return x.foo(0) }
function fooBar(x: Foo|Bar) { return x.foo(x.bar('1')) }
foo(new C(0))
fooBar(new C(1))

The emitted JavaScript will include (assuming trait names can't contain underscores):

const C_Foo = { foo(_this, x) { return new String(x + _this.value) }
const C_Bar = { bar(_this, x) { return new Number(x) + _this.value } }
function foo(x, C_Foo) { return C_Foo.foo(x, 0) }
function fooBar(x, C_Foo, C_Bar) { return C_Foo.foo(C_Bar.bar(x, '1')) }
foo(new C(0), C_Foo)
fooBar(new C(1), C_Foo, C_Bar)

Is anyone using a pattern like this in their JavaScript code already?

RyanCavanaugh commented 7 years ago

Why do the emitted foo and fooBar functions introduce shadowing C_Foo and C_Bar parameters? Why are you passing a number to bar, which claims to accept string? This code just throws exceptions so it's hard to understand what the exact intent is.

People might be doing this already by writing code like this:

interface Foo {
 foo(_: number): string
}

interface Bar {
 bar(_: string): number
}

class C {
  constructor(public value: number) {}
}

interface C extends Foo, Bar { }
// Could use Object.assign in ES6 here if we wanted to.
// In TypeScript 2.0, these functions would have `this: C` as their first parameter
C.prototype.foo = function(x) {
    return x + this.value; // n.b. Don't ever use new String / new Number!
}
C.prototype.bar = function(x) {
    return +(x + this.value);
}

function foo(x: Foo) { return x.foo(0) }
function fooBar(x: Foo & Bar) { return x.foo(x.bar('ok')) }
foo(new C(0))
fooBar(new C(1))

shelby3 commented 7 years ago

@RyanCavanaugh wrote:

Why do the emitted foo and fooBar functions introduce shadowing C_Foo and C_Bar parameters?

Because the instances of C_Foo and C_Bar may not be declared in same lexical scope as those use site functions, i.e. the shadowing won't generally exist. This example didn't show that.

Why are you passing a number to bar, which claims to accept string?

Typo. Fixed. Thanks.

C.prototype.foo = function(x) {
   return x + this.value; // n.b. Don't ever use new String / new Number!
}
C.prototype.bar = function(x) {
    return +(x + this.value);
}

That requires copying all the methods of each implementation into the instance every time the implementations are used. What if you have 100 methods on the implementations. Very costly to performance.

And you have to remove them from the instance when you are done using them at the use site of the functions foo and fooBar so the instance can be clean for attaching other interfaces at other use sites in order to enable the extensibility (swapping out implementations) of typeclasses, and as I explained in my prior comments your code is entirely incompatible with asynchronous programming if other use sites and references to your instance of C expect different implementations.

RyanCavanaugh commented 7 years ago

What I'm saying is, this code unconditionally crashes, so I can't reason about what your intent with it is. The parameter C_Foo is undefined in foo and the code throws an exception because undefined is not a function. What does a non-crashing version of the proposed emit look like?

shelby3 commented 7 years ago

@RyanCavanaugh apologies it is 7am and I haven't slept since yesterday morning. I fixed the typo. Thanks. Also please re-read my prior comment as I explained why your code sample is not equivalent.

SimonMeskens commented 7 years ago

As @RyanCavanaugh points out, I don't see how you add anything new to the language at all with that example. People are already writing that exact functionality by embedding onto the prototype of the constructor. There's a hint of extension methods going on, something that has been discussed and will be added depending on certain TC39 decisions.

The only thing I can see that's missing so far is this behavior:

    class A <T> extends T {}

Which would add a type-class-like functionality that lets us type changing prototypes more concisely.

shelby3 commented 7 years ago

@SimonMeskens wrote:

I don't see how you add anything new to the language at all with that example. People are already writing that exact functionality by embedding onto the prototype of the constructor

Re-read my reply to @RyanCavanaugh. I explained why his code is not equivalent.

SimonMeskens commented 7 years ago

I didn't say it was equivalent, I said your example adds no new functionality to the language imo, except for a weak version of already proposed extension methods.

RyanCavanaugh commented 7 years ago

@shelby3 please stop editing your comments to add substantially new information to them. It's extremely confusing. You write pages and pages of text, then edit them to forward-reference comments ahead of them (which makes the thread impossible to read top-down). And now I have to re-read every comment every time I go to write something to make sure you haven't addressed something new in an edit since the last time I left my own comment. It also makes it look like I'm pointing out things which are no longer true (or failing to address new objections which were edited in since I wrote my comment), which makes the thread more confusing to people reading it the first time through.

shelby3 commented 7 years ago

@SimonMeskens wrote:

I said your example adds no new functionality to the language imo, except for a weak version of already proposed extension methods.

Please show me how you can fix the problems I outlined to @RyanCavanaugh with the existing features of TypeScript.

@RyanCavanaugh wrote:

please stop editing your comments to add substantially new information to them

That is an unreasonable request. Sometimes that is not possible to adhere to. Please be tolerant.

It also makes it look like I'm pointing out things which are no longer true

I acknowledged the typos you discovered. Maybe Github can add a changes history for comments some day, similar to Stackoverflow has for Q & A and Steemit.com has every change is recorded on a public blockchain. I will not spam comments to simulate a missing change history feature, because the priority for readers is primarily comprehension and not blame. Github is showing you how long ago each comment was edited, so you can determine which ones you need to re-read.

@SimonMeskens also edited his comment and added this:

The only thing I can see that's missing so far is this behavior:

class A <T> extends T {} Which would add a type-class-like functionality that lets us type changing prototypes more concisely.

SimonMeskens commented 7 years ago

You can't, because what your code does comes down to extension methods and those are currently in TC39 limbo. If they are accepted, then yes, you can do what your code does and there's already several proposals on how to make Typescript type those correctly. If they are rejected by TC39, then there probably won't be extension methods in TypeScript due to the stated goals of the project.

In which way does your proposal change that situation?

RyanCavanaugh commented 7 years ago

In a structural type system that has to transpile away without type data, this basically can't be done. Your proposed emit works for easy cases but there's not a solution that works in an ergonomic way once aliasing and structural substitutions come into the mix. See the general problems outlined here: https://github.com/Microsoft/TypeScript/issues/9#issuecomment-74302592

RyanCavanaugh commented 7 years ago

Well I guess I get to read the entire thread over again to look for new edits. Back in 15 minutes.

shelby3 commented 7 years ago

@SimonMeskens wrote:

The only thing I can see that's missing so far is this behavior:
class A <T> extends T {}
Which would add a type-class-like functionality that lets us type changing prototypes more concisely.

I don't understand what are the intended semantics of that syntax of extending from a type parameter. Can you please explain it? Show how it would be used?

shelby3 commented 7 years ago

@SimonMeskens wrote:

You can't, because what your code does comes down to extension methods and those are currently in TC39 limbo. If they are accepted, then yes, you can do what your code does and there's already several proposals on how to make Typescript type those correctly. If they are rejected by TC39, then there probably won't be extension methods in TypeScript due to the stated goals of the project.

In which way does your proposal change that situation?

Can you show an example in current proposal for extension methods which is equivalent to my example?

My understanding is that extension methods do not allow changing the added methods at each use site independent of the instance. That is precisely the point of criticism I made against @RyanCavanaugh's code sample.

SimonMeskens commented 7 years ago

I don't understand that syntax. Can you please explain it? Show how it would be used?

I'd need to write up a proposal to explain how extending or implementing generic types would work, with useful code samples. Point is, I think type classes can basically model behavior that would come down to the extension of an unknown type, so that syntax makes sense to represent the issue. If you haven't run into a situation yet where you want to write that syntax and realize it wouldn't work, just disregard that comment.

Can you show an example in current proposal for extension methods which is equivalent to my example?

You can already write equivalent code in today's Javascript, as RyanCavanaugh showed in his sample. Your proposal doesn't add new functionality, hard to express functionality or a way to typecheck common behavior in today's Javascript. I'm just saying that extension methods would let you write the type class like this:

function foo(x) {
    return x + this.value;
}
function bar(x) {
    return +(x + this.value);
}

let c = new C(0);

c::foo(c::bar('1'));

Btw, I just went back up and saw that you edited old posts again, are you even reading what we write? You just cherry pick the bits you want to reply to, avoid all tough questions and rewrite old posts instead of replying. Don't do that please?

shelby3 commented 7 years ago

@RyanCavanaugh wrote:

In a structural type system that has to transpile away, this basically can't be done. Your proposed emit works for easy cases but there's not a solution that works in an ergonomic way once aliasing and structural substitutions come into the mix. See the general problems outlined here: #9 (comment)

Afaics, those problems you outlined are not a problem with my proposal. I will quote from your linked comment as follows:

You can't use extension methods as first-class objects, even though that's a very common JavaScript pattern. Only immediate invocations are allowed

Typeclasses are not a first-class object. They are a per-use-site (at function arguments only) dynamic overloading of the properties of the instance.

Extension methods aren't used when checking assignability, so you can't use them to add new functionality that is used with any indirection. You can't use an extension method to 'fix' a type to match some interface, even though that's a primary use case

Typeclasses are orthogonal to data type. We assign data types to data (class) or virtualized trait types, then at the use site the function can request arguments which are traits and the implementation for the input data type is passed into the function along with the instance of the data type.

Extension methods aren't visible when put in a union type, which everyone will assume is a compiler bug

Extension methods are conflated with an instance of the data type. Typeclasses are never bind (bound) to an instance. Virtualized typeclass traits can be added to a union or intersection type, and traits are not matched to interfaces and they have a nominal type. I haven't discussed virtualized traits yet.

If the type of a variable accidentally becomes any, your code will break at runtime with an error you won't be able to diagnose via debugging the code itself. This will look like a compiler bug to most people

Subsumption to any should never be allowed to come back into the static type system. That is an unsoundness in TypeScript. The type any should mean you are now coding in untyped ECMAScript. I understand soundness is not one for the stated goals for TypeScript.

You can't use TypeScript extension methods from JavaScript without knowing the special code to write

That will be true of any typeclass implementation. Why is that a problem?

shelby3 commented 7 years ago

@SimonMeskens wrote:

I'd need to write up a proposal to explain how extending or implementing generic types would work, with useful code samples.

Please make that proposal in your own issue thread if you ever do it, so we don't go off-topic here. You can then link to it from this thread. Thanks.

Point is, I think type classes can basically model behavior that would come down to the extension of an unknown type, so that syntax makes sense to represent the issue.

I don't think modeling an unknown type accomplishes anything similar at all. I will await your elucidation of your syntax.

If you haven't run into a situation yet where you want to write that syntax and realize it wouldn't work, just disregard that comment.

Okay.

You can already write equivalent code in today's Javascript, as RyanCavanaugh showed in his sample. Your proposal doesn't add new functionality, hard to express functionality or a way to typecheck common behavior in today's Javascript. I'm just saying that extension methods would let you write the type class like this:

Sorry afaics incorrect. See my most recent comment reply to @RyanCavanaugh. Afaics, extension methods do not accomplish what type classes do. They force binding the extensions to the instance. My code sample does not do that. Typeclasses associate to the class type, not bound to the instances of the class type.

Btw, I just went back up and saw that you edited old posts again, are you even reading what we write? You just cherry pick the bits you want to reply to, avoid all tough questions and rewrite old posts instead of replying. Don't do that please?

I have no idea what you are alleging. You can try quoting something that you feel was sneaky or disingenuous and link me to anything upthread which you feel was important that I didn't reply to. I really have no idea what you are referring to.

Please remember my policy against ad hominem attacks and other non-technical noise.

SimonMeskens commented 7 years ago

You asked for a functionally equivalent code sample, Ryan's is functionally equivalent to yours. Stop moving the goal posts. You say I'm incorrect and point to a comment that talks about performance (slow is not the same as incorrect) and things not demonstrated by your sample. There is nothing incorrect about the factual statement that Ryan's sample is 100% functionally equivalent within the bounds of the code sample, which is what you asked for.

shelby3 commented 7 years ago

@SimonMeskens wrote:

You asked for a functionally equivalent code sample, Ryan's is functionally equivalent to yours.

No it is not. And I have already explained why upthread.

Stop moving the goal posts.

I have not.

You say I'm incorrect and point to a comment that talks about performance (slow is not the same as incorrect) and things not demonstrated by your sample.

Sorry if you don't understand how my example can support the reimplementation at every function call. You can go study the Rust documentation for trait is you want to better comprehend.

There is nothing incorrect about the factual statement that Ryan's sample is 100% functionally equivalent within the bounds of the code sample, which is what you asked for.

Incorrect.

@shelby3 wrote:

Afaics, extension methods do not accomplish what type classes do. They force binding the extensions to the instance. My code sample does not do that. Typeclasses associate to the class type, not bound to the instances of the class type.

SimonMeskens commented 7 years ago

Right, okay, I'm out. This thread is becoming more and more like a Dali painting and less like constructive discussion. Good luck with trying to comprehend this episode of Doctor Who @RyanCavanaugh

shelby3 commented 7 years ago

I appreciate all the technical discussion. @RyanCavanaugh has provided some significant technical discussion today and I am interested if he or anyone else decides to continue technical discussion.

Please remember my policy against ad hominem attacks and other non-technical noise.

RyanCavanaugh commented 7 years ago

Let me try to summarize what I think you're describing in terms of the emit:

When a function takes a parameter whose type is a trait type, invocation sites must find a trait implementation matching the argument types and pass those implementations as additional arguments.

Is that correct?

shelby3 commented 7 years ago

@RyanCavanaugh correct. Those trait implementations must operate on the class type instance input. With typeclasses the trait implementations are associated to class type, not bound to instances as in the case of extension types.

RyanCavanaugh commented 7 years ago

The main problem, then, is that a function taking a trait parameter type is substantially different from a regular function in some very important ways (I will call them "traitfuncs" for simplicity going forward).

Off the top of my head:

It needs to be illegal to reference a traitfunc without immediately invoking it, because once it's aliased we can't determine whether or not to rewrite its call site
It needs to be illegal to overload a traitfunc in any non-trivial way (probably just at all rather than trying to figure out what the legal set of overloadings are)
Traitfuncs need to have different arity variance rules due to extra parameters appearing at the end
Traitfunc implementations need to be disallowed from looking at arguments at all because that data will be "wrong"
There need to be massive restrictions in whether or not a traitfunc call can span separate compilation units. This is probably a dealbreaker given their intended use case.
You can't call or apply a traitfunc

Thinking concretely, what would we emit for the code at the bottom here?

// Maybe we can standardize on this naming
// scheme for examples in this thread?
class A { a; }
class B { b; }
class Bd extends B { bd; }
trait T1 { x(): void }

implement T1 for A { x() { } }
implement T1 for B { x() { } }
implement T1 for Bd { x() { } }

// Code of interest follows:
function fn (arg: Array<A | B>) {
  arg.forEach(el => el.x());
}
let x: Array<A | B> = [new A(), new B(), new Bd()];
fn(x); // turns into what?

Are there new rules that would say Bd isn't actually a B anymore? Are we going to disallow the initializer for x there? How will we know to somehow inject T1_Bd in there so that its x gets called? Or will this be a non-virtual function call (which are completely unheard of in JS)?

shelby3 commented 7 years ago

@RyanCavanaugh your fn is not using typeclasses, so it shouldn't compile because the classes are missing the x() method. I think instead you intended to write the following:

function fn (arg: Array<T1>) {
  arg.forEach(el => el.x());
}

Rust uses instead trait objects (which do dynamic dispatch similar to virtual methods in subclassing) to handle that so you would change the type of x to a trait object of type Box<T1> and so would the type of arg: Array<Box<T1>>. But I don't like Rust's solution because it destroys extensibility in one axis of Wadler's Expression Problem and essentially is equivalent in extensibility to virtual methods in subclassing, because you can no longer use x for any trait type for which those 3 classes have implementations.

So I have proposed (in my discussion at the Rust forum this past May) a more complex compiler transformation monophism for that case, which afaik has never been implemented in any programming language. The compiler would have to create a switch guard on the implementations for 3 classes and pass that as the implementation input to the fn shown above.

Are there new rules that would say Bd isn't actually a B anymore?

Astute question.

Bd is still a B in terms of the concrete methods and members of Bd, but a Bd can not be input to a trait using the implementation for one its subtypes, because there is an ambiguity over which implementation to use and in the case that the implementation doesn't exist for Bd there is an ambiguity over whether to generate an error or use the implementation for B. Whereas, if an instance of Bd is assigned to a reference of type B (or equivalently an inline cast <B>), then there is no ambiguity. This is why we should also be allowed to provide an implementation of an interface for a trait as well as a class, i.e. when implementing a class for a trait, because unlike Java TypeScript has no final methods (thus all methods are always inherently virtual dispatch) thus it is equivalently implementing the interface for that class for that said trait.

I need to sleep and will make a new comment upon awaking to reply to your other thoughts.

shelby3 commented 7 years ago

@RyanCavanaugh I hope you can start to see how powerful these are when combined with anonymous unions as you have astutely provided as an example. And there is another benefit which is particularly applicable to a problem that TypeScript has. I pointed this out upthread, but it might have not registered with you at that time due to lack of context. That is we don't need to subsume to the greatest, lower bound (TypeScript refers to this as the "best type" when it won't accept any) for the type of a heterogeneous collection such as your Array example, and instead can retain the invariant union of the disjoint types A | B for let x. Thus the bivariant covariance heuristics and unsoundness of that (due to subclassing) in TypeScript falls away! That is a major win for TypeScript if they implement this. I will become a huge fan of TypeScript is this gets implemented due to the increase in typed extensibility. But typeclasses with heterogeneous collections have some other complications that I mentioned upthread:

However, there remains a problem in that any pre-existing instance pointer for the said heterogeneous collection will be invalid if a new concrete type is added to the container's type parameter, because a reference to number | string is not compatible with a concrete instance of type of number | string | Array<number|string>. Afaics, this problem can solved in two possible ways in lieu of cloning (but not deeply) the container. First, the container can be designed so that when a new concrete type is added, it returns a new reference to the new container type and this new reference is isolated from any external pre-existing references to container. Meaning that somehow the container's code has an internal state which tracks which issued references refer to which subsets of contained concrete types. The other solution I envision is the the compiler is smart enough to do some form of lifetimes tracking such as in Rust, and will insure that the pre-existing external references are invalidated when the new concrete type is added.

spion commented 7 years ago

@shelby3 have you looked at PureScript? edit: I see elsewhere that you have (and dismissed it).

I still think I think they offer exactly what you want, and their goals are more aligned with what you want to achieve.

svieira commented 7 years ago

Awesome - thanks @shelby3 for clarifying. The approach you suggest (providing an implementation of the type class(es) to the function taking in a typeclass type) is very Scala-like. Does this look like (roughly) what you're thinking of:

class A { a; }
class B { b; }
class Bd extends B { bd; }
trait T1 { x(data: number): void }

implement T1 for A { x(data: number) { log(`Received ${data} in T1 for A.x`); } }
implement T1 for B { x(data: number) { log(`Received ${data} in T1 for B.x`); } }
implement T1 for Bd { x(data: number) { log(`Received ${data} in T1 for Bd.x`); } }

// Code of interest follows:
// Note that you need to state that you are consuming T1 explicitly
function fn (arg: Array<T1>) {
  console.log(`Received ${arguments.length} arguments`);
  arg.forEach(el => el.x(123));
}
let x: Array<T1> = [new A(), new B(), new Bd()];
fn(x);

// This would have to turn into something like:
var $T1ForA = {
  x(instance, ...args) {
    log('Received ' + args[0] + ' in T1 for A.x');
  }
};

// Other implementations are skipped for brevity.

function fn(arg) {
  var $args = arguments;
  return $evidence => {
    console.log('Received ' + $args.length + ' arguments');
    arg.forEach(el => $evidence.for(el, 'x', 123));
  };
}
let x: Array<T1> = [new A(), new B(), new Bd()];
fn(x)({ // Note the compiler-generated dictionary we are passing in
  for($instance, $methodName, ...$args) {
    // Handwaving for a point
    const $thePreservedRuntimeType = $instance.prototype.constructor.name;
    switch($thePreservedRuntimeType) {
      case 'Bd':
        $T1ForBd[$methodName]($instance, ...$args);
        break;
      case 'B':
        $T1ForB[$methodName]($instance, ...$args);
        break;
      case 'Bd':
        $T1ForA[$methodName]($instance, ...$args);
        break;
      default:
        throw Error(`Unable to call ${$methodName} because no typeclass in scope for ${$thePreservedRuntimeType}`
    }
  }
});

Am I on the right track? (Apologies if this is going to implementation too soon - I think about the UX of the problem better when I can sketch a hand-wavy solution to see what it might look like in practice).

_{Also, I hope you got some sleep!}

spion commented 7 years ago

switch is a very poor idea here. It means you can't compile fn without knowing all the types that implement the trait T1. Which means that if you provide a library containing fn, users would need to recompile the library whenever they introduce a new implementation for T1.

edit: I see that the code would be generated at call site. That would work.

spion commented 7 years ago

Another idea that could potentially work is using a per-module dictionary of implementations of types used in the module. The dictionary would be $T1 and it would be an implicit argument to fn. Then for fn at the use site, the compiler can look at which instances are imported (only one instance per type allowed):

function fn($T1, x) {

arg.forEach(el => $T1[el.prototype.constructor.name].x(el, 123));

}

This ensures separate compilation. Its how you normally implement typeclasses (PureScript already does this).

What PureScript doesn't do (but they are considering it) is implement modular typeclasses/implicits. This would allow several instances, and the instances would be determined at the use site of the function:

import instance Bd(T1) from 'path1'
import instance B(T1) from 'path2'
import instance A(T1) from 'path3'

// Its important that within a module, adding two instance imports for the same type 
// is a type error. This avoids the confusion that arises in Scala.
// error: instance for A(T1) already imported
// import instance A(T1) from 'path5'

compiles to

$T1.Bd = require('path1')
$T1.B = require('path2')
$T1.A = require('path3')

Now when you invoke

let x: Array<T1> = [new A(), new B(), new Bd()];
fn(x)

you get:

fn($T1, x)

AFAIK Something similar is being considered in PureScript (called modular implicits). While its a cool feature, I still think its a really bad fit for a language with a structural, fully erased type system that intends to stay as close to JavaScript as possible.

SimonMeskens commented 7 years ago

What about interop from Javascript? What would that look like?

Also, do we need the trait keyword for the interface declaration or can we use trait at call site instead? I think this looks more like idiomatic TypeScript:

class A { a; }
class B { b; }
class Bd extends B { bd; }
interface T1 { x(data: number): void }

trait T1 for A { x(data: number) { log(`Received ${data} in T1 for A.x`); } }
trait T1 for B { x(data: number) { log(`Received ${data} in T1 for B.x`); } }
trait T1 for Bd { x(data: number) { log(`Received ${data} in T1 for Bd.x`); } }

// Code of interest follows:
// Note that you need to state that you are consuming T1 explicitly
function fn (arg: Array<trait T1>) {
  console.log(`Received ${arguments.length} arguments`);
  arg.forEach(el => el.x(123));
}
let x: Array<trait T1> = [new A(), new B(), new Bd()];
fn(x);

spion commented 7 years ago

Its clearly not optimal for JS interop. Which is why I'm still waiting for that idea to see how per-object prototypes would help here.

To me it looks like they would bring a sort of "dynamic" scope for typeclasses - instead of the current lexical scope determining the value of the function dictionary, the actual object values would. (That sounds worse than Scala's implicits to me)

keean commented 7 years ago

Seeing as I am cc'd into this topic, I might as well give some thoughts.

One of the key properties of type-classes (and why they are used in Rust for zero cost abstraction) is that they monomorphise if you know the types statically (which we mostly do). Leaving aside the case where there is true dynamic (runtime) polymorphism, most polymorphism is resolvable statically.

In such cases you can rewrite the code (by AST manipulation in the compiler) to remove the type-classes. You can understand this as type-classes providing something approximately like type-safe versions of C++ templates.

Now the above case with the list is kind-of tricky because it 'forgets' the types of the elements put into the list by type erasure. This requires what Haskell calls an existential type, and Rust calls a trait-object. This detail is effectively missing in the above descriptions.

For the above when we declare an array of type Array we are promising that all elements in the array implement T1. We prove this by providing a witness, but in implementation terms this witness is an instance of the type-class dictionary providing implementations of T1. So what actuall goes into the array for each element is a pair of the type-class dictionary and the object. In JavaScript this 'could' be implemented by requiring all objects inserted into the array to have the 'class-object' for their implementation of T1 in their prototype chain.

In the above example you cannot just pass a single dictionary "$T1" into the function 'fn' because each of A, B and Bd need a different dictionary, and you have no idea which types are in the Array<trait T1> because they could be any type that implements T1, even a type not declared yet, declared in another module, etc. Note: this is not a problem if each object in the array provides the required dictionary.

Really it does not matter how the dictionary is provided, type-classes are sort of like a formalised duck-typing. All "Array<trait T1>" really does is promise that any object inserted into the array (and this can be checked at compile time) provides an implementation of T1 (so the methods whose interface is defined in T1 could be prototype methods, or methods directly in the object itself, as long as "x(data: number): void" is callable on the object).

More formally, type-classes (traits) are constraints on types, so a type signature containing a type class and a polymorphic variable would be written like this in Haskell "forall x . T1 x => Array x" which is saying the array can contain type type 'x' such that 'x' is constrained to implement T1. The 'forall' introducing 'x' as a scoped type variable means that 'x' cannot be accessed from outside its scope (this statement) formalising the type erasure (we cannot know which concrete type 'x' is, we can only know it implements T1 ).

The interesting thing however is that if I were to call "new A().x(123)" directly, we statically know the type of "A" so we do not need to pass a trait-object at all. The compiler can perform a direct template instantiation substituting "log(Received ${data} in T1 for new A().x);" into the code directly. This is how type-class based languages can achieve zero-cost abstraction (in the common case where there is no dynamic polymorphism).

SimonMeskens commented 7 years ago

We know that a system that relies on emits with sub-optimal JS interop is simply against the goals of the TypeScript project. The proposed code sample simply won't make it in in any shape or form.

If we deconstruct the samples, I see two things: it looks like a way to do multi-methods and a way to do extension methods rolled into one. We can deconstruct it into the following parts:

A way to do multi-methods: emit-heavy
A way to do extension methods: hard problem to solve, see extension method discussion
A few new constructs on top that let us do implicit local extension multi-methods

What this proposal comes down to is that third part, but since that needs the other two parts to work, this becomes a big monstrous thing to tackle. I think we should try to split the samples into these three parts and then figure out if there's a way to get the first two into TypeScript. The biggest problem is probably multi-methods. They cause the emits and break interop.

I'll elaborate with some samples when I get home.

keean commented 7 years ago

Actually type-classes provide type-safety for duck-typing, which seems well suited to JavaScript. Note C++ templates which are duck-typed are due to get type-safety from C++ "Concepts" which are yet another name for type-classes (traits).