keean / zenscript

A trait based language that compiles to JavaScript
MIT License

Typeclass objects #27

Open shelby3 opened 7 years ago

shelby3 commented 7 years ago

Typeclass objects offer dynamic (i.e. run-time and open modularity1) polymorphism by binding the typeclass bound to the static (compile-time) type of the reference and tagging the data type instances that the reference can contain, so that the correct dictionary can be dispatched dynamically at run-time.

They are named "trait objects" in Rust and "existential quantification" (e.g. forall a) or "existential typeclass" in Haskell.

1 Means we can't compile the universe, so we can't know all the data types required in dependency injection scenarios. Dynamic polymorphism is not just a run-time artifact; it is also a fact of the real world, where total orders can't exist (i.e. the Second Law of Thermodynamics, Halting Problem, Incompleteness Theorem, Russell's Paradox, Byzantine Generals Problem, Byzantine fault-tolerance with the CAP theorem, impossibility of existence without a quantified speed-of-light, etc). Likewise, there are no total orders in trustless, decentralized systems either.
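
For concreteness, here is a minimal Haskell sketch of a typeclass object, which Haskell encodes as an existential (the Showable name is illustrative):

{-# LANGUAGE ExistentialQuantification #-}

-- A "typeclass object": the concrete type is hidden; only the Show
-- dictionary captured at construction time is available for dispatch.
data Showable = forall a. Show a => Showable a

instance Show Showable where
  show (Showable a) = show a

-- A heterogeneous collection: each element carries its own dictionary.
xs :: [Showable]
xs = [Showable (1 :: Int), Showable "two", Showable True]

main :: IO ()
main = mapM_ print xs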

Expression Problem

They have the benefit w.r.t. Wadler's Expression Problem of delaying the binding of interface to data until construction of the typeclass object, as opposed to OOP subclassing (~and Scheme/JavaScript prototype2~) which binds the interface to all instances of the subclassed type (on instantiation of the data). So they solve the Expression Problem up to the point of instantiation of the typeclass object; beyond that point, while they allow adding new data types (via the aforementioned dynamic dispatch), they don't allow extending new operations after the instantiation of the typeclass object, which binds the allowed typeclass interface(s). Typeclass objects enable heterogeneous collections with dynamic extension of the collection of data types without requiring an invariant collections data structure (i.e. they can be the element type of an array). Whereas typeclass bounds on function definitions delay binding the interface until the function call site and thus solve the Expression Problem up to that point, but they don't enable heterogeneous collections with dynamic extension of the collection of data types except via unions. Unions can both enable heterogeneous collections with dynamic extension of the collection of data types and delay binding of the interface until the function call site, but they require an invariant collections data structure such as a list.

2 Note that prototype programming can add new operations (but forces all instances of a prototype to take those new interfaces) and add new data types, but it can't add different interfaces to instances of the same prototype. ~Whereas, type classing can differentiate.~ (Edit: correction)
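
For contrast, a minimal Haskell sketch of the union alternative (names illustrative): the declaration is closed, so admitting a new element type means editing it, whereas the existential list above stays open to new types:

-- Closed union: adding a new element type requires editing this
-- declaration (the "invariant data structure" mentioned above).
data Elem = EInt Int | EStr String

render :: Elem -> String
render (EInt n) = show n
render (EStr s) = s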

Syntax and Peculiarities

@shelby3 wrote:

Well obviously that is useless and not what I was referring to. I mean for example, I have a collection of typeclass objects and I want to sort or find them, so I need a relation.

A relation only applies to two objects of the same type.

Exactly. So thus Eq<A, A> is useless for what I wrote above.

One answer is to provide a mapping from each object type to a partial order (or total order).

So akin to subsumption to a subclass, except open to delayed definition. That might work for sort.

Or another alternative is Eq<A, B> accepts any type B, but always returns false if B is not A. So find will work.

Okay I see how maybe typeclass objects are still useful. Thanks.
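
A Haskell sketch of that last alternative (the Obj name is illustrative), using Typeable so that equality answers false whenever the hidden types differ, which is what a generic find needs:

{-# LANGUAGE ExistentialQuantification #-}
import Data.Typeable (Typeable, cast)

-- A typeclass object carrying Eq and Typeable dictionaries.
data Obj = forall a. (Eq a, Typeable a) => Obj a

-- Compare only when the hidden types match; otherwise answer False.
instance Eq Obj where
  Obj x == Obj y = maybe False (x ==) (cast y)

main :: IO ()
main = print (Obj (1 :: Int) == Obj "one")  -- False: the types differ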

@shelby3 wrote:

@keean wrote:

data Variant = for<A, B> Variant(A) requires A : Eq<B>

data Variant = Variant(A | Any) where A: Eq<Any>

That seems to be a general solution and I like it very much, unless someone can point out flaws.

So typeclass objects always conceptually come with | Any, e.g.:

let x: A where Show<A | Any> = MyType(1, 2)
let y: A where Eq<A | Any, Any> = MyType(1, 2)

Or in my (and Rust's) preferred OOPish style:

let x: A where (A | Any): Show = MyType(1, 2)
let y: A where (A | Any): Eq<Any> = MyType(1, 2)

But I prefer we can write that shorthand (and so we only need where clauses on function types and definitions):

let x: Show = MyType(1, 2)
let y: Eq<Any> = MyType(1, 2)

I hope you are also starting to see another reason why I favored an OOPish style: it is also more consistent with typeclass objects (and I actually did intuitively anticipate/expect this, though without the complete solution worked out).

P.S. my understanding is that any function defined for a typeclass bound would also accept typeclass objects with that bound. The compiler would create a dynamic-dispatch version of the said function on demand, to be used whenever a reference to it is passed around. It seems to me that only functions called nominally can be monomorphized.

shelby3 commented 7 years ago

@keean wrote:

This callback can only show the type A which is passed to f; the requirements should be at the top level.

f(callback: for<B> (B) => ()) where B: Show

Here the callback can be called on any type. In the first example the callback is monomorphic with respect to f; in the second, polymorphic.

I am instead proposing that be written the same as any typeclass bounded function without any for<B>:

f(callback: (B) => ()) where B: Show

I wrote in the OP:

P.S. my understanding is that any function defined for a typeclass bound would also accept typeclass objects with that bound. The compiler would create a dynamic-dispatch version of the said function on demand, to be used whenever a reference to it is passed around. It seems to me that only functions called nominally can be monomorphized.

keean commented 7 years ago

This:

f(callback: (B) => ()) where B: Show

Doesn't work because you are not specifying the scope for 'B'. You could mean:

for<B> f(callback: (B) => ()) where B: Show

Or you could mean:

f(callback: for<B> (B) where B: Show => ())
shelby3 commented 7 years ago

@keean wrote:

Doesn't work because you are not specifying the scope for B. You could mean:

I don't understand why that is important. Please explain.

keean commented 7 years ago

I don't understand why that is important. Please explain.

  • With the for<B> in the outer scope the callback is monomorphic with respect to f, so the callback can only be called on one type, which is fixed at the time f is called.
  • With the for in the inner scope we are passing a polymorphic function that can be called on any B
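
In Haskell notation the two scopes would look like this (a sketch; the inner-scope form needs RankNTypes, and the names are illustrative):

{-# LANGUAGE RankNTypes #-}

-- Outer scope: b is fixed at fOuter's call site, so inside fOuter the
-- callback is monomorphic and applies to exactly one type.
fOuter :: Show b => (b -> String) -> b -> String
fOuter callback x = callback x

-- Inner scope: fInner receives a genuinely polymorphic callback and
-- may apply it at several different types within its body.
fInner :: (forall b. Show b => b -> String) -> String
fInner callback = callback (1 :: Int) ++ callback "two"

main :: IO ()
main = putStrLn (fOuter show True) >> putStrLn (fInner show)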
shelby3 commented 7 years ago

@keean wrote:

  • With the for<B> in the outer scope the callback is monomorphic with respect to f, so the callback can only be called on one type, which is fixed at the time f is called.

Okay what you mean is that when f is called, then at the call site the types of the callback are monomorphized. This means f is only calling the callback with the said typeclass bounded instances and not with concrete data instances.

I don't think the for<B> is necessary. The scope can be determined by whether the where clause appears at the top-level scope or not! So non-top-level indicates a typeclass object; whereas, top-level indicates monomorphization at the call-site of f. So now, you provided another more important reason that we need non-top-level where clauses. I hate that ugly/noisy for<B>.

  • With the for in the inner scope we are passing a polymorphic function that can be called on any B

Right, that is what I wrote elsewhere: it doesn't make any sense for f to request a typeclass bounded callback (which isn't monomorphized at the call site for f) unless it is for dynamic dispatch typeclass object(s).

keean commented 7 years ago

Yes, but I would rather make the programmer declare a datatype for a polymorphic callback function. This keeps the function syntax simpler, and is called first-class polymorphism.

shelby3 commented 7 years ago

@keean wrote in other thread:

I think its a bit more of a problem. Its not the typeclass object that is needed. Consider:

f(callback : MyCallbackFn) where ...

We want to create the where clause when declaring f but we don't yet know the callback function that will be passed. We don't know what typeclass bounds the callback will have.

If we want the callback to print it may need a Show constraint; we cannot know at the point f is declared what typeclass bounds the callback will have, and to define them would be to restrict what the callback is allowed to do, which does not seem right.

I am pleased you raised this point.

First re-read my prior comment and click the link in it to remember what you pointed out to me about the scoping of the callback.

If the callback is scoped by the call site to function f, then the caller chooses the typeclass bound(s), which satisfies your point.

And if the callback is scoped by the function f, then f decides the typeclass bound(s) of the typeclass object(s) of the callback.

What needs to happen is we need to infer the typeclass bounds on f when f is called with a specific callback which has known typeclass bounds.

I think the callback example needs type-class bounds inference at the callsite of f in order to be useful.

If you are not referring to augmentation of typeclass bounds, that would only be in the case when the callback is scoped by the call-site of f, except it is generally impossible. Let's clarify what kind of inference we are talking about.

The function f may call some operations on the callback and/or it may store the callback in a data structure imperatively, and thus we have no way of knowing which future operations on the callback will be allowed if f does not declare them. The compiler will force f to declare them. You could, I suppose, infer the operations that f internally requires in its function body using principal typings, but we don't know yet to what extent we can support principal typings, as generally they are undecidable in an open system.

So f probably must declare its typeclass bound(s) requirements (at least we decided at a minimum, it must be explicit at module boundaries).

If we want the callback to print it may need a Show constraint; we cannot know at the point f is declared what typeclass bounds the callback will have, and to define them would be to restrict what the callback is allowed to do, which does not seem right.

But you are making a different point. You are talking about the caller of f being able to augment the typeclass bounds that the callback requires (not remove the typeclass bounds f requires), in the case where the callback is scoped at the call-site of f. In this case, I agree with you, since you want to delay unification of the type variables until the call-site.

So what you are saying is that f can require operations on the type variables of the callback and the caller can add some more required operations.

The caller can add more where requirements (either inferred or explicit) at the call site of f. The explicit form will be done at the call site with a cast, either top-level:

f(callback): (MyCallback) => () [B: Print+Show]

Or non-top-level:

f(callback: (B) => () [B: Print+Show])

Note above that either will apply to the typeclass object or monomorphization cases, because per my proposal, f's scoping of its where clause determines which case it is.

shelby3 commented 7 years ago

@keean wrote:

Yes but I would rather make the programmer declare a datatype for a polymorphic callback function. This keeps the function syntax simpler, and is called first-class-polymorphism.

But that forces nominal typing on functions, so we can't reuse functions which have the same signature but different data type names. That is why I had suggested type aliases instead.

And type aliases must be expandable to any general types, so we won't be able to force (make) the programmer use type aliases for non-top-level where clauses (as they can use the general types anywhere a type signature is required).

Edit: more discussion.

shelby3 commented 7 years ago

@keean wrote:

@shelby3 wrote:

@keean wrote:

@shelby3 wrote:

@keean wrote:

By using first class polymorphism we can assume all function signatures are monomorphic, which avoids the need to keep track of the nesting where type variables are brought into scope

Can you please provide an example or more detailed explanation so I can grok this? (Also for readers' benefit)

, and allows us to standardise on a single top level where clause for all datatypes and function definitions. We can still construct arbitrary higher ranked types by using datatypes.

See this paper, I think it explains it quite well. http://web.cecs.pdx.edu/~mpj/pubs/fcp.html

@keean wrote:

However I think we should use first class polymorphism so we don't have such complex function signatures. With first class polymorphism we can declare a data-type:

data MyCallback = (B) => () requires B : Show

f(callback : MyCallback)

Please show me example syntax for instantiating an instance of MyCallback?

Shouldn't that be instead:

data MyCallback = MyCallback((B) => () requires B : Show)

Yes, you are right, we require the constructor for the type to have () brackets. Basically it would be the normal monomorphic function declaration inside a datatype, and any type variable that is free on the right hand side gets automatically universally quantified.

As the paper says, it keeps the syntax simpler, is easier to implement, but has the same expressive power as free annotation of functions. I think it will make for cleaner APIs too, as you can use the named type to check that you have the right type for your callback (so less typing when using libraries).
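
A minimal Haskell rendering of this FCP encoding (MyCallback and f are illustrative names): the rank-2 type hides inside a nominal wrapper, so f itself keeps an ordinary rank-1 signature:

{-# LANGUAGE RankNTypes #-}

-- The quantifier and the Show constraint live inside the wrapper.
newtype MyCallback = MyCallback (forall b. Show b => b -> String)

f :: MyCallback -> String
f (MyCallback cb) = cb (42 :: Int) ++ cb False  -- cb used at two types

main :: IO ()
main = putStrLn (f (MyCallback show))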

Okay so the research paper you cited is all about achieving type inference.

So as I wrote in my prior comment in this thread, you are advocating abandoning the ability to reuse functions (which contain a type variable) based on their structural type, and instead forcing said functions to be instantiated nominally (thus making equivalent structural function types non-composable because they have different nominal constructors), all because you want to be able to use principal typings on these said functions. I've already warned that I fear you are going to make your principal-typings goal into a weapon to defeat degrees-of-freedom of expression. @skaller has also warned similarly that type inference is the enemy of expressiveness.

I don't think it is often going to be the case that we want to have a function taking a callback, where both said function and the callback function are not annotated with quantified (aka concrete) types. And for those cases where we might have wanted to do that, we won't. We will have to annotate. For myself and also for readers, please provide a convincing example showing this annotation overhead (in this specific case of type variables we are discussing) is a significant burden justifying abandoning the ability to compose functions generally.

Edit: also nothing stops you from using the method you prefer in your code and achieving type inference. I am asking why we must force everyone to abandon function composition generality in all code, when they can choose to annotate.

I don't think I like complex code without type annotations. Raises cognitive load due to lack of explicit local reasoning. Type inference is useful where the reader can easily figure out in their head what the type annotations would have been. This is one of my big complaints that makes Haskell code so obtuse to read IMO.

keean commented 7 years ago

I don't think it is often going to be the case that we want to have a function taking a callback, where both said function and the callback function are not annotated with quantified (aka concrete) types. And for those cases where we might have wanted to do that, we won't. We will have to annotate. For myself and also for readers, please provide a convincing example showing this annotation overhead (in this specific case of type variables we are discussing) is a significant burden justifying abandoning the ability to compose functions generally.

There is no loss in expressive power (it even says so in the paper), and it will make the type signatures shorter and more readable.

There is no loss in composition, and I am not sure why you would suggest there is? Function type signatures don't compose, unless you are talking about copy-and-paste composition? Copy-and-paste just leads to stupid errors. Why bother with named functions at all, why not just copy and paste the raw-code? The same argument applies to types, it is better to have a named type you can re-use (like you call functions) rather than have people copy-and-paste huge complex and unintelligible type signatures.

keean commented 7 years ago

Okay so the research paper you cited is all about achieving type inference.

I disagree, it is all about avoiding having to have higher ranked type annotations. In Haskell (and all other languages I know of except those that use FCP) you must annotate higher ranked types because they cannot be inferred. Haskell cannot infer them, no language I know of can infer them. So this is not about inference. This is about how you achieve the annotation. We already need first-class-polymorphism to correctly handle type-class dictionaries, so we already will have the mechanism for it.

We can keep the language simpler and keep the full expressive power, how could anyone not want that?

shelby3 commented 7 years ago

@keean wrote:

There is no loss in expressive power (it even says so in the paper),

I guess that is correct. I think I also used the term 'composition' and so it is really boilerplate that is the issue.

and it will make the type signatures shorter and more readable.

At the cost of causing other boilerplate...

There is no loss in composition, and I am not sure why you would suggest there is?

I can't reuse a MyCallback everywhere I can use a MyCallback2 (and vice versa) even if the signatures of the functions are otherwise identical.

Granted I can destructure those constructors and re-construct a compatible instance, but this is boilerplate, and additionally afaics if I have these in a typeclass object, then I will need a different implementation for each one. Afaics, FCP is increasing the boilerplate of composition in exchange for the hope of not having to write type annotations for higher-ranked types (and I say hope because I don't know if FCP inference will integrate with our unions and other features, and I don't know if we won't end up needing higher-ranked types anyway).

I disagree, it is all about avoiding having to have higher ranked type annotations.

Which is precisely all about achieving (a similar level of expressiveness with) type inference. But the entire point of type inference is to not have boilerplate. I hope you didn't really expect that FCP is some Holy Grail without trade-offs.

So this is not about inference. This is about how you achieve the annotation.

It is about how we support type variables with minimal annotation and boilerplate, with the goal of enabling type inference with a boilerplate wrapping versus losing type inference with no boilerplate wrapping.

We can keep the language simpler and keep the full expressive power, how could anyone not want that?

How is it simpler? You trade off one thing for another.

The same argument applies to types, it is better to have a named type you can re-use (like you call functions) rather than have people copy-and-paste huge complex and unintelligible type signatures.

The difference is we don't normally expect structurally equivalent types to be equivalent (excluding our planned unions and intersections). But functions we normally do expect to be composable on structural equivalence (which is why we allow anonymous functions).

Note you did not address my comment here:

@shelby3 wrote:

Edit: also nothing stops you from using the method you prefer in your code and achieving type inference. I am asking why we must force everyone to abandon function composition generality in all code, when they can choose to annotate.

keean commented 7 years ago

How is it simpler? You trade off one thing for another.

You avoid nested where clauses in the syntax. The grammar will be simpler (and this will be visible in the simple EBNF for the language).

The difference is we don't normally expect structurally equivalent types to be equivalent. But functions we normally do expect to be composable on structural equivalence.

We don't normally expect any equivalence operator to work on functions, so I am not sure this is important. Remember top-level functions can be fully polymorphic, so you only need to wrap the argument in the datatype at the call site. Like this:

data MyCallback = MyCallback((A) => A)

id(x : A) : A => x

fn(callback : MyCallback)

fn(MyCallback(id))
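
In Haskell the same call-site wrapping might look like this (a sketch with illustrative names):

{-# LANGUAGE RankNTypes #-}

newtype MyCallback = MyCallback (forall a. a -> a)

myId :: a -> a
myId x = x

fn :: MyCallback -> (Int, Bool)
fn (MyCallback g) = (g 1, g True)   -- g is used at two types inside fn

main :: IO ()
main = print (fn (MyCallback myId)) -- wrap the polymorphic function at the call site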
skaller commented 7 years ago

What is wrong with type annotations?

They’re great! They add to the ability to reason about code in principle. They allow error messages to make sense.

The downside of annotation is verbosity, which makes code harder to read. The programmer should have the flexibility of choosing the tradeoff.

— john skaller skaller@users.sourceforge.net http://felix-lang.org

shelby3 commented 7 years ago

@keean wrote:

We don't normally expect any equivalence operator to work on functions, so I am not sure this is important.

Disagree. We expect assignment of functions to work on structural equivalence.

fn(MyCallback(id))

And so you are right back to type annotations again.

How is it simpler? You trade off one thing for another.

You avoid nested where clauses in the syntax. The grammar will be simpler (and this will be visible in the simple EBNF for the language).

We avoid that but add the boilerplate back in other places. So it isn't necessarily a net gain in brevity. It also loses some local reasoning.

The nested where clauses can be avoided without forcing FCP, by offering a type alias as I mentioned twice before.

And you didn't address my comment:

@shelby3 wrote:

and additionally afaics if I have these in a typeclass object, then I will need a different implementation for each one

Generally wrapping interacts badly because it becomes deeply nested!!!

Which is why we want a first-class anonymous union instead of massive case-tree logic to destructure Either. OTOH, first-class without wrapping makes inference undecidable.

keean commented 7 years ago

Edit: also nothing stops you from using the method you prefer in your code and achieving type inference. I am asking why we must force everyone to abandon function composition generality in all code, when they can choose to annotate.

Because it adds complexity to the implementation and the grammar, and complexity is the enemy of good software. Expressive power is achieving the most with the least.

What is wrong with type annotations?

Nothing is wrong with writing types, but complex nesting of types, especially higher-ranked types, is confusing. If we are aiming to make this easy for people coming from OO to pick up, then we will want to make the type signatures as easy to understand as possible.

As I said above, this has nothing to do with inference, you have to write the types in either case, but the main difference is that with FCP all the quantifiers are implicit and all the where clauses are at the top level, simplifying the syntax, for no loss in expressive power. It makes the code cleaner and easier to read.

keean commented 7 years ago

and additionally afaics if I have these in a typeclass object, then I will need a different implementation for each one

No, a single record or typeclass object encapsulates all the polymorphic functions, so you do not need this.

For example:

data MyCallback = MyCallback {
    fna (A -> A) : A
    fnb (A -> A -> A) : A
}

Works exactly the same as the single argument tuple datatype posted before.
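
A Haskell sketch of that record form (field names follow the example above; the types are illustrative):

{-# LANGUAGE RankNTypes #-}

-- One nominal wrapper carrying several polymorphic functions at once.
data MyCallbacks = MyCallbacks
  { fna :: forall a. a -> a
  , fnb :: forall a. a -> a -> a
  }

callbacks :: MyCallbacks
callbacks = MyCallbacks { fna = \x -> x, fnb = \x _ -> x }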

keean commented 7 years ago

Generally wrapping interacts badly, which is why we want a first-class anonymous union instead of massive case-tree logic to destructure Either. OTOH, first-class without wrapping makes inference undecidable.

It has nothing to do with inference, as you have to provide the type signature in either case; it is just where and how that signature is written that changes. FCP uses the same mechanism as type-class dictionaries.

keean commented 7 years ago

We avoid that but add the boilerplate back in other places. So it isn't necessarily a net gain in brevity. It also loses some local reasoning.

Callbacks are not local in any case, as the callback could happen from anywhere. Consider async IO: you call write with a callback, and flow immediately continues with the next statement. The callback will be called by some non-local code in the IO-complete event handler. In JavaScript this can only happen after the current "program" has finished, returning control to the main event-loop.

shelby3 commented 7 years ago

Readers, to understand what we are discussing now, I think it is helpful to understand this way of emulating higher-ranked types, which was introduced to me at the Rust language forum:

use std::fs::File;
use std::io::{Cursor, Read};

trait Foo {
  fn call<A: Read>(&self, a: A);
}

fn f<G: Foo>(g: G, pred: bool) {
  if pred {
    g.call(File::open("foo").unwrap());
  } else {
    g.call(Cursor::new(b"foo"));
  }
}

By giving a name Foo to the type constructor for the function call, we are able to delay quantification of type variable <A: Read> until the body of the function f. Compare that to the example which failed to compile:

fn f<A: Read, G: Fn(A) -> ()>(g: G, pred: bool) {
  if pred {
    g(File::open("foo").unwrap());
  } else {
    g(Cursor::new(b"foo"));
  }
}

The above won't compile without higher-ranked types (which Rust doesn't support), because the quantification of <A: Read> is at the call-site for function f. So even if we did not place the : Read bound on A, the inference engine could not infer the correct type for A, which happens to be an intersection of two quantified functions where A is a File or a Cursor.

In general, inference of function application over unions and intersections diverges into infinite recursion and is undecidable. @keean showed me an example of that using expansion variables in private discussions last May. The two higher-order features are like mirrors facing each other, thus enabling mutual recursion.

keean commented 7 years ago

Generally wrapping interacts badly because it becomes deeply nested!!!

It is still better than the alternative of a huge multi-line type signature all in one block.

keean commented 7 years ago

@shelby3 looking at the direction you want to go, we may be better off ditching quantified types and type-classes altogether and going for a full intersection and union type system with ad-hoc overloading?

shelby3 commented 7 years ago

@keean I haven't made any decision. I am trying to analyze the trade-offs.

shelby3 commented 7 years ago

@keean wrote:

Generally wrapping interacts badly because it becomes deeply nested!!!

It is still better than the alternative of a huge multi-line type signature all in one block.

Disagree! The deep nesting totally breaks any reasonable composition.

Also we don't need to force FCP in order to get type aliases:

@shelby3 wrote:

The nested where clauses can be avoided without forcing FCP, by offering a type alias as I mentioned twice before.

keean commented 7 years ago

@shelby3 I feel you are making statements about a style you have no experience of coding, so I am not sure how much weight to give your objections.

keean commented 7 years ago

I haven't made any decision. I am trying to analyze the trade-offs.

Intersection types allow functions to be like a type-class, where a single function can have different signatures that apply to different types. You no longer need the boilerplate of declaring a type-class; you just define two functions with different type signatures that have the same name.

shelby3 commented 7 years ago

@keean wrote:

and additionally afaics if I have these in a typeclass object, then I will need a different implementation for each one

No a single record or typeclass object encapsulates all the polymorphic functions so you do not need this.

You misunderstood. The context of my point was MyCallback and MyCallback2 both having the same structural type, but requiring separate implementations for a typeclass bound, because they have different (nominal) types.

And then factor in deep nesting and you are in implementation proliferation hell.

keean commented 7 years ago

The context of my point was MyCallback and MyCallback2 both having the same structural type

Why would you declare two of them then? That's like saying add and add2 both have different names despite both adding numbers exactly the same way. You don't need someone to stop you defining the add function more than once, so why would you need someone to stop you defining "MyCallback" more than once? It's something you just wouldn't do.

skaller commented 7 years ago

As I said above, this has nothing to do with inference, you have to write the types in either case, but the main difference is that with FCP all the quantifiers are implicit and all the where clauses are at the top level, simplifying the syntax, for no loss in expressive power. It makes the code cleaner and easier to read.

Implicit quantifiers do NOT make code easier to read. It's like most “simplifications”: they seem like a good idea at the time, and they work well for simple cases, but it's a disaster precisely in the cases where you think it would serve best, the complex cases.

It's a known fact that human information transmission requires redundancy. Perhaps less for smarter people, or people familiar with the domain, more for others. But for language, psychologists have actually MEASURED the amount of redundancy that gives the best comprehension; it's somewhere between 15 and 30% I believe (in linear text).

Computer programming languages may require different redundancy, but inferring types everywhere clearly defeats the whole point of having type annotations: to provide redundant information for validation and confirmation.

— john skaller skaller@users.sourceforge.net http://felix-lang.org

shelby3 commented 7 years ago

@keean wrote:

The context of my point was MyCallback and MyCallback2 both having the same structural type

Why would you declare two of them then? That's like saying add and add2 both have different names despite both adding numbers exactly the same way.

You have this big blind spot that repeats over and over even though I've mentioned to you 5 or more times that total orders don't exist in our universe.

In open modularity, you can't control the proliferation of duplication. It is impossible to maintain tight control over anything in a decentralized context.

keean commented 7 years ago

You have this big blind spot that repeats over and over even though I've mentioned to you 5 or more times that total orders don't exist in our universe.

If it is not a problem for named functions, it is not a problem for named types either.

keean commented 7 years ago

Computer programming languages may require different redundancy, but inferring types everywhere clearly defeats the whole point of having type annotations: to provide redundant information for validation and confirmation.

What I know from experience is that a complete absence of types in code makes local reasoning harder and is a problem.

What I also know is that requiring explicit types everywhere is excessive boilerplate that gets in the way of what the code is actually doing.

So I think the sensible answer is somewhere in the middle, that types need annotating in some situations but not others. In order to minimise the cognitive load of understanding where annotation is necessary, the rules must be simple, generic, and easily applicable.

shelby3 commented 7 years ago

@keean wrote:

You have this big blind spot that repeats over and over even though I've mentioned to you 5 or more times that total orders don't exist in our universe.

If it is not a problem for named functions, it is not a problem for named types either.

I already argued to you why afaics that isn't normally the case:

@shelby3 wrote:

The difference is we don't normally expect structurally equivalent [nominal] types to be equivalent. But functions we normally do expect to be composable on structural equivalence.

We don't usually compose over the structure of nominal type constructors (and we can pattern-match destructure when we need to). That is why we choose a nominal type system. But we don't normally choose a nominal typing system for functions, because we tend to want deeply nested composition on the structure of functions.

I think perhaps this is because types other than functions are encouraged to be composed via operations, whereas functions are operations. It seems functions are the fundamental building block.

keean commented 7 years ago

@shelby3 If we are agreed that we can imply polymorphism in datatypes in exactly the same way we do in function signatures, in other words:

(A) => A

is a generic function over all A's, so these are different:

data T<A> = T(A) // parametric type

data T = T(A) // universally quantified type

Then I am happy to leave the decision on what to do with function type signatures to you, although we probably need to discuss the syntax further if we are going to have higher ranked types directly in function type signatures.
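
In Haskell terms the distinction might be sketched like this (using a function field so the quantified form is inhabited; names illustrative):

{-# LANGUAGE RankNTypes #-}

data TParam a = TParam a                  -- parametric: the user picks a per use
data TQuant = TQuant (forall a. a -> a)   -- the field itself is polymorphic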

shelby3 commented 7 years ago

@keean wrote:

Edit: also nothing stops you from using the method you prefer in your code and achieving type inference. I am asking why we must force everyone to abandon function composition generality in all code, when they can choose to annotate.

Because it adds complexity to the implementation and the grammar, and complexity is the enemy of good software. Expressive power is achieving the most with the least.

I don't yet understand the scenarios where, if we don't support higher-ranked types, then with annotations we can't do something we need to do w.r.t. where clauses not all at top-level, without resorting to wrapping in FCP?

Can you help me better understand? That FCP paper is too dense for me to translate quickly into examples and syntax I readily grok (and I am lacking time/energy).

shelby3 commented 7 years ago

@keean wrote:

We avoid that but add the boilerplate back in other places. So it isn't necessarily a net gain in brevity. It also loses some local reasoning.

Callbacks are not local in any case, as the callback could happen from anywhere

You are conflating type (which would be local) with definition and invocation of the callback.

keean commented 7 years ago

Can you help me better understand? That FCP paper is too dense for me to translate quickly into examples and syntax I readily grok.

With FCP, once the generic function is wrapped, it can be treated as a normal generic type, so for example:

id(myCallback)

The type of id remains (A) => A, and yet myCallback with all its internal polymorphism and constraints behaves like any other first-class value. That's why it is called first class polymorphism.

You would have to construct special identity functions (and other generic functions) for each different higher-ranked type you wanted to deal with.

shelby3 commented 7 years ago

@keean wrote:

OTOH, first-class without wrapping makes inference undecidable.

It has nothing to do with inference as you have to provide the type signature in either case

Well yeah that is my point! But your cited paper claims otherwise:

We have described a modest extension of the Hindley/Milner type system that offers both the convenience of type inference and the expressiveness of first-class polymorphism

skaller commented 7 years ago

On 10 Oct 2016, at 05:54, Keean Schupke notifications@github.com wrote:

So I think the sensible answer is somewhere in the middle, that types need annotating in some situations but not others. In order to minimise the cognitive load of understanding where annotation is necessary, the rules must be simple, generic, and easily applicable.

I agree. And the sensible middle ground is to allow type annotations to be omitted where they can be calculated bottom up, that is, entirely locally, with reference of names to surrounding context.

This means to me HM-inference is on the wrong side of the division, because the types are calculated from unnamed usage. It also means overloading is suspect, because you cannot be certain when you look up a name that you have found the complete overload set: type classes are a compromise here (the function in the class is unique, the instances are not, so the problem is reduced by splitting it into two parts).

This general argument applies to syntax too. (x) => (x) is worse than \x -> x because the \ announces a function, whereas ( may or may not, depending on context:

(x) => (x) => (x)

Woops! You didn’t see that second => before did you! Whereas this is even clearer but more verbose:

fun (x) => fun (x) => x

— john skaller skaller@users.sourceforge.net http://felix-lang.org

shelby3 commented 7 years ago

@keean wrote:

looking at the direction you want to go, we may be better off ditching quantified types and type-classes altogether and going for a full intersection and union type system with ad-hoc overloading?

I haven't made any decision. I am trying to analyze the trade-offs.

Intersection types allow functions to be like a type-class where a single function can have different signature that apply to different types. You no longer need the boilerplate of declaring a type-class you just define two functions with different type signatures that have the same name.

You used to mention this to me in private discussion and I think I had explained that this breaks the Expression Problem solution, because we can't add new data types independently of new operations.

The selection of a typeclass dictionary enables the data types and the functions which consume the dictionaries to vary orthogonally.

keean commented 7 years ago

This means to me HM-inference is on the wrong side of the division, because the types are calculated from unnamed usage. It also means overloading is suspect, because you cannot be certain when you look up a name that you have found the complete overload set: type classes are a compromise here (the function in the class is unique, the instances are not, so the problem is reduced by splitting it into two parts).

The algorithm I am using is compositional; that means it only uses local information, and typed program fragments are composable - their value part composes, and the type part composes separately.

keean commented 7 years ago

I think the biggest advantage of FCP is that generic templates can be treated as ordinary values, which makes them first class entities in the language.

skaller commented 7 years ago

@shelby3 wrote:

The difference is we don't normally expect structurally equivalent [nominal] types to be equivalent. But functions we normally do expect to be composable on structural equivalence.

We don't usually compose over the structure of nominal type constructors. That is why we choose a nominal type system. But we don't normally choose a nominal typing system for functions, because we tend to want to compose on the structure of functions.

I think perhaps this is because types other than functions are encouraged to be composed via operations, whereas functions are operations. It seems functions are the fundamental building block.

I agree in principle, however, there are SOME data types we want structural: tuples and records and arrays. Also, less common, unit sums.

Felix has structural variants, Ocaml polymorphic variants are partially structural, but normally we use nominal types for variants.

Nominally typed products are most useful for abstractions. However languages like C++ are absurd because all products have to be nominally typed except arrays.

Pointers are a good example of where nominal typing would be complete madness.

— john skaller skaller@users.sourceforge.net http://felix-lang.org

shelby3 commented 7 years ago

@keean wrote:

Can you help me better understand? That FCP paper is too dense for me to translate quickly into examples and syntax I readily grok.

With FCP once the generic function is wrapped, it can be treated as a normal generic type, so for example:

id(myCallback)

The type of id remains (A) => A and yet myCallback with all its internal polymorphism and constraints behaves like any other first class value. Thats why it is called first class polymorphism.

But unless id needs higher-ranked types, it has specified the typeclass bounds on its signature, so there is no need to hide myCallback's internal type variables (to simulate higher-ranked types), because they are all quantified by id's signature.

So I am failing to see any advantage in simplicity for FCP unless we need higher-ranked types. Am I incorrect?

You would have to construct special identity functions (and other generic functions) for each different higher-ranked type you wanted to deal with

But what if we don't need higher-ranked types?

keean commented 7 years ago

But what if we don't need higher-ranked types?

You need higher ranked types when you want to pass a "generic function" as an argument. Without higher-ranked-types you can only pass concrete functions with known types as arguments into functions.

shelby3 commented 7 years ago

@keean wrote:

But what if we don't need higher-ranked types?

You need higher ranked types when you want to pass a "generic function" as an argument. Without higher-ranked-types you can only pass concrete functions with known types as arguments into functions.

Did you forget that I proposed that non-top-level where clauses on function definitions of callback arguments would indicate a typeclass object? Thus there isn't any "generic function" (for monomorphization) being input, as it is dynamic polymorphism (one function for all dynamic types).

Afaics, I have nothing against using FCP for emulating higher-ranked types.

keean commented 7 years ago

Did you forget that I proposed that non-top-level where clauses on function definitions of callback arguments would indicate a typeclass object? Thus there isn't any "generic function" (monomorphization) being input, as it is dynamic polymorphism (one function for all dynamic types).

"type-class objects" have a higher-ranked-type.

shelby3 commented 7 years ago

"type-class objects" have a higher-ranked-type.

Huh? Example please. I am thinking that the typeclass bound is specified by the function id and so all use of the callback within the body of id is using those statically declared bounds.

shelby3 commented 7 years ago

@shelby3 wrote:

"type-class objects" have a higher-ranked-type.

Huh? Example please. I am thinking that the typeclass bound is specified by the function id and so all use of the callback within the body of id is using those statically declared bounds.

The typeclass bounds are the nominal lifting to emulate the higher-ranked aspect of typeclass objects. You don't need another redundant data wrapping on the function that has typeclass objects.

keean commented 7 years ago

The typeclass bounds are the nominal lifting to emulate the higher-ranked aspect of typeclass objects. You don't need another redundant data wrapping on the function that has typeclass objects.

A "typeclass-object" type would look like this:

data MyObject = for<A> MyObject(A) requires Show A
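
In Haskell this typeclass-object type would be the familiar existential (a sketch):

{-# LANGUAGE ExistentialQuantification #-}

data MyObject = forall a. Show a => MyObject a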