keean / zenscript

A trait based language that compiles to JavaScript
MIT License

Subtyping #8

Open shelby3 opened 7 years ago

shelby3 commented 7 years ago

Union types express a subtyping relationship, but I am unclear as to whether typeclasses (i.e. Rust's traits) do?

If a trait B extends another trait A and B reuses the implementations of A, can we assign a trait object that has a bound B to a trait object that has a bound A?

Seems the answer based on prior discussion is yes. But that is a subtyping relationship, which means we would need to deal with covariance on type parameters both when they are trait objects and when they are unions. Correct?

Prior discussion: https://github.com/keean/zenscript/issues/6#issuecomment-248711828 https://github.com/keean/zenscript/issues/1#issuecomment-248113585 https://github.com/keean/traitscript/issues/2#issuecomment-248021713 https://github.com/keean/zenscript/issues/1#issuecomment-248754649

shelby3 commented 7 years ago

@keean wrote:

It will be interesting to see if we still need trait-objects

They are the only way (other than subclassing) I can see to get variance in the polymorphism, i.e. we can assign new types into the trait object (Rust's name for an "existential type" with a typeclass bound) without breaking the invariance of the Array.

The advantage of trait objects compared to subclasses, is that types that can be inserted into a trait object don't have to conform to a pre-existing subclass hierarchy, i.e. it solves some aspect of the Expression Problem¹ but doesn't solve the scenario that unions solve. I am thinking we can view trait objects and unions as complementary in functionality offering different trade-offs on the multi-dimensional design space.

  1. Specifically, when we use trait objects for polymorphism, we can add new operations to existing data types (i.e. the trait bound of the trait object), and we can add new data types to existing operations (i.e. implement data types for existing traits), but we cannot add new operations to an existing trait object (because the data types have been erased from the static compile-time knowledge of the trait bound type of the reference when assigned to the trait object). That is the limitation of trait objects which my solution addresses with unions and delayed binding of the union to a trait bound at the function call site. OTOH, unlike the unions employed in my solution, trait objects do not require invariant containers in order to add heterogeneous data types to the container, because the type of the trait object is invariant. An invariant container is for example a List. Trait objects would also work with an Array, which is not an invariant data structure. Apologies that I am not employing academic terminology such as kinds, rank, F-bounded, etc. Someone else can do a better job of translating my solution into that terminology.
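To make the trade-off in this footnote concrete, here is a minimal Rust sketch, not taken from the thread. The trait `Shape` and the types `Square` and `Rect` are hypothetical names chosen for illustration. The container holds a single element type (`Box<dyn Shape>`), so new data types can be inserted without changing the container's type, but only the operations of the bound remain callable on the erased elements.

```rust
// Hypothetical trait standing in for the bound of the trait object.
trait Shape {
    fn area(&self) -> f64;
}

// Two unrelated data types; neither belongs to a subclass hierarchy.
struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// The element type is always `Box<dyn Shape>`. The concrete data types
// are erased, so only the methods of `Shape` can be called here.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

// Builds a heterogeneous container of trait objects.
fn sample_shapes() -> Vec<Box<dyn Shape>> {
    vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))]
}
```

Adding a third data type only needs another `impl Shape for ...`. Adding a new operation for the elements already stored in `sample_shapes()` is not possible without changing `Shape`, which is the limitation the footnote describes.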

keean commented 7 years ago

@shelby3 I am thinking along the same lines. As far as I understand it there is no subtyping. With type classes when we say trait B extends trait A we mean that any implementation of trait B for a type requires there to also be an implementation of trait A, but that is it.

This applies to trait-bounds (type constraints) and trait-objects. Trait-objects are not really a valid type-system concept; these are more correctly called existential types. In Haskell, if we have an existential data type like this:

data Singleton = forall a. (TraitB a) => Singleton a

The forall a introduces a new scoped type variable, and because its scope is the container, we cannot ever from outside the container know the type of a, but it is constrained to implement trait B and that implies it also must implement trait A. This means we can call any function from the interfaces trait A and trait B on the contents of the container. An alternative way of writing the above (not valid Haskell) is:

data Singleton = Singleton (exists a . (TraitB a) => a)

which is where the term 'existential type' comes from. Generally forall corresponds to the logical concept of any (but may be none), and exists corresponds to the logical concept of some (but at least one).
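For readers more familiar with Rust, the existential `Singleton` can be approximated with a boxed trait object. This is only a sketch under assumptions: the method names `name` and `greet` and the type `English` are invented for illustration, and `TraitB: TraitA` is Rust's way of saying that any implementation of `TraitB` requires an implementation of `TraitA`.

```rust
trait TraitA {
    fn name(&self) -> String;
}

// `TraitB: TraitA` means every type implementing TraitB must also
// implement TraitA. It does not make TraitB a subtype of TraitA.
trait TraitB: TraitA {
    fn greet(&self) -> String;
}

// Hypothetical data type used only for this illustration.
struct English;

impl TraitA for English {
    fn name(&self) -> String {
        "English".to_string()
    }
}

impl TraitB for English {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

// Rough analogue of `data Singleton = forall a. (TraitB a) => Singleton a`:
// the concrete type inside the box is hidden from the outside.
struct Singleton(Box<dyn TraitB>);

// Functions from both interfaces can be called on the hidden contents.
fn describe(singleton: &Singleton) -> String {
    format!("{}: {}", singleton.0.name(), singleton.0.greet())
}
```

From outside the container the type of the contents is unknown, yet `describe` can still use the interfaces of both traits.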

shelby3 commented 7 years ago

Agreed no subclassing issue on the type parameters of a typeclass.

shelby3 commented 7 years ago

@keean wrote:

I didn't see any mention of type-classes when I looked at Ceylon. It also looks like Ceylon provides classical object inheritance and subtyping, which are both things ZenScript aims to avoid to keep the language small and simple.

Don't we need to be careful about differentiating between subclassing versus subtyping?

ZenScript will have subtyping because it will offer structural unions. An Int is a subtype of an Int|String. ZenScript will not have subclassing.

Ceylon has anonymous structural unions, but it doesn't have typeclasses. <truthful hyperbole>Also it has that Jurassic, ossifying, rigor mortis, anti-pattern subclassing paradigm, which will infect with that said virus any program written with Ceylon. Ditto every statically typed language that offers subclassing including Java and Scala (because programmers won't be disciplined enough to not use the subclassing virus in their design patterns).</truthful hyperbole> :stuck_out_tongue_winking_eye:


Update:

@naasking wrote:

Ceylon models Java's subtyping via first-class union and intersection types. It's not at all a classical subtyping model.

He replied as quoted above after I wrote the above. Please don't use the word 'subtyping' where you really mean 'subclassing'.

@naasking is apparently unaware of what the programmer can't accomplish with unions and intersections in terms of the Expression Problem if the language does not have the delayed binding of implementation to interface that typeclasses provide.

shelby3 commented 7 years ago

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Thus we must use a data type to express this:

data List a = Nil | Cons a (List a)

Or add member property names if we don't want to be forced to destructure with pattern matching:

data List a = Nil | Cons { head :: a, tail :: (List a) }

In ZenScript perhaps:

data List<A> = Nil | Cons(A, List<A>)

Or perhaps:

interface List<A>
singleton Nil implements List<Never>  // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>

Note afaik, Haskell's data type expresses a 'hasA' not an 'isA' subclassing relationship between Cons and List, and a cannot be heterogeneous because Haskell doesn't have a first-class union type (their global inference doesn't allow for it). If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

But note even though the above is subtyping, there are no virtual methods on data types, thus no virtual inheritance and no subclassing. Of course any function can input a data type, so that is synonymous with a method, except it isn't virtual dispatch.
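As a point of comparison, the same data type can be sketched in Rust, which also has no virtual methods on data types. The helper names `cons` and `length` are assumptions made for this example. `length` is an ordinary function that takes the data type as input and is resolved statically, not by virtual dispatch.

```rust
// Rust analogue of `data List a = Nil | Cons a (List a)`.
// The tail is boxed because a recursive type needs a known size.
#[derive(Debug, PartialEq)]
enum List<A> {
    Nil,
    Cons(A, Box<List<A>>),
}

// Hypothetical helper that hides the boxing of the tail.
fn cons<A>(head: A, tail: List<A>) -> List<A> {
    List::Cons(head, Box::new(tail))
}

// A plain function over the data type: it plays the role of a method,
// but the call is resolved at compile time.
fn length<A>(list: &List<A>) -> usize {
    match list {
        List::Nil => 0,
        List::Cons(_, tail) => 1 + length(tail),
    }
}
```

Unlike the ZenScript proposal, Rust will not infer a union for the element type, so `cons("string", cons(1, List::Nil))` is rejected by the type checker.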


  1. Since an eager, CBV language makes Bottom an effect, I have argued the name of the Bottom type should be Never (which appears to be the choice TypeScript made) and not Nothing. Bottom is, for example, the type of functions that never return (terminate). Whereas in a lazy, CBN language such as Haskell, Bottom becomes a value, so I argue it should be named Nothing and not Never.
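An eager language can model such an uninhabited type directly. The following Rust sketch uses an empty enum. The names `Never` and `unwrap_infallible` are assumptions for this example, and the standard library offers the similar `std::convert::Infallible`. Because no value of `Never` can ever be constructed, the error branch is provably unreachable.

```rust
// An enum with no variants has no values, so it behaves as a bottom type.
enum Never {}

// Since `Never` is uninhabited, a `Result<T, Never>` can only be `Ok`.
fn unwrap_infallible<T>(result: Result<T, Never>) -> T {
    match result {
        Ok(value) => value,
        // An empty match on an uninhabited type is accepted by the
        // compiler and needs no arms.
        Err(never) => match never {},
    }
}
```

This mirrors the `Nil implements List<Never>` idea: a container whose element type is uninhabited can only be empty.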

keean commented 7 years ago

Or, with types

data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>

keean commented 7 years ago

As a single type with multiple constructors (tagged union):

data List<A> = Nil | Cons<A, List<A>>

I'm still not sure about which keywords we should be using for trait, impl and data.

shelby3 commented 7 years ago

@keean wrote:

data Cons<A, B> = Cons<A, B>

data List<A> = Nil | Cons<A, List<A>>

That seems incorrect. It doesn't tell me how to construct an instance of Cons. Mine was correct:

data List<A> = Nil | Cons(A, List<A>)

And we can add member property names if we don't want to be forced to employ pattern matching to destructure:

data List<A> = Nil | Cons(head: A, tail: List<A>)

shelby3 commented 7 years ago

@keean wrote:

data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>

That seems incorrect. I think it should be instead:

data Nil = Nil
data Cons<A> = Cons(A, List<A>)  // Edit: `List` is a typo and should be `Cons` per subsequent discussion
trait List
impl List Nil
impl<A> List Cons<A>

Remember the trait List should know nothing about A if its methods don't specialize on A.

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Note I'd prefer to write that:

pluggable List
implement Nil for List
implement Cons<A> for List   // no need to write the type parameters twice per my
                             // proposal¹ that all UPPERCASE names are type parameters

Or much better:

pluggable List
Nil implements List
Cons<A> implements List

I think the last is best because it remains very similar to Java, yet we change the meaning of what is being implemented from interface ('isA' relationship) to pluggable ('hasA' relationship).

Thinking about a typeclass as a pluggable interface seems very intuitive. We can't use Rust's trait because trait has a different meaning in several other languages.

Q: "What is a pluggable API?" A: "It means that you can replace the implementation."

Remember we both decided that clarity trumps brevity (e.g. implement instead of impl), especially for syntax which is not expression-level (such declarations won't appear often, because what appears most frequently in source code is expressions).

  1. https://github.com/keean/zenscript/issues/6#issuecomment-248742773

keean commented 7 years ago

Not quite :-) some things to discuss. You have given value constructors round brackets, that seems okay to me.

data List<A> = Nil | Cons(head: A, tail: List<A>)

Normally the arguments to cons are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:

data List<A> = Nil | Cons(A, List<A>)
data List<A> = Nil | Cons {head: A, tail: List<A>}

We don't have to stick to that but it's how I was thinking.

This has more problems:

data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters

So correcting this:

data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>

Note this is still Rust syntax that gives special treatment to the first type class parameter, and I am not sure that is best, but let's have a different topic for that when we have agreed this.

shelby3 commented 7 years ago

@keean wrote:

You have given value constructors round brackets, that seems okay to me.

Yeah to differentiate them from type constructors, and because value constructors in the Java-like languages use round brackets (aka parenthetical grouping).

Normally the arguments to Cons are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:

You are repeating what I wrote. I even linked to the Haskell record syntax upthread.

However the following is not naming the members of Cons (rather is only providing their positions and types), and can only be destructured with pattern matching as I already wrote in my prior comment:

data List<A> = Nil | Cons(A, List<A>)

And to stick with the Java-like syntax (and not mixing in Haskell syntax), I would prefer the following which I think will be much more clear to mainstream programmers coming from popular programming languages:

data List<A> = Nil | Cons(head: A, tail: List<A>)

The following is mixing a JavaScript unnamed Object with some new concept of a tag of Cons, which has no analogous concept to people using JavaScript or OOP languages (and our guiding principle is not to introduce unnecessary syntax, i.e. the new concept of a { ... } where we don't need to):

data List<A> = Nil | Cons {head: A, tail: List<A>}

shelby3 commented 7 years ago

@keean wrote:

This has more problems:

data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters

I had a typo and Cons does not need two type parameters (two would mess up other things):

data Nil = Nil
data Cons<A> = Cons(A, Cons<A>)
trait List
impl List Nil
impl<A> List Cons<A>

The type of the type parameter A in Cons<A> will be subsumed to the LUB (i.e. the union) of the two types used to construct a Cons. I had already explained that as follows.

@shelby3 wrote:

If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

Note in the above quoted text, I was referring to a data type List not a typeclass List. Refer to that quoted comment for the declaration I employed there (which differs from the List in this comment).


@keean wrote:

So correcting this:

That is still incorrect. You have a type parameter A on typeclass List which I already explained (and you even agreed!) is incorrect as follows.

@shelby3 wrote:

Remember the trait List should know nothing about A if its methods don't specialize on A.

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Follow the link in the above quote to see where you had agreed. In fact, you were the one who explained the issue to me. And now it seems you forget what you explained to me.

Actually the above is revealing a deeper issue to me about higher-kinds which I had realized when I woke up this morning. I am preparing to write about that.

keean commented 7 years ago

The {} have the same use in 'C' for structs, C++ and Java for object definition so they are not new as such.

In C, C++ and Rust we would write:

struct Cons { 
    head : Int, // this syntax is different for C
}

Which we are writing:

data Cons = Cons {
    head : Int
}

Or

data Cons = Cons (
    head : Int
)

I am happy with either, providing the field names are optional, but I wanted to point out that the data statement can be viewed as an extension of struct and object definition.

keean commented 7 years ago

@shelby3 wrote

I had a typo and Cons does not need two type parameters (two would mess up other things):

Yes it does, what you wrote cannot ever end in a Nil.

keean commented 7 years ago

@shelby3 this version is correct in Rust syntax:

data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>

Cons needs a second type parameter because B can either be another Cons or a Nil which are different types.

The trait List needs a type parameter for the type that is in the list, which is not the same as the type which is a member of the class (that is Nil or Cons<A, B>).

When we add Cons to the List type class we need to constrain B to be in the List type class so that you cannot put any random type as the second Cons parameter.
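The declarations above are written in a hybrid notation, so here is one way they might look as compilable Rust. This is a sketch under assumptions: the method `to_vec` is added only so there is something to call, and it is not part of the original proposal. The bound `B: List<A>` on the second impl is what stops an arbitrary type from being used as the tail.

```rust
// Separate types for the empty list and the cons cell.
struct Nil;
struct Cons<A, B>(A, B);

// `A` is the element type tracked by the type class. `Self` is the
// list type itself (Nil or Cons<A, B>).
trait List<A> {
    // Hypothetical method: collects references to the elements in order.
    fn to_vec(&self) -> Vec<&A>;
}

// Nil is a list of any element type.
impl<A> List<A> for Nil {
    fn to_vec(&self) -> Vec<&A> {
        Vec::new()
    }
}

// Cons<A, B> is a list of A only when its tail B is also a list of A.
impl<A, B: List<A>> List<A> for Cons<A, B> {
    fn to_vec(&self) -> Vec<&A> {
        let mut items = vec![&self.0];
        items.extend(self.1.to_vec());
        items
    }
}
```

With these definitions `Cons(1, Cons(2, Nil))` implements `List<i32>`, while `Cons(1, "not a list")` has no `List` implementation, so calling `to_vec` on it fails to compile.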

keean commented 7 years ago

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List to track the type of the elements stored in the list

I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)

shelby3 commented 7 years ago

@keean wrote:

Yes it does, what you wrote cannot ever end in a Nil.

Yup. Your idea was fundamentally flawed. We can't express a generic List type as typeclass interface. You'd have to implement the List for every possible data type you can put into a List which is the antithesis of a generic List. I was responding that yours was incorrect and I was demonstrating that if I try to write it correctly, I can't.

That is why I had written (before you commented with the erroneous idea) the correct way to define a generic List as follows.

@shelby3 wrote:

In ZenScript perhaps:

data List<A> = Nil | Cons(A, List<A>)

Or perhaps:

interface List<A>
singleton Nil implements List<Never>  // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>

@keean wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List to track the type of the elements stored in the list.

I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)

You weren't wrong the first time. It makes no sense to specialize the List on every data type we can add to the List.

keean commented 7 years ago

Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.

shelby3 commented 7 years ago

@keean wrote:

In C, C++ and Rust we would write:

struct Cons { 
    head : Int, // this syntax is different for C
}

Seems I recall that in the early days of C, it was only possible to use typedef to give a name to a struct.

I forget about struct because I rarely code in C any more (and C++ I haven't touched since I stopped coding CoolPage in 2002). And when I think about struct from C, I don't think in terms of a language with objects and high-order typing concepts, since Java, Scala don't have struct. So I guess that is why I didn't relate it. And afaik, { ... } in JavaScript is not tagged with a name, e.g. Cons.

shelby3 commented 7 years ago

@keean wrote:

Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.

I will need to review this, so I can comment meaningfully. Where may I read the most succinct example which shows how I won't have to specialize the typeclass list for every data type I want to put into the list?

I presume a lot of HList boilerplate again?

Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript. If you are expecting mainstream programmers to use HList, I doubt it. But I need to review the examples before I can comment not just from guessing. The link above is I think probably instructive about this.

keean commented 7 years ago

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We can look at Peano numbers as another example:

data Z = Z
data S x = S x
class Nat n
instance Nat Z
instance (Nat a) => Nat (S a)

Note the main difference between Haskell type classes and Rust traits syntactically is that a Rust trait has a concept of 'self' but Haskell does not. You can liken this to function syntax:

x.f(y) // object special syntax like Rust
f(x, y) // all parameters equal, better for multiple dispatch

Likewise with type classes Rust makes the first type parameter special, so the Peano numbers above become:

struct Z {}
struct S<A>(A); // Rust tuple syntax
trait Nat {} // note no type parameter
impl Nat for Z {}
impl<A: Nat> Nat for S<A> {}
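To show that such type-level numbers are resolved entirely at compile time, the following Rust sketch extends the `Nat` trait with an associated constant. The constant `VALUE` and the alias `Three` are assumptions added for this example and do not appear in the thread.

```rust
// Type-level Peano numbers: Z is zero, S<N> is the successor of N.
struct Z;
struct S<N>(N);

trait Nat {
    // Hypothetical associated constant giving the number's value.
    const VALUE: u32;
}

impl Nat for Z {
    const VALUE: u32 = 0;
}

// S<N> is a natural number only if N is one, so the recursion is
// checked and evaluated by the compiler.
impl<N: Nat> Nat for S<N> {
    const VALUE: u32 = N::VALUE + 1;
}

// The number three, expressed purely as a type.
type Three = S<S<S<Z>>>;
```

No value of `Three` is ever constructed. `<Three as Nat>::VALUE` is computed from the type alone, which matches the later remark that these numbers have to be statically determined.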

shelby3 commented 7 years ago

@shelby3 wrote:

You weren't wrong the first time. It makes no sense to specialize the List on every data type we can add to the List.

.

I presume a lot of HList boilerplate again?

Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript.

I expect you are taking what should be an orthogonal concept of a generic list and binding it to the data type in the list, and then using some boilerplate scaffolding to simulate genericity? This appears to be the basic theme of HList concepts as far as I can discern thus far (I may be wrong?), to sidestep a weakness in the type system and simulate type system in code with scaffolding?

shelby3 commented 7 years ago

I had started to sniff a problem yesterday. I was starting to realize we probably have an unsolved problem in the design incorporating first-class anonymous structural unions.

@keean wrote:

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.

Thus afaik in the above b will also be the same as a | Nothing which is just a where Nothing is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language¹).

So afaics, that is not specializing the List typeclass for every data type a that can be put into the List because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will ever be only two implementations (aka instances) of List: Nil and List (Cons a b) where b is (List a) | Nil and Nil is List Nothing.

I presume the same for Rust, but bottom type instead of top.

But for ZenScript we are proposing to support heterogeneous lists, so I am trying to figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.

I suppose you are saying we can simulate heterogeneous lists with HList concepts, but the point of the first-class union was to eliminate that boilerplate and make everything more extensible as I had attempted to explain/discuss at the Rust forum:

I presume a lot of HList boilerplate again?

It is possible you didn't realize how extensively I wanted to use the first-class unions. Perhaps you were thinking we'd be using HList concepts instead?

We have a design quagmire now. I am trying to get my mind wrapped around it. I am suspecting we have a failure now in my concept, but I need to confirm by understanding this design quagmire fully.


  1. Something I published at the now defunct copute.com in 2011 when I was teaching myself some type theory (can still be found on archive.org):

    Inductive and coinductive types are categorical duals (if they produce the same objects in reversed partial-order), because inductive and coinductive category morphism functions have reversed directions[8]. The initial fixedpoint must be the least in the partial-order, thus inductive types have objects which are the output of a unique morphism function (i.e. the algebra recursively) that inputs the initial fixedpoint. Dually, the final object must be the greatest in the partial-order, thus the coinductive types have objects which are the side-effect "output" of a unique morphism function (i.e. the coalgebra recursively), which terminates with the final object when an object of the type is destructed.

    Since Monad and Comonad are categorical duals, they compose on outputs or inputs respectively.

    [8] Declarative Continuations and Categorical Duality, Filinski, section 1.3.1 Basic definitions.

shelby3 commented 7 years ago

Even more edits to my prior comment. I am suspecting potential failure of my design concept. :hurtrealbad: :sob:

shelby3 commented 7 years ago

@keean wrote:

We can look at Peano numbers as another example:

Haskell doesn't have any inductive types, thus it doesn't have (the type of) Peano numbers.

We have to be careful when using Haskell's coinductive call-by-name type system where laziness-by-default makes non-termination a value, as a model for an inductive type system with eager evaluation strategy by default that adds first-class unions. Many aspects appear to change.

keean commented 7 years ago

A couple of things.

First I think we should support multi-parameter type classes (with type-families, aka associated types) in full. People do not have to use the full power of this, but I don't want a false ceiling to the abstractions we can build.

Haskell has iso-recursive types (not equi-recursive) so it does have a kind of inductive type, thus Peano numbers work in Haskell :-) I can go on to define addition, subtraction, full arithmetic in the type system. Using polymorphic recursion you can even convert a value to a type for faked dependent types, but I don't think we should support this... That is why we are using parametric types not universally quantified types.

So in our system those Haskell type Peano numbers have to be statically determined (effectively known at compile time). They cannot support runtime polymorphism without combining with existential types.

We are discussing different ways to do things, first-class union types give us:

type List<a> // a partial declaration which we need to tie the knot
data Nil = Nil
data Cons<a> = Cons(head: a, tail: List<a>)
type List<a> = Nil | Cons<a> // tie the knot, note the RHS are types in a 'type' declaration

keean commented 7 years ago

Regarding laziness, nothing above changes with regard to the type system; all the Haskell types can be annotated with strictness annotations to make them strict. The type system has to be evaluated at compile time. Non-termination of typing means the program won't compile; it has nothing to do with laziness at runtime. All the examples I have given work fine in Rust, which is eager.

shelby3 commented 7 years ago

@keean wrote:

all the Haskell types can be annotated with strictness annotations to make them strict.

Strictness annotations don't remove Bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language. Bottom is a value (and populated in every type) in Haskell. Whereas, in an inductive language, Bottom is never instantiated.

keean commented 7 years ago

@shelby3 wrote:

Strictness annotations don't remove bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language.

It doesn't matter. The types are valid in both lazy and eager languages. In Haskell every type contains 'bottom', and in Rust and eager languages they do not. Everything else is the same.

The other point is I think it is worth avoiding objects altogether; the recursive typing from 'Self' makes things really complicated to understand (see Scala). I think if you have records ('C' structs) and type classes, you won't miss objects at all.

Edit: Of course if we decide to have objects that's fine, but I would like to make the case for not having them, which I think works better with multi-parameter dispatch. Modules take over the namespacing part of objects. New issue here: https://github.com/keean/zenscript/issues/9

shelby3 commented 7 years ago

@keean wrote:

It doesn't matter.

Haskell doesn't have disjunctive coproducts aka categorical sums.

I wrote at my copute.com circa 2011:

Trade-offs

CBV and CBN are categorical duals[10] (see also), because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then CBV evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, CBN evaluates from the trunk down to the branch tips. CBV doesn't have conjunctive products ("and", a/k/a categorical "products") and CBN doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[9].

Non-termination

At compile-time, functions can't be guaranteed to terminate.

Eager

With CBV but not CBN, for the conjunction of Head "and" Tail, if either Head or Tail doesn't terminate, then respectively either List( Head(), Tail() ).tail == Tail() or List( Head(), Tail() ).head == Head() is not true because the left-side doesn't, and right-side does, terminate.

Whereas, with CBN both sides terminate. Thus CBV is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.

Lazy

With CBN but not CBV, for the disjunction of 1 "or" 2, if f doesn't terminate, then List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail is not true because the left-side does, and right-side doesn't, terminate.

Whereas, with CBV both sides non-terminate so the equality test is never reached. Thus CBN is too lazy with disjunctive coproducts, and in those cases non-terminates (including runtime exceptions) after doing more work than CBV would have.

[9] Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.

[10] Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.

keean commented 7 years ago

So we were supposed to be discussing subtyping here. If we don't stick to topic it will make it hard to find discussions in the future. I think we can discuss type-class syntax elsewhere.

I definitely want eager evaluation. Pervasive laziness is a big pessimisation. Some kind of co-routines or yield would provide stream-like functionality.

What are our conclusions about subtyping?

shelby3 commented 7 years ago

How will we implement a heterogeneous list that has an element type of a first-class anonymous structural union?

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

The answer seems to impact how I will think about how subtyping interacts with our unions and typeclasses.

keean commented 7 years ago

data List = Nil() | Cons(Int | String, List)

However, this is using | to union types and create sum types, which I think is confusing. Some alternatives with distinction between the two:

data List = Nil() | Cons(Int \/ String, List)

or

data List = Nil() + Cons(Int | String, List) 

Of course if it's parametric on element:

data List<a> = Nil() + Cons(a, List<a>)

and that can be instantiated with a union type.

shelby3 commented 7 years ago

@keean wrote:

data List = Nil() | Cons(Int | String, List)

Not generic. Breaks other externalities dealing with extension.

data List<a> = Nil() + Cons(a, List<a>)

Explain how to use this per other requirements I stated.

@shelby3 wrote:

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

keean commented 7 years ago

Another possibility:

data List = Nil() | Cons([Int, String], List)

shelby3 commented 7 years ago

@keean wrote

data List = Nil() | Cons([Int, String], List)

As you requested not to do, are we going to continue mixing off-topic choice-of-preferred-syntax discussions in a conceptual Issue #8 about interaction of type system features?

keean commented 7 years ago

As you requested not to do, are we going to continue mixing syntax discussions in a conceptual Issue about interaction of type system features?

I think it's better to try and stick to subtyping here... as long as we both understand the notation we are using.

In some regards we can imagine some boilerplate like this:

data UnionIntFloatString = I(Int) | F(Float) | S(String) 

Really we just want to allow the type system to infer the above for various unions automatically. 'I', 'F' and 'S' are the runtime type tags that you can case match on.
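As a rough illustration of that boilerplate, here is a minimal TypeScript sketch of a tagged union with runtime tags that can be case matched. The tags `I`, `F` and `S` come from the declaration above; the `tag`/`value` field names and the `describe` function are assumptions made up for this example, and TypeScript has no separate Int type, so both `I` and `F` carry a `number`.

```typescript
// Hypothetical encoding of: data UnionIntFloatString = I(Int) | F(Float) | S(String)
type UnionIntFloatString =
  | { tag: "I"; value: number }  // Int (TypeScript only has number)
  | { tag: "F"; value: number }  // Float
  | { tag: "S"; value: string }; // String

// Case match on the runtime tag; `value` is narrowed in each branch.
function describe(u: UnionIntFloatString): string {
  switch (u.tag) {
    case "I":
      return "int " + u.value.toFixed(0);
    case "F":
      return "float " + u.value.toFixed(2);
    case "S":
      return "string " + u.value;
  }
}
```

The point of the proposal is that the compiler would generate something equivalent to this for each anonymous union, so the programmer never writes the tags by hand.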

shelby3 commented 7 years ago

@shelby3 wrote:

data List<a> = Nil() + Cons(a, List<a>)

Explain how to use this per other requirements I stated.

@shelby3 wrote:

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

Btw, I think the solution is going to require higher-kinds. I will open a new Issue on Higher-kinds when I return from jogging. I already composed some of the OP for that new Issue.

keean commented 7 years ago

No need for higher kinds yet.

data List<a> = Nil() + Cons(head: a, tail: List<a>) // using + for sum types

append_it = (list, x) =>
    Cons(head: if x then "ABC" else 123, tail: list)

print_it = (list) => 
    typematch list.head:
        String(s) -> print_string(s)
        Int(i) -> print_int(i)
    if list.tail /= Nil():
        print_it(list.tail)

Inferred types

append_it(list : List<String | Int>, x : Bool) : List<String | Int>
print_it(list : List<String | Int>)

Inferring the type of Nil() is tricky but not impossible: it is effectively typed as List<a>, where a is a floating (un-grounded) type variable. Later in the program a needs to get grounded at some point, because an un-grounded type variable in a program is a type-checking error.
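For comparison, the same two functions can be sketched in TypeScript, which already has anonymous unions. This is only an approximation under stated assumptions: the `kind` field, the `nil` helper and the returned string array (instead of printing) are inventions for this example, the types are written out rather than inferred, and `typeof` stands in for `typematch`.

```typescript
// Sketch of: data List<a> = Nil() + Cons(head: a, tail: List<a>)
type List<A> =
  | { kind: "Nil" }
  | { kind: "Cons"; head: A; tail: List<A> };

// Nil is typed List<A> for any A; A is grounded by the context of each call.
function nil<A>(): List<A> {
  return { kind: "Nil" };
}

function appendIt(list: List<string | number>, x: boolean): List<string | number> {
  return { kind: "Cons", head: x ? "ABC" : 123, tail: list };
}

// Walks the list and reports which member of the union each element is.
function printIt(list: List<string | number>): string[] {
  const out: string[] = [];
  let cur = list;
  while (cur.kind === "Cons") {
    if (typeof cur.head === "string") {
      out.push("String(" + cur.head + ")");
    } else {
      out.push("Int(" + cur.head + ")");
    }
    cur = cur.tail;
  }
  return out;
}
```

A call such as `printIt(appendIt(appendIt(nil(), false), true))` grounds the type variable of `nil()` to `string | number` from the parameter type of `appendIt`.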

shelby3 commented 7 years ago

Generic sort typeclass which works for any container type that implements it?

keean commented 7 years ago
quicksort<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
        where Cmp<ValueType<A>>, IndexedIterator<A> =>
    if lo < hi then
        p = partition(c, lo, hi)
        quicksort(c, lo, p)
        quicksort(c, p + 1, hi)

partition<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
        where Cmp<ValueType<A>>, IndexedIterator<A> =>
    pivot := c[lo]
    i = lo - 1
    j = hi + 1
    while true:
        do: 
            i := i + 1
        while c[i] < pivot

        do:
            j := j - 1
        while c[j] > pivot

        if i >= j :
            return j

        swap(c[i], c[j])

Note the associated type ValueType<A>. I am not certain of the syntax, but a value type is like an 'output' type from a type-class, which is defined by a particular instance.

The key thing is the comparison operators, which would be type-class operators. Something like:

trait Cmp<A>:
    `<`<A>(x : A, y : A) : Bool
    `>`<A>(x : A, y : A) : Bool

This would have to be defined for whatever the contents of the list were, so we would need some kind of instance like:

impl Cmp<Int | String>:
    ...

We need an implementation for Int | String as we need to define the relative ordering of both types in one dimension.
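To make the idea concrete, here is a hedged TypeScript sketch of the same quicksort with the comparison supplied as an explicit dictionary, since TypeScript has no type-classes. The `Cmp` interface mirrors the trait above; the names `lt`, `gt`, `ltIntOrString` and `cmpIntOrString`, the use of a plain array in place of `IndexedIterator`, and the chosen ordering (all numbers before all strings) are assumptions for illustration only.

```typescript
// Dictionary-passing stand-in for `trait Cmp<A>`.
interface Cmp<A> {
  lt(x: A, y: A): boolean;
  gt(x: A, y: A): boolean;
}

type IntOrString = number | string;

// One possible total order over the union: numbers sort before strings.
function ltIntOrString(x: IntOrString, y: IntOrString): boolean {
  if (typeof x === "number" && typeof y === "number") return x < y;
  if (typeof x === "string" && typeof y === "string") return x < y;
  return typeof x === "number"; // mixed case: the number is the smaller one
}

const cmpIntOrString: Cmp<IntOrString> = {
  lt: ltIntOrString,
  gt: (x, y) => ltIntOrString(y, x),
};

// Hoare partition around c[lo], as in the pseudo-code above.
function partition<A>(c: A[], lo: number, hi: number, cmp: Cmp<A>): number {
  const pivot = c[lo];
  let i = lo - 1;
  let j = hi + 1;
  while (true) {
    do { i++; } while (cmp.lt(c[i], pivot));
    do { j--; } while (cmp.gt(c[j], pivot));
    if (i >= j) return j;
    const tmp = c[i];
    c[i] = c[j];
    c[j] = tmp;
  }
}

function quicksort<A>(c: A[], lo: number, hi: number, cmp: Cmp<A>): void {
  if (lo < hi) {
    const p = partition(c, lo, hi, cmp);
    quicksort(c, lo, p, cmp);
    quicksort(c, p + 1, hi, cmp);
  }
}
```

Sorting `[3, "b", 1, "a", 2]` with `cmpIntOrString` places the numbers in ascending order followed by the strings, which is the "relative ordering of both types in one dimension" mentioned above.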

keean commented 7 years ago

As a follow up, "Elements of Programming" gives the following type for sort_n (page 207) which is a little better thought through than mine:

sort_n<I, R>(f : I, n : DistanceType<I>, r : R)
        where Mutable<I>, ForwardIterator<I>, Relation<R>, ValueType<I> == Domain<R> =>

Which allows the comparison function r to be passed into sort.

This requires a type-equality operator, which is in effect an infix type-class, and could be written TypeEq<ValueType<I>, Domain<R>> the definition is straightforward:

trait TypeEq<X, Y>
impl<X> TypeEq<X, X>
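
In a language without associated types, one way to approximate the `ValueType<I> == Domain<R>` constraint is to share a single type parameter between the container and the relation. The TypeScript sketch below does that; `Relation` and `sortN` are hypothetical names, a plain array stands in for the iterator, and insertion sort is swapped in for brevity (it is not the algorithm from the book).

```typescript
// Stand-in for Relation<R> with Domain<R> = T.
type Relation<T> = (a: T, b: T) => boolean;

// Sorts the first n elements of xs in place.
// Using the same T for xs and r plays the role of the type-equality constraint.
function sortN<T>(xs: T[], n: number, r: Relation<T>): void {
  for (let i = 1; i < n; i++) {
    const x = xs[i];
    let j = i - 1;
    // Shift elements that should come after x one slot to the right.
    while (j >= 0 && r(x, xs[j])) {
      xs[j + 1] = xs[j];
      j--;
    }
    xs[j + 1] = x;
  }
}
```

Passing a relation whose domain differs from the element type is rejected at compile time, because no single `T` satisfies both arguments.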
shelby3 commented 7 years ago

Ah, what you first wrote 2 hours ago (with A<B> which you've replaced) was very incomplete, i.e. you didn't define partition nor the < operator on the generic types. So I reloaded the page and saw you've gone off on a similar thought process that I've been going through. I will review your code now. I was off on a music tangent for the past 2 hours. I spontaneously needed a morsel-sized respite from the compsci stuff.

Indeed it brings back to focus the discussions we had at the Rust forum about iterators and whether higher-kinds (or just a self type) are needed.

shelby3 commented 7 years ago

I don't see where swap(c[i], c[j]) is input? Also this is an invariant List (required by subsuming heterogeneous unions remember) so you can't implement a swap, because if it were possible to mutate it would break other references into various Cons in the single-linked-list.

Soon you will discover why I said I think we need higher-kinds. Hint: we need a factory.

keean commented 7 years ago

@shelby3 wrote:

I don't see where swap(c[i], c[j]) is input

swap is a generic function that swaps two values. In this case array lookup returns a reference, so swap is exchanging two values of whatever type the collection holds. It is a top level function defined like this:

swap<A>(x : A, y : A) where Mutable<A> =>
    tmp = read(x)
    write(x, read(y))
    write(y, tmp)

Note it is defined on Mutable. In my code above just consider IndexedIterator extends Mutable for simplicity.

For a heterogeneous array the values would both be of type String | Int for example (every value in the array must have the same type, but that type can be a union).

There are no singly linked lists in my code, an IndexedIterator implies the values in the container are individually addressable by an index.

However it would not make any difference if it was a linked list, as the whole list is constrained by a single type bound:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

This list is monomorphic: every node has type List<a>, so there is no problem swapping values because they all have type a.

If you want to know whether any given value is Int or String you would have to typematch on the type.
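
A small TypeScript sketch of swapping through mutable references may help here. The `Ref` interface and the `slot` helper are hypothetical: they stand in for `Mutable` and for the reference that `c[i]` is assumed to return in the pseudo-code above.

```typescript
// Hypothetical encoding of a Mutable reference.
interface Ref<A> {
  read(): A;
  write(value: A): void;
}

// A reference to slot i of an array.
function slot<A>(c: A[], i: number): Ref<A> {
  return {
    read: () => c[i],
    write: (value: A) => { c[i] = value; },
  };
}

// Exchanges the values behind two references of the same type A.
function swap<A>(x: Ref<A>, y: Ref<A>): void {
  const tmp = x.read();
  x.write(y.read());
  y.write(tmp);
}
```

With `A` instantiated to `string | number`, `swap` never needs to know which member of the union sits in either slot; only code that inspects a value has to match on its type.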

shelby3 commented 7 years ago

@keean wrote:

For a heterogeneous array

Can't have subsuming heterogeneous unions on a variant data structure. I had mentioned this numerous times in my explanation of my solution to the Expression Problem, but I hadn't emphasized the "subsumption" aspect until now (although I did mention it in passing and also during our discussions on the Rust forum).

I presume you are forgetting that with a List we can add new types to the disjunction (aka union) at the head which are not in the tail of the list, which the array data structure can't allow (we can't mutate the type of an instance of an array, but we can mutate the type of a new Cons head which I had explained to you in some comment over the past days). I have all the design concepts of my union and Expression Problem solution loaded up in my head, so I am aware of all these factors.

I raised the challenge of a generic implementation that by implication will work on invariant lists (given our unions would require them), which afaics your code above does not accomplish.

I am thinking we will need a factory to accomplish it, thus higher-kinds for the genericity. I am thinking of a Monoid typeclass.

keean commented 7 years ago

Well, you can have a heterogeneous list, and you can swap the elements, as all the elements have the same type. The type definition clearly states this:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

Note how the tail has type List<a> the same as the left-hand-side. So the a is the same everywhere in the list.

The a can be a union type, and if you append any type to the list that type would be part of the single union type for the whole list. The compiler would have to analyse the code and gather all the possible types that could be put in the list and make a the union of all types that get added to the list.

That's what the type signature says. If you want different behaviour, you will need a different datatype.

shelby3 commented 7 years ago

@keean wrote:

as all the elements have the same type, it clearly states this in the type definition:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

No it doesn't unless you presume Haskell's lack of subsumption.

I already had twice provided an example of augmenting the union type for the head when we construct a Cons.

@shelby3 wrote:

The type of the type parameter A in Cons<A> will be subsumed to the GLB of the union of two types used to construct a Cons. I had already explained that as follows.

@shelby3 wrote:

If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

Note in the above quoted text, I was referring to a data type List not a typeclass List. Refer to that quoted comment for the declaration I employed there (which differs from the List in this comment).

Note Number is a subtype of Number|String.

You are apparently still thinking in terms of Haskell and its inability to subsume to a first-class union, which doesn't apply as I had already explained upthread.

@shelby3 wrote:

@keean wrote:

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.

Thus afaik in the above b will also be the same as a | Nothing which is just a where Nothing is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language¹).

So afaics, that is not specializing the List typeclass for every data type a that can be put into the List because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will only ever be two implementations (aka instances) of List: Nil and List (Cons a b) where b is (List a) | Nil and Nil is List Nothing.

I presume the same for Rust, but bottom type instead of top.

But for ZenScript we are proposing to support heterogeneous lists, so I am trying to figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.

To clarify (and correct) the quoted text, Haskell doesn't allow subsumption to a common supertype (aka GLB). There is no subtyping in Haskell, the entire type system is upside down coinductive. Whereas, ZenScript is proposing to support subsumption to the supertype union. And this means that attempting to append an element with a different type requires that data structures be invariant, which means an array will not type check. But a list will. Btw, I had explained all of this to you at the Rust forum, and I am remembering now. But it hadn't clicked for you yet, so I was probably speaking gibberish or Klingon.
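
The widening described above (Cons("string", x) producing List<Number|String> from x : List<Number>) can be sketched in TypeScript with an immutable list. The `kind` field and the `nil`, `cons` and `toArray` helpers are assumptions for this example. Note that TypeScript is more permissive than what is argued here: it also treats mutable arrays covariantly, so the sketch only illustrates the inferred union types, not the restriction on arrays.

```typescript
// Immutable singly linked list; readonly fields model that a Cons is never mutated.
type List<A> =
  | { readonly kind: "Nil" }
  | { readonly kind: "Cons"; readonly head: A; readonly tail: List<A> };

const nil: List<never> = { kind: "Nil" };

// Prepending a B to a List<A> yields a List<A | B>; the tail is shared, not copied.
function cons<A, B>(head: B, tail: List<A>): List<A | B> {
  return { kind: "Cons", head, tail };
}

function toArray<A>(list: List<A>): A[] {
  const out: A[] = [];
  let cur: List<A> = list;
  while (cur.kind === "Cons") {
    out.push(cur.head);
    cur = cur.tail;
  }
  return out;
}

const x = cons(1, nil);      // inferred as List<number>
const y = cons("string", x); // inferred as List<string | number>
```

Other references to `x` keep seeing a `List<number>`, because constructing `y` creates a new head and leaves the existing nodes untouched.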

shelby3 commented 7 years ago

I think when I speak of subsumption and subtyping, you tended to think it was irrelevant (or by not understanding it didn't register), as you admitted to me.

@keean wrote:

I guess I don't understand your solution because you are explaining it using reference to subtyping, like greatest common bound. With parametric types, and polymorphic functions you don't have any of this.

This is what I find confusing, you keep referring to the way languages like Scala handle classes, inheritance, subtyping, and subsumption, whereas I am talking about a type and type class based approach.

So that is probably why some things I wrote didn't register.

It is understandable, because you were approaching this from the presumption of Haskell's lack of subtyping.

Meta: apology if any of my words are coming across as acrimonious, disrespectful, or anything like that. I am trying to rectify it. I am very happy that you are taking on this project. I don't know why my words come out that way, I mean that I can't always word in a way that comes across as building great teamwork. I am wound up in a high amount of "type A" hypertension and worry. I really want to get this perfect. And I am worried (about many things, not just this language, but also this language). I realize it is very easy to fail with design of a programming language. I've had some successes in my life and a string of failures lately. I don't want to fail.

I'm under extreme time pressure.

keean commented 7 years ago
data List<a> = Nil() + Cons(head: a, tail: List<a>)

Because the a is a type parameter, it has to be exactly the same on both sides. If you want subsumption it has to be expressed as a type constraint. For example I think you want:

data List<a> = Nil() + Cons(head: a, tail: List<b>)

Now the tail list and the new list don't have to contain the same type, but the problem is b is not defined, so the above is not a valid type :-(

So to do what you want you have to use an HList type construct:

data Nil = Nil()
data Cons<A, B> = Cons(A, B)
trait List
impl List for Nil
impl List for Cons<A, B> where List<B>

We still need to work out the syntax for type constraints on trait implementations, but if we go with the above, this defines a list you can extend as you wanted, except there is no dynamic (runtime) polymorphism.

Now you can add types to the list as you wanted, except the list must be statically determined at compile time.
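
For illustration, here is a hedged TypeScript sketch of the HList construct. The `HList` alias stands in for `trait List` and its two implementations; `kind`, `nil`, `cons` and `length` are hypothetical names, and TypeScript's structural typing only approximates the trait constraint.

```typescript
// Sketch of: data Nil = Nil() and data Cons<A, B> = Cons(A, B)
interface Nil {
  readonly kind: "Nil";
}
interface Cons<A, B> {
  readonly kind: "Cons";
  readonly head: A;
  readonly tail: B;
}

// Stand-in for `trait List`: Nil is a list, and Cons is a list if its tail is.
type HList = Nil | Cons<unknown, HList>;

const nil: Nil = { kind: "Nil" };

function cons<A, B extends HList>(head: A, tail: B): Cons<A, B> {
  return { kind: "Cons", head, tail };
}

// Works for any HList without knowing the element types.
function length(list: HList): number {
  let n = 0;
  let cur: HList = list;
  while (cur.kind === "Cons") {
    n++;
    cur = cur.tail;
  }
  return n;
}

// The full shape is in the static type: Cons<string, Cons<number, Nil>>
const xs = cons("ABC", cons(123, nil));
const first: string = xs.head;       // no typematch needed
const second: number = xs.tail.head; // position 2 is statically a number
```

Every position has its own statically known type, which is why the shape of such a list has to be fixed at compile time and cannot depend on runtime values.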