Open shelby3 opened 7 years ago
@keean wrote:
It will be interesting to see if we still need trait objects
They are the only way (other than subclassing) I can see to get variance in the polymorphism, i.e. we can assign new types into the trait object (Rust's name for an "existential type" with a typeclass bound) without breaking the invariance of the Array.
The advantage of trait objects compared to subclasses, is that types that can be inserted into a trait object don't have to conform to a pre-existing subclass hierarchy, i.e. it solves some aspect of the Expression Problem¹ but doesn't solve the scenario that unions solve. I am thinking we can view trait objects and unions as complementary in functionality offering different trade-offs on the multi-dimensional design space.
List. Trait objects would also work with an Array which is not an invariant data structure. Apologies, I am not employing academic terminology such as kinds, rank, F-bounded, etc. Someone else can do a better job of translating my solution into that terminology.

@shelby3 I am thinking along the same lines. As far as I understand it there is no subtyping. With type classes, when we say `trait B` extends `trait A` we mean that any implementation of `trait B` for a type requires there to also be an implementation of `trait A`, but that is it.
This applies to trait-bounds (type constraints) and trait-objects. Trait-objects are not really a valid type-system concept; these are more correctly called existential types. In Haskell if we have an existential data type like this:
data Singleton = forall a. (TraitB a) => Singleton a
The `forall a` introduces a new scoped type variable, and because its scope is the container, we cannot ever from outside the container know the type of `a`, but it is constrained to implement `trait B` and that implies it also must implement `trait A`. This means we can call any function from the interfaces `trait A` and `trait B` on the contents of the container. An alternative way of writing the above (not valid Haskell) is:
data Singleton = Singleton (exists a . (TraitB a) => a)
which is where the term 'existential type' comes from. Generally `forall` corresponds to the logical concept of any (but may be none), and `exists` corresponds to the logical concept of some (but at least one).
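For readers who think in Rust, here is a minimal sketch of the same idea using a trait object. The names `TraitA`, `TraitB`, `Apple`, `Crate` and `describe` are made up for illustration and are not from the thread; the only point is that the container hides the concrete type while still letting us call methods from both traits.

```rust
// Hypothetical traits standing in for "trait A" and "trait B".
trait TraitA {
    fn name(&self) -> String;
}

// `TraitB: TraitA` means every implementer of TraitB must also implement TraitA.
trait TraitB: TraitA {
    fn size(&self) -> usize;
}

// Two unrelated concrete types (assumed for the example).
struct Apple;
struct Crate(usize);

impl TraitA for Apple {
    fn name(&self) -> String {
        "apple".to_string()
    }
}
impl TraitB for Apple {
    fn size(&self) -> usize {
        1
    }
}

impl TraitA for Crate {
    fn name(&self) -> String {
        "crate".to_string()
    }
}
impl TraitB for Crate {
    fn size(&self) -> usize {
        self.0
    }
}

// The container hides the concrete type; only the TraitB bound is visible outside.
struct Singleton(Box<dyn TraitB>);

// We can call methods of both TraitB and its supertrait TraitA on the contents.
fn describe(s: &Singleton) -> String {
    format!("{} of size {}", s.0.name(), s.0.size())
}

fn describe_all(items: &[Singleton]) -> Vec<String> {
    items.iter().map(describe).collect()
}
```

A slice of `Singleton` values can hold an `Apple` next to a `Crate` even though the slice's element type never changes, which is the variance being discussed.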
@keean wrote:
I didn't see any mention of type-classes when I looked at Ceylon. It also looks like Ceylon provides classical object inheritance and subtyping, which are both things ZenScript aims to avoid to keep the language small and simple.
Don't we need to be careful about differentiating between subclassing versus subtyping?
ZenScript will have subtyping because it will offer structural unions. An `Int` is a subtype of an `Int|String`. ZenScript will not have subclassing.
Ceylon has anonymous structural unions, but it doesn't have typeclasses. <truthful hyperbole>Also it has that Jurassic, ossifying, rigor mortis, anti-pattern subclassing paradigm, which will infect with that said virus any program written with Ceylon. Ditto every statically typed language that offers subclassing including Java and Scala (because programmers won't be disciplined enough to not use the subclassing virus in their design patterns).</truthful hyperbole> :stuck_out_tongue_winking_eye:
Update:
@naasking wrote:
Ceylon models Java's subtyping via first-class union and intersection types. It's not at all a classical subtyping model.
He replied as quoted above after I wrote the above. Please don't use the word 'subtyping' where you really mean 'subclassing'.
@naasking is apparently unaware of what the programmer can't accomplish with unions and intersections in terms of the Expression Problem if the language does not have the delayed binding of implementation to interface that typeclasses provide.
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass `trait List<A>` to track the type of the elements stored in the list.
Thus we must use a data type to express this:
data List a = Nil | Cons a (List a)
Or add member property names if we don't want to be forced to destructure with pattern matching:
data List a = Nil | Cons { head :: a, tail :: (List a) }
In ZenScript perhaps:
data List<A> = Nil | Cons(A, List<A>)
Or perhaps:
interface List<A>
singleton Nil implements List<Never> // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>
Note afaik, Haskell's `data` type expresses a 'hasA' not an 'isA' subclassing relationship between `Cons` and `List`, and `a` cannot be heterogeneous because Haskell doesn't have a first-class union type (their global inference doesn't allow for it). If we use this syntax in ZenScript, then when instantiating `Cons("string", Cons(1, Nil))` the type will be inferred as `List<Number|String>`. And if we first instantiate `let x = Cons(1, Nil)`, its type will be inferred as `List<Number>`. And if we then instantiate `Cons("string", x)`, the type will be inferred as `List<Number|String>`.
But note even though the above is subtyping, there are no virtual methods on data types, thus no virtual inheritance and no subclassing. Of course any function can input a data type, so that is synonymous with a method, except it isn't virtual dispatch.
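As a rough illustration in Rust (which has no first-class unions, so this only shows the "function over a data type, no virtual dispatch" part), a data-type list and a free function over it might look like the sketch below. The helper names `cons` and `length` are assumptions for the example.

```rust
// A list as a plain data type: one type with two constructors.
#[derive(Debug, Clone, PartialEq)]
enum List<A> {
    Nil,
    Cons(A, Box<List<A>>),
}

// Hypothetical helper that hides the Box allocation.
fn cons<A>(head: A, tail: List<A>) -> List<A> {
    List::Cons(head, Box::new(tail))
}

// A free function taking the data type as input. It is resolved statically
// and destructures by pattern matching; no virtual method is involved.
fn length<A>(list: &List<A>) -> usize {
    match list {
        List::Nil => 0,
        List::Cons(_, tail) => 1 + length(tail),
    }
}
```

For example, `length(&cons(1, cons(2, List::Nil)))` evaluates to 2.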
`Never` and not `Nothing` (which appears to be the choice TypeScript made). Bottom is for example the type of functions that never return (terminate). Whereas, in a lazy, CBN language such as Haskell, Bottom becomes a value, so I argue it should be named `Nothing` and not `Never`.

Or, with types
data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>
As a single type with multiple constructors (tagged union):
data List<A> = Nil | Cons<A, List<A>>
I'm still not sure about which keywords we should be using for trait, impl and data.
@keean wrote:
data Cons<A, B> = Cons<A, B>
data List<A> = Nil | Cons<A, List<A>>
That seems incorrect. It doesn't tell me how to construct an instance of `Cons`. Mine was correct:
data List<A> = Nil | Cons(A, List<A>)
And we can add member property names if we don't want to be forced to employ pattern matching to destructure:
data List<A> = Nil | Cons(head: A, tail: List<A>)
@keean wrote:
data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>
That seems incorrect. I think it should be instead:
data Nil = Nil
data Cons<A> = Cons(A, List<A>) // Edit: `List` is a typo and should be `Cons` per subsequent discussion
trait List
impl List Nil
impl<A> List Cons<A>
Remember the `trait List` should know nothing about `A` if its methods don't specialize on `A`.
@shelby3 wrote:
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass `trait List<A>` to track the type of the elements stored in the list.
Note I'd prefer to write that:
pluggable List
implement Nil for List
implement Cons<A> for List // no need to write the type parameters twice per my
// proposal¹ that all UPPERCASE names are type parameters
Or much better:
pluggable List
Nil implements List
Cons<A> implements List
I think the last is best because it remains very similar to Java, yet we change the meaning of what is being implemented from `interface` ('isA' relationship) to `pluggable` ('hasA' relationship).
Thinking about a typeclass as a `pluggable` interface seems very intuitive. We can't use Rust's `trait` because `trait` has a different meaning in several other languages.
Q: "What is a pluggable API?" A: "It means that you can replace the implementation."
Remember we both decided that clarity trumps brevity (e.g. `implement` instead of `impl`), especially for syntax which is not expression-level (such declarations won't appear often, because what appears most frequently in source code is expressions).
Not quite :-) some things to discuss. You have given value constructors round brackets, that seems okay to me.
data List<A> = Nil | Cons(head: A, tail: List<A>)
Normally the arguments to cons are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:
data List<A> = Nil | Cons(A, List<A>)
data List<A> = Nil | Cons {head: A, tail: List<A>}
We don't have to stick to that but it's how I was thinking.
This has more problems:
data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters
So correcting this:
data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>
Note this is still Rust syntax that gives special treatment to the first type class parameter, and I am not sure that is best, but let's have a different topic for that when we have agreed this.
@keean wrote:
You have given value constructors round brackets, that seems okay to me.
Yeah to differentiate them from type constructors, and because value constructors in the Java-like languages use round brackets (aka parenthetical grouping).
Normally the arguments to `Cons` are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:
You are repeating what I wrote. I even linked to the Haskell record syntax upthread.
However the following is not naming the members of `Cons` (rather is only providing their positions and types), and can only be destructured with pattern matching as I already wrote in my prior comment:
data List<A> = Nil | Cons(A, List<A>)
And to stick with the Java-like syntax (and not mixing in Haskell syntax), I would prefer the following which I think will be much more clear to mainstream programmers coming from popular programming languages:
data List<A> = Nil | Cons(head: A, tail: List<A>)
The following is mixing a JavaScript unnamed `Object` with some new concept of a tag of `Cons`, which has no analogous concept to people using JavaScript or OOP languages (and our guiding principle is not to introduce unnecessary syntax, i.e. the new concept of a `{ ... }` where we don't need to):
data List<A> = Nil | Cons {head: A, tail: List<A>}
@keean wrote:
This has more problems:
data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters
I had a typo and `Cons` does not need two type parameters (two would mess up other things):
data Nil = Nil
data Cons<A> = Cons(A, Cons<A>)
trait List
impl List Nil
impl<A> List Cons<A>
The type of the type parameter `A` in `Cons<A>` will be subsumed to the LUB (least upper bound), i.e. the union, of the two types used to construct a `Cons`. I had already explained that as follows.
@shelby3 wrote:
If we use this syntax in ZenScript then when instantiating a `Cons("string", Cons(1, Nil))` then the type will be inferred `List<Number|String>`. And if we instantiate first a `let x = Cons(1, Nil)` its type will be inferred `List<Number>`. And if we then instantiate `Cons("string", x)` then the type will be inferred `List<Number|String>`.
Note in the above quoted text, I was referring to a data type `List` not a typeclass `List`. Refer to that quoted comment for the declaration I employed there (which differs from the `List` in this comment).
@keean wrote:
So correcting this:
That is still incorrect. You have a type parameter `A` on typeclass `List` which I already explained (and you even agreed!) is incorrect as follows.
@shelby3 wrote:
Remember the `trait List` should know nothing about `A` if its methods don't specialize on `A`.
@shelby3 wrote:
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass `trait List<A>` to track the type of the elements stored in the list.
Follow the link in the above quote to see where you had agreed. In fact, you were the one who explained the issue to me. And now it seems you forget what you explained to me.
Actually the above is revealing a deeper issue to me about higher-kinds which I had realized when I woke up this morning. I am preparing to write about that.
The `{}` have the same use in 'C' for structs, C++ and Java for object definition so they are not new as such.
In C, C++ and Rust we would write:
struct Cons {
head : Int, // this syntax is different for C
}
Which we are writing:
data Cons = Cons {
head : Int
}
Or
data Cons = Cons (
head : Int
)
I am happy with either, providing the field names are optional, but I wanted to point out that the data statement can be viewed as an extension of struct and object definition.
@shelby3 wrote
I had a typo and Cons does not need two type parameters (two would mess up other things):
Yes it does, what you wrote cannot ever end in a Nil.
@shelby3 this version is correct in Rust syntax:
data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>
`Cons` needs a second type parameter because `B` can either be another `Cons` or a `Nil`, which are different types.
The trait `List` needs a type parameter for the type that is in the list, which is not the same as the type which is a member of the class (that is `Nil` or `Cons<A, B>`).
When we add `Cons` to the `List` type class we need to constrain `B` to be in the `List` type class so that you cannot put any random type as the second `Cons` parameter.
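To make the shape of this encoding concrete, here is a sketch in compilable Rust. The `len` method and the `count` helper are additions for the example (the thread's trait has no methods), and the element type parameter `A` on the trait follows the version argued for in this comment, not a settled ZenScript design.

```rust
// Each list shape is its own type: Nil, Cons<A, Nil>, Cons<A, Cons<A, Nil>>, ...
struct Nil;
struct Cons<A, B>(A, B);

// The type class: "this type is a list whose elements have type A".
// `len` is a hypothetical method added so the instances do something observable.
trait List<A> {
    fn len(&self) -> usize;
}

// Nil is a list of any element type.
impl<A> List<A> for Nil {
    fn len(&self) -> usize {
        0
    }
}

// Cons<A, B> is a list of A only if its tail B is also a list of A.
impl<A, B: List<A>> List<A> for Cons<A, B> {
    fn len(&self) -> usize {
        1 + <B as List<A>>::len(&self.1)
    }
}

// A generic function constrained by the type class.
fn count<A, L: List<A>>(list: &L) -> usize {
    list.len()
}
```

With these definitions, `count::<i32, _>(&Cons(1, Cons(2, Nil)))` is accepted, while a tail such as `Cons("a", Nil)` under an `i32` head has no matching instance and is rejected at compile time. That is the constraint on `B` doing its job.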
@shelby3 wrote:
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List to track the type of the elements stored in the list
I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)
@keean wrote:
Yes it does, what you wrote cannot ever end in a Nil.
Yup. Your idea was fundamentally flawed. We can't express a generic `List` type as a typeclass interface. You'd have to implement the `List` for every possible data type you can put into a `List`, which is the antithesis of a generic `List`. I was responding that yours was incorrect and I was demonstrating that if I try to write it correctly, I can't.
That is why I had written (before you commented with the erroneous idea) the correct way to define a generic `List` as follows.
@shelby3 wrote:
In ZenScript perhaps:
data List<A> = Nil | Cons(A, List<A>)
Or perhaps:
interface List<A>
singleton Nil implements List<Never> // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>
@keean wrote:
A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass `trait List` to track the type of the elements stored in the list.
I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)
You weren't wrong the first time. It makes no sense to specialize the `List` on every data type we can add to the `List`.
Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.
@keean wrote:
In C, C++ and Rust we would write:
struct Cons {
head : Int, // this syntax is different for C
}
Seems I recall that in the early days of C, it was only possible to use `typedef` to give a name to a `struct`.
I forget about `struct` because I rarely code in C any more (and C++ I haven't touched since I stopped coding CoolPage in 2002). And when I think about `struct` from C, I don't think in terms of a language with objects and high-order typing concepts, since Java and Scala don't have `struct`. So I guess that is why I didn't relate it. And afaik, `{ ... }` in JavaScript is not tagged with a name, e.g. `Cons`.
@keean wrote:
Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.
I will need to review this, so I can comment meaningfully. Where may I read the most succinct example which shows how I won't have to specialize the typeclass list for every data type I want to put into the list?
I presume a lot of HList boilerplate again?
Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript. If you are expecting mainstream programmers to use HList, I doubt it. But I need to review the examples before I can comment not just from guessing. The link above is I think probably instructive about this.
In Haskell this:
data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)
We can look at Peano numbers as another example:
data Z = Z
data S x = S x
class Nat n
instance Nat Z
instance (Nat a) => Nat (S a)
Note the main syntactic difference between Haskell type classes and Rust traits is that a Rust trait has a concept of 'self' but Haskell does not. You can liken this to function syntax:
x.f(y) // object special syntax like Rust
f(x, y) // all parameters equal, better for multiple dispatch
Likewise with type classes Rust makes the first type parameter special, so the Peano numbers above become:
struct Z;
struct S<A>(A); // Rust tuple struct syntax
trait Nat {} // note no type parameter
impl Nat for Z {}
impl<A: Nat> Nat for S<A> {}
@shelby3 wrote:
You weren't wrong the first time. It makes no sense to specialize the `List` on every data type we can add to the `List`.
I presume a lot of HList boilerplate again?
Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript.
I expect you are taking what should be an orthogonal concept of a generic list and binding it to the data type in the list, and then using some boilerplate scaffolding to simulate genericity? This appears to be the basic theme of HList concepts as far as I can discern thus far (I may be wrong?), to sidestep a weakness in the type system and simulate type system in code with scaffolding?
I had started to sniff a problem yesterday. I was starting to realize we probably have an unsolved problem in the design incorporating first-class anonymous structural unions.
@keean wrote:
In Haskell this:
data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)
We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.
Thus afaik in the above `b` will also be the same as `a | Nothing`, which is just `a`, where `Nothing` is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language¹).
So afaics, that is not specializing the `List` typeclass for every data type `a` that can be put into the `List`, because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will ever be only two implementations (aka `instance`s) of `List`: `Nil` and `List (Cons a b)`, where `b` is `(List a) | Nil` and `Nil` is `List Nothing`.
I presume the same for Rust, but bottom type instead of top.
But for ZenScript we are proposing to support heterogeneous lists, so I am trying to figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.
I suppose you are saying we can simulate heterogeneous lists with HList concepts, but the point of the first-class union was to eliminate that boilerplate and make everything more extensible as I had attempted to explain/discuss at the Rust forum:
I presume a lot of HList boilerplate again?
It is possible you didn't realize how extensively I wanted to use the first-class unions. Perhaps you were thinking we'd be using HList concepts instead?
We have design quagmire now. I am trying to get my mind wrapped around it. I am suspecting we have failure now in my concept, but I need to confirm by understanding this design quagmire fully.
Something I published at the now defunct copute.com in 2011 when I was teaching myself some type theory (can still be found on archive.org):
Inductive and coinductive types are categorical duals (if they produce the same objects in reversed partial-order), because inductive and coinductive category morphism functions have reversed directions[8]. The initial fixedpoint must be the least in the partial-order, thus inductive types have objects which are the output of a unique morphism function (i.e. the algebra recursively) that inputs the initial fixedpoint. Dually, the final object must be the greatest in the partial-order, thus the coinductive types have objects which are the side-effect "output" of a unique morphism function (i.e. the coalgebra recursively), which terminates with the final object when an object of the type is destructed.
Since Monad and Comonad are categorical duals, they compose on outputs or inputs respectively.
[8] Declarative Continuations and Categorical Duality, Filinski, section 1.3.1 Basic definitions.
Even more edits to my prior comment. I am suspecting potential failure of my design concept. :hurtrealbad: :sob:
@keean wrote:
We can look at Peano numbers as another example:
Haskell doesn't have any inductive types, thus it doesn't have (the type of) Peano numbers.
We have to be careful when using Haskell's coinductive call-by-name type system where laziness-by-default makes non-termination a value, as a model for an inductive type system with eager evaluation strategy by default that adds first-class unions. Many aspects appear to change.
A couple of things.
First I think we should support multi-parameter type classes (with type-families, aka associated types) in full. People do not have to use the full power of this, but I don't want a false ceiling to the abstractions we can build.
Haskell has iso-recursive types (not equi-recursive) so it does have a kind of inductive type, thus Peano numbers work in Haskell :-) I can go on to define addition, subtraction, full arithmetic in the type system. Using polymorphic recursion you can even convert a value to a type for faked dependent types, but I don't think we should support this... That is why we are using parametric types not universally quantified types.
So in our system those Haskell type Peano numbers have to be statically determined (effectively known at compile time). They cannot support runtime polymorphism without combining with existential types.
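A small Rust sketch of what "statically determined" means here: the number lives entirely in the type, and the trait instances are resolved at compile time. The associated constant `VALUE` and the helper `value_of` are assumptions added for the example so that there is something to observe at runtime.

```rust
use std::marker::PhantomData;

// Type-level Peano numbers: zero, and successor of N.
struct Z;
struct S<N>(PhantomData<N>);

// The type class of natural numbers, with a hypothetical associated constant
// so each type-level number can report its value.
trait Nat {
    const VALUE: u32;
}

impl Nat for Z {
    const VALUE: u32 = 0;
}

// S<N> is a natural number only if N is one.
impl<N: Nat> Nat for S<N> {
    const VALUE: u32 = N::VALUE + 1;
}

// The number is chosen by a type argument, so it is known at compile time.
fn value_of<N: Nat>() -> u32 {
    N::VALUE
}
```

Here `value_of::<S<S<S<Z>>>>()` is 3. Picking the number from a runtime value would need something like a trait object, which matches the remark about existential types.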
We are discussing different ways to do things, first-class union types give us:
type List<a> // a partial declaration which we need to tie the knot
data Nil = Nil
data Cons<a> = Cons(head: a, tail: List<a>)
type List<a> = Nil | Cons<a> // tie the knot, note the RHS are types in a 'type' declaration
Regarding laziness, nothing above changes with regard to the type system; all the Haskell types can be annotated with strictness annotations to make them strict. The type system has to be evaluated at compile time. Non-termination of typing means the program won't compile; it has nothing to do with laziness at runtime. All the examples I have given work fine in Rust, which is eager.
@keean wrote:
all the Haskell types can be annotated with strictness annotations to make them strict.
Strictness annotations don't remove Bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language. Bottom is a value (and populated in every type) in Haskell. Whereas, in an inductive language, Bottom is never instantiated.
@shelby3 wrote:
Strictness annotations don't remove bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language.
It doesn't matter. The types are valid in both lazy and eager languages. In Haskell every type contains 'bottom', and in Rust and eager languages they do not. Everything else is the same.
The other point is I think it is worth avoiding objects altogether; the recursive typing from 'Self' makes things really complicated to understand (see Scala). I think if you have records ('C' structs) and type classes, you won't miss objects at all.
Edit: Of course if we decide to have objects that's fine, but I would like to make the case for not having them, which I think works better with multi-parameter dispatch. Modules take over the namespacing part of objects. New issue here: https://github.com/keean/zenscript/issues/9
@keean wrote:
It doesn't matter.
Haskell doesn't have disjunctive coproducts aka categorical sums.
I wrote at my copute.com circa 2011:
Trade-offs
CBV and CBN are categorical duals[10] (see also), because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then CBV evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, CBN evaluates from the trunk down to the branch tips. CBV doesn't have conjunctive products ("and", a/k/a categorical "products") and CBN doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[9].
Non-termination
At compile-time, functions can't be guaranteed to terminate.
Eager
With CBV but not CBN, for the conjunction of `Head` "and" `Tail`, if either `Head` or `Tail` doesn't terminate, then respectively either `List( Head(), Tail() ).tail == Tail()` or `List( Head(), Tail() ).head == Head()` is not true because the left-side doesn't, and right-side does, terminate. Whereas, with CBN both sides terminate. Thus CBV is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.
Lazy
With CBN but not CBV, for the disjunction of `1` "or" `2`, if f doesn't terminate, then `List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail` is not true because the left-side does, and right-side doesn't, terminate. Whereas, with CBV both sides non-terminate so the equality test is never reached. Thus CBN is too lazy with disjunctive coproducts, and in those cases non-terminates (including runtime exceptions) after doing more work than CBV would have.
[9] Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.
[10] Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.
So we were supposed to be discussing subtyping here. If we don't stick to topic it will make it hard to find discussions in the future. I think we can discuss type-class syntax elsewhere.
I definitely want eager evaluation. Pervasive laziness is a big pessimisation. Some kind of co-routines or yield would provide stream-like functionality.
What are our conclusions about subtyping?
How will we implement a heterogeneous list that has an element type of a first-class anonymous structural union?
If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.
The answer seems to impact how I will think about how subtyping interacts with our unions and typeclasses.
data List = Nil() | Cons(Int | String, List)
However, this is using | to union types and create sum types, which I think is confusing. Some alternatives with distinction between the two:
data List = Nil() | Cons(Int \/ String, List)
or
data List = Nil() + Cons(Int | String, List)
Of course if it's parametric on element:
data List<a> = Nil() + Cons(a, List<a>)
and that can be instantiated with a union type.
@keean wrote:
data List = Nil() | Cons(Int | String, List)
Not generic. Breaks other externalities dealing with extension.
data List<a> = Nil() + Cons(a, List<a>)
Explain how to use this per other requirements I stated.
@shelby3 wrote:
If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.
Another possibility:
data List = Nil() | Cons([Int, String], List)
@keean wrote
data List = Nil() | Cons([Int, String], List)
As you requested not to do, are we going to continue mixing off-topic choice-of-preferred-syntax discussions in a conceptual Issue #8 about interaction of type system features?
As you requested not to do, are we going to continue mixing syntax discussions in a conceptual Issue about interaction of type system features?
I think it's better to try and stick to subtyping here... as long as we both understand the notation we are using.
In some regards we can imagine some boilerplate like this:
data UnionIntFloatString = I(Int) | F(Float) | S(String)
Really we just want to allow the type system to infer the above for various unions automatically. 'I', 'F' and 'S' are the runtime type tags that you can case match on.
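In a language without first-class unions, such as Rust, that boilerplate has to be written by hand. A sketch, where the `From` impls and the `tag_name` function are assumptions added to show the injection into the union and the case match on the runtime tag:

```rust
// Hand-written stand-in for the union Int | Float | String.
// I, F and S are the runtime tags.
#[derive(Debug, Clone, PartialEq)]
enum UnionIntFloatString {
    I(i64),
    F(f64),
    S(String),
}

// Injections into the union; a compiler with first-class unions would
// insert the equivalent of these automatically.
impl From<i64> for UnionIntFloatString {
    fn from(v: i64) -> Self {
        UnionIntFloatString::I(v)
    }
}
impl From<f64> for UnionIntFloatString {
    fn from(v: f64) -> Self {
        UnionIntFloatString::F(v)
    }
}
impl From<String> for UnionIntFloatString {
    fn from(v: String) -> Self {
        UnionIntFloatString::S(v)
    }
}

// Case matching on the tag recovers which member of the union we hold.
fn tag_name(v: &UnionIntFloatString) -> &'static str {
    match v {
        UnionIntFloatString::I(_) => "Int",
        UnionIntFloatString::F(_) => "Float",
        UnionIntFloatString::S(_) => "String",
    }
}
```

The enum must be declared once per combination of types, which is exactly the boilerplate inference of unions would remove.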
@shelby3 wrote:
data List<a> = Nil() + Cons(a, List<a>)
Explain how to use this per other requirements I stated.
@shelby3 wrote:
If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.
Btw, I think the solution is going to require higher-kinds. I will open a new Issue on Higher-kinds when I return from jogging. I already composed some of the OP for that new Issue.
No need for higher kinds yet.
data List<a> = Nil() + Cons(head: a, tail: List<a>) // using + for sum types
append_it = (list, x) =>
    Cons(head: if x then "ABC" else 123, tail: list)
print_it = (list) =>
    typematch list.head:
        String(s) -> print_string(s)
        Int(i) -> print_int(i)
    if list.tail /= Nil():
        print_it(list.tail)
Inferred types
append_it(list : List<String | Int>, x : Bool) : List<String | Int>
print_it(list : List<String | Int>)
Inferring the type of `Nil()` is tricky, but not impossible; it effectively is typed as `List<a>` where `a` is a floating type variable (un-grounded). Later in the program `a` needs to get grounded at some point, as to have an un-grounded type variable in a program is a type-checking error.
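For comparison, roughly the same program in Rust, with the union written out by hand as an enum. `StringOrInt` and `render_it` are names invented for this sketch; `render_it` returns the lines instead of printing them so the result can be checked, and it also handles the empty list.

```rust
// Hand-written stand-in for the inferred union String | Int.
#[derive(Debug, Clone, PartialEq)]
enum StringOrInt {
    Str(String),
    Int(i64),
}

#[derive(Debug, Clone, PartialEq)]
enum List<A> {
    Nil,
    Cons { head: A, tail: Box<List<A>> },
}

// Prepends either "ABC" or 123, so the element type must be the union.
fn append_it(list: List<StringOrInt>, x: bool) -> List<StringOrInt> {
    let head = if x {
        StringOrInt::Str("ABC".to_string())
    } else {
        StringOrInt::Int(123)
    };
    List::Cons { head, tail: Box::new(list) }
}

// Walks the list and matches on the tag of each element.
fn render_it(list: &List<StringOrInt>) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = list;
    while let List::Cons { head, tail } = cur {
        out.push(match head {
            StringOrInt::Str(s) => format!("string {}", s),
            StringOrInt::Int(i) => format!("int {}", i),
        });
        cur = &**tail;
    }
    out
}
```

Note that here the `Nil` passed to `append_it` gets its element type from the function signature, which is the "grounding" of the floating type variable.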
Generic sort typeclass which works for any container type that implements it?
quicksort<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
    where Cmp<ValueType<A>>, IndexedIterator<A> =>
    if lo < hi then
        p = partition(c, lo, hi)
        quicksort(c, lo, p)
        quicksort(c, p + 1, hi)
partition<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
    where Cmp<ValueType<A>>, IndexedIterator<A> =>
    pivot := c[lo]
    i = lo - 1
    j = hi + 1
    while true:
        do:
            i := i + 1
        while c[i] < pivot
        do:
            j := j - 1
        while c[j] > pivot
        if i >= j :
            return j
        swap(c[i], c[j])
Note the associated types `ValueType<A>`. I am not certain of the syntax, but a value type is like an 'output' type from a type-class which is defined by a particular instance.
The key thing is the comparison operators, which would be type-class operators. Something like:
trait Cmp<A>:
    `<`<A>(x : A, y : A) : Bool
    `>`<A>(x : A, y : A) : Bool
This would have to be defined for whatever the contents of the list were, so we would need some kind of instance like:
impl Cmp<Int | String>:
...
We need an implementation for `Int | String` as we need to define the relative ordering of both types in one dimension.
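Here is a hedged Rust sketch of the same Hoare-partition quicksort over a union element type. It swaps in Rust's standard `PartialOrd` for the thread's `Cmp`, a slice for `IndexedIterator`, and a hand-written enum `IntOrStr` for `Int | String`; `sort_all` is a helper added for the example. The derived ordering puts every `I` before every `S`, which is one possible answer to "relative ordering of both types in one dimension".

```rust
// Stand-in for Int | String. The derived PartialOrd orders by variant first
// (I before S), then by the contained value.
#[derive(Debug, Clone, PartialEq, PartialOrd)]
enum IntOrStr {
    I(i64),
    S(String),
}

// Hoare partition with the first element as pivot. Starting i at lo and j at hi
// avoids the unsigned underflow that `lo - 1` would cause for lo == 0.
fn partition<T: PartialOrd + Clone>(c: &mut [T], lo: usize, hi: usize) -> usize {
    let pivot = c[lo].clone();
    let mut i = lo;
    let mut j = hi;
    loop {
        while c[i] < pivot {
            i += 1;
        }
        while c[j] > pivot {
            j -= 1;
        }
        if i >= j {
            return j;
        }
        c.swap(i, j);
        i += 1;
        j -= 1;
    }
}

fn quicksort<T: PartialOrd + Clone>(c: &mut [T], lo: usize, hi: usize) {
    if lo < hi {
        let p = partition(c, lo, hi);
        quicksort(c, lo, p);
        quicksort(c, p + 1, hi);
    }
}

// Hypothetical convenience wrapper that handles empty and one-element input.
fn sort_all<T: PartialOrd + Clone>(c: &mut [T]) {
    if c.len() > 1 {
        let hi = c.len() - 1;
        quicksort(c, 0, hi);
    }
}
```

The sort itself never inspects the tags; only the ordering instance for the element type does, so the same `quicksort` works for `i64`, `String`, or the union.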
As a follow up, "Elements of Programming" gives the following type for sort_n (page 207) which is a little better thought through than mine:
sort_n<I, R>(f : I, n : DistanceType<I>, r : R)
where Mutable<I>, ForwardIterator<I>, Relation<R>, ValueType<I> == Domain<R> =>
Which allows the comparison function `r` to be passed into sort.
This requires a type-equality operator, which is in effect an infix type-class, and could be written `TypeEq<ValueType<I>, Domain<R>>`; the definition is straightforward:
trait TypeEq<X, Y>
impl<X> TypeEq<X, X>
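In actual Rust syntax the first parameter becomes the implementing type, so a sketch of the same definition looks like this. `require_same` is a hypothetical function added only to show the constraint being used.

```rust
// "X is the same type as Y", with X in the Self position.
trait TypeEq<Y> {}

// The only instance: every type is equal to itself.
impl<X> TypeEq<X> for X {}

// Accepts two references only when their types are equal.
fn require_same<A: TypeEq<B>, B>(_a: &A, _b: &B) -> bool {
    true
}
```

A call such as `require_same(&1i32, &2i32)` compiles, whereas `require_same(&1i32, &"x")` fails type checking because no instance relates `i32` to `&str`.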
Ah, what you first wrote 2 hours ago (with `A<B>` which you've replaced) was very incomplete, i.e. you didn't define `partition` nor the `<` operator on the generic types. So I reloaded the page and saw you've gone off on a similar thought process that I've been going through. I will review your code now. I was off on a music tangent for the past 2 hours. I spontaneously needed a morsel-sized respite from the compsci stuff.
Indeed it brings back to focus the discussions we had at the Rust forum about iterators and whether higher-kinds (or just a self type) are needed.
I don't see where swap(c[i], c[j]) is input? Also this is an invariant List (required by subsuming heterogeneous unions, remember), so you can't implement a swap, because if mutation were possible it would break other references into the various Cons cells of the singly-linked list.
Soon you will discover why I said I think we need higher-kinds. Hint: we need a factory.
@shelby3 wrote:
I don't see where swap(c[i], c[j]) is input
swap is a generic function that swaps two values. In this case array lookup returns a reference, so swap is exchanging two values of whatever type the collection holds. It is a top-level function defined like this:
swap<A>(x : A, y : A) where Mutable<A> =>
    tmp = read(x)
    write(x, y)
    write(y, tmp)
Note it is defined on Mutable. In my code above just consider that IndexedIterator extends Mutable, for simplicity.
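A minimal Rust sketch of the same swap, assuming that a mutable reference plays the role of Mutable<A>. The name swap_values and the Clone bound (which mirrors read by copying the value out) are assumptions of the sketch.

```rust
// Exchange the values behind two mutable references.
fn swap_values<A: Clone>(x: &mut A, y: &mut A) {
    let tmp = x.clone(); // tmp = read(x)
    *x = y.clone();      // write(x, read(y))
    *y = tmp;            // write(y, tmp)
}
```

The standard library already offers std::mem::swap, which does this without the Clone bound. For two elements of the same container, Rust will not hand out two mutable references at once, so one would call the slice method swap(i, j) instead.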
For a heterogeneous array the values would both be of type String | Int, for example (every value in the array must have the same type, but that type can be a union).
There are no singly linked lists in my code, an IndexedIterator implies the values in the container are individually addressable by an index.
However it would not make any difference if it was a linked list, as the whole list is constrained by a single type bound:
data List<a> = Nil() + Cons(head: a, tail: List<a>)
This list is monomorphic: every element has type a (and every tail has type List<a>), so there is no problem swapping values because they all have the same type a.
If you want to know whether any given value is Int or String you would have to typematch on the type.
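As a sketch of what that type match could look like, here is the List definition above in Rust, with a tagged enum (the hypothetical Value) standing in for the union Int | String. The helpers describe and describe_all are illustrative names only.

```rust
// Stand-in for the union `Int | String` (hypothetical name).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

// data List<a> = Nil() + Cons(head: a, tail: List<a>)
#[derive(Debug)]
enum List<A> {
    Nil,
    Cons(A, Box<List<A>>),
}

// Every element has the single type Value; a match recovers which
// member of the union a given element actually is.
fn describe(v: &Value) -> String {
    match v {
        Value::Int(n) => format!("Int {}", n),
        Value::Str(s) => format!("String {}", s),
    }
}

fn describe_all(list: &List<Value>) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = list;
    while let List::Cons(head, tail) = cur {
        out.push(describe(head));
        cur = &**tail; // step from &Box<List<Value>> to &List<Value>
    }
    out
}
```

The list itself stays monomorphic here: the element type is fixed when the list type is written down, which is exactly the point under dispute in the replies that follow.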
@keean wrote:
For a heterogeneous array
Can't have subsuming heterogeneous unions on a variant data structure. I had mentioned this numerous times in my explanation of my solution to the Expression Problem, but I hadn't emphasized the "subsumption" aspect until now (although I did mention it in passing and also during our discussions on the Rust forum).
I presume you are forgetting that with a List we can add new types to the disjunction (aka union) at the head which are not in the tail of the list, which the array data structure can't allow (we can't mutate the type of an instance of an array, but we can mutate the type of a new Cons head, which I had explained to you in some comment over the past days). I have all the design concepts of my union and Expression Problem solution loaded up in my head, so I am aware of all these factors.
I raised the challenge of a generic implementation that by implication will work on invariant lists (given our unions would require them), which afaics your code above does not accomplish.
I am thinking we will need a factory to accomplish it, thus higher-kinds for the genericity. I am thinking of a Monoid typeclass.
Well, you can have a heterogeneous list, and you can swap the elements, as all the elements have the same type; the type definition clearly states this:
data List<a> = Nil() + Cons(head: a, tail: List<a>)
Note how the tail has type List<a>, the same as the left-hand side. So the a is the same everywhere in the list.
The a can be a union type, and if you append any type to the list that type would be part of the single union type for the whole list. The compiler would have to analyse the code and gather all the possible types that could be put in the list and make a the union of all types that get added to the list.
That's what the type signature says. If you want different behaviour, you will need a different datatype.
@keean wrote:
as all the elements have the same type, it clearly states this in the type definition:
data List<a> = Nil() + Cons(head: a, tail: List<a>)
No it doesn't unless you presume Haskell's lack of subsumption.
I had already twice provided an example of augmenting the union type for the head when we construct a Cons.
@shelby3 wrote:
The type of the type parameter A in Cons<A> will be subsumed to the GLB of the union of two types used to construct a Cons. I had already explained that as follows.
@shelby3 wrote:
If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.
Note in the above quoted text, I was referring to a data type List not a typeclass List. Refer to that quoted comment for the declaration I employed there (which differs from the List in this comment).
Note Number is a subtype of Number|String.
You are apparently still thinking in terms of Haskell and its inability to subsume to a first-class union, which doesn't apply as I had already explained upthread.
@shelby3 wrote:
@keean wrote:
In Haskell this:
data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)
We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.
Thus afaik in the above b will also be the same as a | Nothing which is just a where Nothing is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language¹).
So afaics, that is not specializing the List typeclass for every data type a that can be put into the List because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will ever be only two implementations (aka instances) of List: Nil and List (Cons a b) where b is (List a) | Nil and Nil is List Nothing.
I presume the same for Rust, but bottom type instead of top.
But for ZenScript we are proposing to support heterogeneous lists, so I am trying figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.
To clarify (and correct) the quoted text, Haskell doesn't allow subsumption to a common supertype (aka GLB). There is no subtyping in Haskell; the entire type system is upside down coinductive. Whereas, ZenScript is proposing to support subsumption to the supertype union. And this means that appending an element with a different type requires that data structures be invariant, which means an array will not type check. But a list will. Btw, I had explained all of this to you at the Rust forum, and I am remembering now. But it hadn't clicked for you yet, so I was probably speaking gibberish or Klingon.
I think when I spoke of subsumption and subtyping, you tended to think it was irrelevant (or, by not understanding, it didn't register), as you admitted to me.
@keean wrote:
I guess I don't understand your solution because you are explaining it using reference to subtyping, like greatest common bound. With parametric types, and polymorphic functions you don't have any of this.
This is what I find confusing, you keep referring to the way languages like Scala handle classes, inheritance, subtyping, and subsumption, whereas I am talking about a type and type class based approach.
So that is probably why some things I wrote didn't register.
It is understandable, because you were approaching this from the presumption of Haskell's lack of subtyping.
Meta: apologies if any of my words are coming across as acrimonious, disrespectful, or anything like that. I am trying to rectify it. I am very happy that you are taking on this project. I don't know why my words come out that way; I mean that I can't always word things in a way that comes across as building great teamwork. I am wound up in a high amount of "type A" hypertension and worry. I really want to get this perfect. And I am worried (about many things, not just this language, but also this language). I realize it is very easy to fail with the design of a programming language. I've had some successes in my life and a string of failures lately. I don't want to fail.
I'm under extreme time pressure.
data List<a> = Nil() + Cons(head: a, tail: List<a>)
because the a is a type parameter, it has to be exactly the same on both sides. If you want subsumption it has to be expressed as a type constraint. For example I think you want:
data List<a> = Nil() + Cons(head: a, tail: List<b>)
Now the tail list and the new list don't have to contain the same type, but the problem is b is not defined, so the above is not a valid type :-(
So to do what you want you have to use an HList type construct:
data Nil = Nil()
data Cons<A, B> = Cons(A, B)
trait List
impl List for Nil
impl List for Cons<A, B> where List<B>
We still need to work out the syntax for type constraints on trait implementations, but if we go with the above, this defines a list you can extend as you wanted, except there is no dynamic (runtime) polymorphism.
Now you can add types to the list as you wanted, except the list must be statically determined at compile time.
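The same HList construction can be sketched in Rust roughly as follows. The len method is a hypothetical addition, included only so the List trait has something observable; the bound on the second impl corresponds to the "where List<B>" constraint above.

```rust
// Two separate types instead of one recursive data type.
struct Nil;
struct Cons<A, B>(A, B);

// The "is a list" property is a trait on the whole structure.
trait List {
    // Hypothetical method: number of elements, resolved statically.
    fn len(&self) -> usize;
}

impl List for Nil {
    fn len(&self) -> usize {
        0
    }
}

// A Cons is a List only if its tail is a List.
impl<A, B: List> List for Cons<A, B> {
    fn len(&self) -> usize {
        1 + self.1.len()
    }
}
```

Here Cons("string", Cons(1, Nil)) has the type Cons<&str, Cons<i32, Nil>>, so each element keeps its own type, but the full shape of the list is part of its static type and cannot vary at runtime.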
Union types express a subtyping relationship, but I am unclear as to whether typeclasses (i.e. Rust's traits) do?
If a trait B extends another trait A and B reuses the implementations of A, can we assign a trait object that has a bound B to a trait object that has a bound A?
Seems the answer based on prior discussion is yes. But that is a subtyping relationship, which means we would need to deal with covariance on type parameters both when they are trait objects and when they are unions. Correct?
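To make the question concrete, here is a small Rust sketch under my own assumptions. The names Widget, as_a, name_via_a and name_via_b are all hypothetical, and the explicit as_a method is just one conservative way to obtain an A-bounded trait object from a B-bounded one.

```rust
trait A {
    fn name(&self) -> String;
}

// "trait B extends trait A": any implementor of B must also implement A.
trait B: A {
    fn id(&self) -> u32;
    // Explicit conversion from a B-bounded object to an A-bounded one
    // (hypothetical helper for this sketch).
    fn as_a(&self) -> &dyn A;
}

struct Widget;

impl A for Widget {
    fn name(&self) -> String {
        "widget".to_string()
    }
}

impl B for Widget {
    fn id(&self) -> u32 {
        7
    }
    fn as_a(&self) -> &dyn A {
        self
    }
}

// Accepts any trait object bounded by A.
fn name_via_a(x: &dyn A) -> String {
    x.name()
}

// A's methods are callable through a B-bounded trait object.
fn name_via_b(x: &dyn B) -> String {
    x.name()
}
```

So a B-bounded object can be used wherever only A's interface is needed, which is the subtyping-like behaviour being asked about. Whether ZenScript should make that conversion implicit is the open design question.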
Prior discussion: https://github.com/keean/zenscript/issues/6#issuecomment-248711828 https://github.com/keean/zenscript/issues/1#issuecomment-248113585 https://github.com/keean/traitscript/issues/2#issuecomment-248021713 https://github.com/keean/zenscript/issues/1#issuecomment-248754649