keean / zenscript

A trait based language that compiles to JavaScript
MIT License
42 stars 7 forks source link

Orthogonal concepts #14

Open shelby3 opened 8 years ago

shelby3 commented 8 years ago

I am trying to nail down in my mind, the fundamental orthogonal semantic concepts of our proposed programming language.

Concept Description
data Unified sum, product, recursive, and record data types. No early bound operations other than construction and access. Record types may optionally be mutable.
typeclass Delayed binding of operations to types at the use-site. Contrast to OOP (subclassing) which binds operations at construction of objects. For polymorphic types, typeclass objects delays binding operations until construction of the typeclass object (instead of prematurely binding operations at the construction of an instance of the implementing type); whereas, my posited and proposed solution to the Expression Problem employing unions, in theory further delays binding to the use-site for polymorphic types. Note that delaying is a positive attribute in this context, because it increases degrees-of-freedom.
module Encapsulating data with compile-time (statically) bound operations and access control, enabling more sophisticated types.
monadic effect system https://github.com/keean/zenscript/issues/4#issuecomment-248569044

Thus I am wondering if module is the most apt name for the concept? I think of a module as a unit of code that is imported separately. We still need to import data and typeclass, so is this done by files? Isn't each file then a module? If so, what is the apt name for the concept of module above? I am thinking the above concept for module is really a class (without the subclassing inheritance anti-pattern). Can't modules extend other modules?

Edit: note also OCaml functors and modules.

Edit#2: on typeclasses, I replied to @keean:

Optional implicit parameters (which Scala has) provide a way of unifying runtime interfaces (passed by value) and monomorphisable interfaces (passed implicitly by type).

I like that summary (in the context of what I had written). We should make sure we frame that and use it in our documentation.

Edit#3: @shelby3 wrote:

I think it is important to remember that we want to keep implementation of interface (i.e. typeclass) orthogonal to instantiation of data, otherwise we end up with the early binding of subclassing. For example, in subclassing both a rectangle and a square would be subclasses of an interface which provides width and height. Yet a square only has one dimension, not two. Any interfaces which need to operate on the dimensions of rectange and square need to be customized for each, which is what typeclasses do. We will still have subtyping such as when an interface extends another interface then we can subsume to the supertype (which is why we need read-only references and for supersuming then write-only references). Yet references will not have the type of an interface if we don't provide typeclass objects, so then the only case of subtyping will be the subset relationships created by conjunctions and disjunctions of data types.

keean commented 8 years ago

Afaik I know that is limitation of Haskell.

No, it is a limitation of "type classes".

shelby3 commented 8 years ago

Feel free to explain why you think so. Otherwise I say BS.

keean commented 8 years ago

Feel free to explain why you think the syntax I wrote won't work.

I never said it would not work, but they are not type-classes.

keean commented 8 years ago

What you have is good old Java style OO interfaces with extension, and objects.

shelby3 commented 8 years ago

Lol, what does that mean? Show me such a definition in a canonical resource. The reason typeclasses in Haskell ended up with that limitation is because Haskell prefers global principle typings.

keean commented 8 years ago

Erm, type-classes have a widely known definition. Even Wikipedia seems vaguely accurate:

https://en.wikipedia.org/wiki/Type_class

shelby3 commented 8 years ago

Erm, type-classes have a widely known definition. Even Wikipedia seems vaguely accurate:

https://en.wikipedia.org/wiki/Type_class

Wikipedia copied Haskell. Again this is a limitation of Haskell's choice of priorities for global principle typings. Open your mind a bit.

keean commented 8 years ago

Wikipedia copied Haskell. Again this is a limitation of Haskell's choice of priorities for global principle typings. Open your mind a bit.

Not surprising seeing as Type Classes were "invented" to solve operator overloading in Haskell, see: http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps

keean commented 8 years ago

I don't see whats so difficult. Type Classes depend on "types", its in the name "Type Classes", not "Value Classes" or "Widget Classes" :-)

You want the selected function to depend on a value (yes you really do), so the datatype solution is the right one.

Just forget type-classes exist and tell me what is wrong with this:

data LessThan<A> = LessThan {
     lessThan(A, A) : Bool
}

let forwardOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x < y
}

let reverseOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x > y
}

f(0, 1, forwardOrder)
f(0, 1, reverseOrder)
shelby3 commented 8 years ago

What you have is good old Java style OO interfaces with extension, and objects.

Incorrect. A Java interface has to be bound to the data type when it is instantiated.

Where is this extension feature in Java? And even if you emulated with static methods, you don't get monomorphization.

shelby3 commented 8 years ago

Wikipedia copied Haskell. Again this is a limitation of Haskell's choice of priorities for global principle typings. Open your mind a bit.

Not surprising seeing as Type Classes were "invented" to solve operator overloading in Haskell, see: http://homepages.inf.ed.ac.uk/wadler/papers/class/class.ps

Not surprising has nothing to do with the fact that Haskell's limitation is not an inherent limitation of the concept of type classes.

keean commented 8 years ago

Okay, to avoid arguing definitions, lets call them "Shelby-Classes", then we can avoid the whole problem of you not getting what a type-class is.

keean commented 8 years ago

Although that's probably not necessary, as what you have written looks exactly like Java interfaces...

keean commented 8 years ago

Here's some C# that does this: https://msdn.microsoft.com/en-us//library/bb383977.aspx

shelby3 commented 8 years ago

@keean wrote:

Just forget type-classes exist and tell me what is wrong with this:

data LessThan<A> = LessThan {
     lessThan(A, A) : Bool
}

Playing with irrelevant names won't help you. Name lessThan to the semantics I intended which is relativeOrdering.

shelby3 commented 8 years ago

@keean wrote:

Okay, to avoid arguing definitions, lets call them "Shelby-Classes", then we can avoid the whole problem of you not getting what a type-class is.

You mean of you not getting the generality of the concept of a type class because you've known Haskell too closely for too long and can only think one way.

keean commented 8 years ago

Playing with irrelevant names won't help you. Name lessThan to the semantics I intended which is relativeOrdering.

Sorry I don't understand... what do you mean? Here are definitions of two different orderings:

let forwardOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x < y
}

let reverseOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x > y
}
keean commented 8 years ago

You mean of you not getting the generality of the concept of a type class.

Shall I put a post on LtU to resolve the matter?

shelby3 commented 8 years ago

@keean wrote:

Sorry I don't understand... what do you mean? Here are definitions of two different orderings:

let forwardOrder : RelativeOrdering = RelativeOrdering {
   relativeOrdering(x, y) = x < y
}

let reverseOrder : RelativeOrdering = RelativeOrdering {
   relativeOrdering(x, y) = x > y
}
shelby3 commented 8 years ago

@keean wrote:

Shall I put a post on LtU to resolve the matter?

Taking the discussion to where I am banned is going to insure I can't say anything. Excellent idea.

shelby3 commented 8 years ago

@keean wrote:

as what you have written looks exactly like Java interfaces...

Nope. I didn't bind the selection of the implementation to the data instances when they were constructed.

keean commented 8 years ago

regarding 'RelativeOrdering', what difference does changing the name of the datatype make?

keean commented 8 years ago

Nope. I didn't bind the selection of the implementation to the data instances when they were constructed.

Yes its more like C# extension methods, see the link I posted above.

shelby3 commented 8 years ago

@keean wrote:

what difference does changing the name of the datatype make?

Exactly. As for you conflating the interface with the data type, I didn't have that in my example. So if that was part of your point, then you don't have one.

keean commented 8 years ago

My point was the code does what you want, it lets you select the ordering produced by f.

shelby3 commented 8 years ago

@keean wrote:

Yes its more like C# extension methods

As well as Swift extension methods, which can simulate type classes. So they have type classes as others have figured out.

keean commented 8 years ago

Exactly. As for you conflating the interface with the data type, I didn't have that in my example. So if that was part of your point, then you don't have one.

The data type is the interface, the value is the implementation. Looks good to me :-)

shelby3 commented 8 years ago

@keean wrote:

My point was the code does what you want, it lets you select the ordering produced by f.

No it does not. It doesn't allow me to select the implementation delayed at the call site. We just had the discussion earlier today.

keean commented 8 years ago

No it does not. It doesn't allow me to select the implementation delayed at the call site. We just had the discussion earlier today.

But we also very slowly went through that if f does not create new implementations then setting the implementation at the call site of f is the same as setting is at the call site of lessThan

keean commented 8 years ago

So then when we have:

f(x) =>
    x.call_widget()

We can clearly see no code in f is inserting any implementations into anything. Sorry to have to take it this slowly, but I am feeling particularly stupid today. Are we agreed?

shelby3 commented 8 years ago

But we also very slowly went through that if f does not create new implementations then setting the implementation at the call site of f is the same as setting is at the call site of lessThan

That is irrelevant to what we are talking about now. You seem to get concepts all comingled in your mind.

Don't conflate the discussion in this thread with the discussion I linked to in other thread.

No it does not. It doesn't allow me to select the implementation delayed at the call site. We just had the discussion earlier today.

keean commented 8 years ago

Well I think its an important point. How do you respond? To just dismiss it as irrelevant is missing the point.

keean commented 8 years ago

And besides your code also determined the overload at the callsite of f:

f(0, 1: ForwardOrder)
f(0, 1: ReverseOrder)
shelby3 commented 8 years ago

@keean wrote:

And besides your code also determined the overload at the callsite of f

Yes that is the point. At the call site, not at the instantiation of the data types, which could have been far up the function hierarchy. I have 0, 1 as literals, but I am confident you can imagine them as references to instances.

keean commented 8 years ago

Well thats exactly what my code does? The overload is determined by the dictionary passed into f

shelby3 commented 8 years ago

@keean wrote:

How do you respond? To just dismiss it as irrelevant is missing the point.

How to respond to nonsense?

keean commented 8 years ago

How to respond to nonsense?

Well you can respond to the code. How does it not do what you want. Produce a counter example where it goes wrong?

shelby3 commented 8 years ago

@keean wrote:

Well thats exactly what my code does? The overload is determined by the dictionary passed into f

No because you can't add new methods after the data type instance was created. That is the entire point of the Expression Problem.

See the link below where I had told you that earlier:

Don't conflate the discussion in this thread with the discussion I linked to in other thread.

No it does not. It doesn't allow me to select the implementation delayed at the call site. We just had the discussion earlier today.

shelby3 commented 8 years ago

You take some time to think it over. You'll get it if you slow down and think. You must be overworked and too much multitasking.

keean commented 8 years ago

No because you can't add new methods after the data type instance was created. That is the entire point of the Expression Problem.

Of course you can the datatype is just the method dictionary. You are passing the type (this Int in this case) separately.

keean commented 8 years ago

The thing is we know all the methods that f is going to call, so what does adding new methods mean? Where do you want to add new methods?

You cant add new methods to type-classes either.

keean commented 8 years ago

You must be overworked and too much multitasking.

I must be because I just realised the reason I kept changing the compiler code but the output didn't change is because I was running the wrong command :-)

Anyway the important point to note is the datatype is modelling the type-class, not the integer. So you can pass as many dictionaries as you need. The lessThan datatype was implemented on the Int type parameter and you can have as many datatypes as you want implemented for Ints.

keean commented 8 years ago
data LessThan<A> = LessThan {
     lessThan(A, A) : Bool
}

let forwardOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x < y
}

let reverseOrder : LessThan<Int> = LessThan {
    lessThan(x, y) = x > y
}

data Show<A> = Show {
     show(A) : ()

let print = Show {
      show(x : Int) => print_int(x)
}

let write = Show {
      show(x : Int) => write_to_file(x)
}

f(0, 1, forwardOrder, print)
f(0, 1, reverseOrder, write)

f(x, y, order, where) =>
    where.show(if order.lessThan(x, y) then x else y)

So now we have extended Int with both LessThan and Show

skaller commented 8 years ago

I am getting a bit lost here so lets back up. The way to do this is not think about concepts, but think "what is the implementation and what does it achieve"?

If you associate a dictionary to an object x of type Int which contains field show and a method to convert int to string, you have some kind of object orientation. Original Haskell was like that.

However this is weak, because it cannot handle relationships. GHC and Felix do not do this, they allow a typeclass to be associated with multiple types. So for example you can have a class Addable with two (input) types, and instances for pairs of types.

This implies you have to pass the dictionary to the function containing the addition operation, it has nothing to do with the types of the objects being added. The function contains code which adds two values by dispatching to an empty slot which will crash, so it has to fill the slot in with the instance corresponding to the slot. The name "type class" is a misnomer, because originally in Haskell it was a class for a single type. Category is a better name.

Felix does this at compile time, but the machinery is the same. Fortress does it at run time, Fortress is a nominally typed (traditional) Object Oriented system with multi-methods which have multiple argument type classes, with the binding done at run time (amazing, specialisation/instance detection at run time).

In Felix there are two steps. First, when you see add (x,y) inside a function, you do lookup to try to find add for the types of x and y. (Felix has overloading remember). The result of that lookup is a type class virtual (unimplemented) method. That is the slot I mentioned above. Associated with the slot is a specific type class, Addable. So the data are required

(a) the name of the type class (b) the name of the method within that type class (c) a list of types corresponding to the type parameters of the type class

In Felix, this list of types is precisely those generated by unification of the types of the function arguments and the types of the type class virtual function arguments. These specify a lower bound on the types of an instance we need to find later. By lower bound, I mean we require an instance whose types are at least as general as specified.

The next thing Felix does is monomorphisation. This means that the list of types in the function specifying the type class are now concrete (not polymorphic).

Next, Felix fills in the slot by searching through all instances of the relevant type class. It first finds all the instances of that type class for which the instance bindings of the type class parameters are more general than the monomorphised types: the instances are still polymorphic. Then given the list of candidate instances, we try to find the unique most specialised one. There is an error if not existing or not unique. This is NOT a type checking error, it an instantiation error.

Now, if we find the unique instance, we have to specialise it to the concrete types we have. Then we find the corresponding method of the now monormorphised instance which is named add and put that in the slot. So finally we have a concrete add method.

Now you must note that when we monomorphised the instance, we got recursion. That is, to monomorphise the instance we have to monomorphise the signature of the methods, but we also have to monomorphise the method definitions! So the whole monomorphisation process recurses.

This recursion is crucial. It is what allows Felix to do second order polymorphism, Haskell style, and implement monads, for example. Felix top level functions use type-schema, that is, it is only head (Horne clause in Prolog) polymorphism, aka first order polymorphism. Felix does not support passing polymorphic functions around. HOWEVER type class instantiation DOES. The instantiation mechanism is recursive, which means you have polymorphic recursion. I can show you an example of this in the Felix library: polymorphic recursion is used to print tuples by calling show on the head of the tuple, and recursing to show the tail. In Felix, polymorphic recursion only works using the indirection type classes provide: you can't have polymorphic closures directly but you can have polymorphic type class methods.

If you think the above algorithm is complex .. well I used a high level language (Ocaml) to implement it, and Felix, being a whole program analyser, can fully instantiate all type classes at compile time (completely eliminating them prior to optimisation). So my "dictionaries" are in the compiler only.

In other words in Felix, the type class instantiation is done "after type checking" but "before code generation". It is not "early binding" but it is not "late binding either". C++ works the same way. There this machinery is called "two phase lookup". In C++ it really is lookup. The second phase of lookup looks in the scope of the call site to resolve generics, and type checking has to happen there because in C++ templates aren't type checked during phase 1 BECAUSE they have no type classes (called "concepts" in C++).

So when you have type classes you always have two stages of type checking. The first is your usual type checking, prior to monomorphisation. The second, during monomorphisation, is called instantiation, and just means checking for existence of instances. If you delay this too long you can get errors are run time. This is what dynamic languages do.

I realise my description of the machinery is not enough to actually implement anything. Its just a vague waffle describing how type classes (with multiple type arguments) actually work. AFAIK Felix algorithm is superior to GHC because GHC does not allow overlapping instances. Felix does. However to be fair Felix doesn't provide a termination guarantee, and GHC does. Wish I knew how to do that.

skaller commented 8 years ago

So now: modules work differently to type classes. With a module you specifically pass a particular module (dictionary) to your function. You can do this without all the complications above, because you have a named module, so you just name it and pass it in. The named module's type has to agree with the function's module type signature, this has to be type checked. Given that signature, you can just makes slots to hold the functions of the module to be passed in, and when you call add in your function it is just an indirect call via the slot. If you know the module when compiling you can optimise the slot away and call the method directly. Ocaml added this optimisation only recently. It also used to only allow the passing in at link time, but now you can do it with some restrictions at run time (first class modules).

The idea of type classes is that your generic notation systems are consistent globally. It is a way of extending the language in the library. For example you can define add in the library and it means addition universally .. even though it is not defined for all types yet. It means the same add for a new type that you design, by forcing you to provide an instance for your type (and any other types you want to add with it).

The point here is that the binding to the type class is done by the phase 1 type checker, without knowing about your new type. This means you can use add in a generic Euclid gcd algorithm and the algorithm is full type checked and will work even with your new type that is designed after the algorithm, provided you also provide suitable instances.

The reason for this machinery is that Euclid's gcd algorithm is not parametrically polymorphic, because parametric polymorphism is extremely weak. It only supports a very small set of universal operations, such as product (tuple) formation. Type classes allow you to extend the set of operations to include, say, addition, in limited contexts (namely, where you pass the class to a function requiring that extension). It uses a nasty trick: the type checking first assumes parametric polymorphism, that is, it assumes add works for all types. This is what the type class signatures promise.;

The promise is a lie. add does not work for all types. But you can cheat, and make it seem like it does by providing an instance for the particular cases you are actually using.

Modules/functors on the other hand do not cheat, you have to explicitly pass the required "extension" so the system remains fully parametric. So the difference is clear: with type classes the extension is passed implicitly which is why you can only sort integers in one direction. With modules you pass the instance in explicitly so you can have two orderings of integers because you have to actually say which one to use.

Shelby3's chosing by selecting instances by providing candidates in a particular context, and different instances in other contexts mixes both models, so the selection is partly implicit, and partly explicit. This is a bad idea because it makes it hard to reason about what is going on: when you look at a call site, you also have to look at the top of the file to see which instances are made available. The ability to reason about code is dependent on local knowledge. This is what referential transparency means and why purity and absence of side-effects is considered important in functional programming: you can look at a single line of code and know what a particular expression calculates without looking anywhere else. If the expression says sin(x) and you don't know what sin does you have a name you can then look for in the library. So the reasoning is not entirely local, but the non-local part of the reasoning is driven by something tangible you can refer to.

Similarly global variables in C are bad, because when you see a variable you have a name but you don't know which piece of code last assigned a value to it, you have to not only troll through the WHOLE program looking for assignments, you also have to figure out what the previous control flow might have been and calculate the possible values that the variable might contain. Its impossible! Not only can't a human do it, you can't really do it with a mechanical tool either.

So its is easier to reason with modules. They're better than type classes. Except for one problem: reasoning is easier because you explicitly name the module to be use but harder because your code is littered with detailed choices. If you have an "abstract concept" you want to make a whole set of choices at once, and call that a "context". And that is what type classes provide. You only get ONE choice for each concept, so in effect you are not programming, you are designing language extensions.

skaller commented 8 years ago

Ok so here is a partial implementation. In a function definition f you have passed in a type class Addable. This contains a method add. You now put add in your lookup context. Now you are binding the function code. You see add. You look for it, and find it in the context with a link to the add method of the type class dictionary for Addable. So you output the code Addable_parameter.add. Now, Addable_parameter is a parameter of the function. So at run time, you have called the add method of the instance argument of the typeclass Addable which has been passed in.

Since this is Javascript, the instance can contain more methods than just add, it can contain more than the type class Addable allows. You never call them so it doesn't matter inside the function.

So implementing type classes inside the function is easy, the problem now is how to pass a suitable (type checked) instance to the function. In other words when you call f (x,y) how do you know what to pass to f? Well you lookup f, and you see it requires a type class Addable with instance types, say int and float.

And there you start to have fun! How do you find Addable<int,float> instance? It's even harder if you require a non-monomorphic specialisation. It can be done as described earlier, but it is easier in Felix because there is no separate compilation to worry about, and so you never need any dictionaries at run time (the dictionaries only exist at compile time).

shelby3 commented 8 years ago

@shelby3 wrote:

f(0, 1: ForwardOrder)
f(0, 1: ReverseOrder)

P.S. I think your idea of enforcing one implementation per data type is rubbish, because then I can't do the above.

We will need the above casts on typeclass bounds any way. I prefer the inference engine to force me to cast the argument always (when there is a function overload on that argument, to avoid inference weirdness) which tells me which argument is causing the boilerplate, than the alternative of writing boilerplate function names which doesn't tell me which argument the choice is being made on.

shelby3 commented 8 years ago

@keean wrote:

Anyway the important point to note is the datatype is modelling the type-class, not the integer. So you can pass as many dictionaries as you need. The lessThan datatype was implemented on the Int type parameter and you can have as many datatypes as you want implemented for Ints.

Excuse me, yes that is correct. So in answer to your original question:

Just forget type-classes exist and tell me what is wrong with this:

f(0, 1, forwardOrder)
f(0, 1, reverseOrder)

What is wrong is you are forcing the caller to always specify the dictionary instead of optionally allowing the compiler to infer which dictionary to select. If the compiler instead understands the last argument to be optional and will infer it, then you essentially have typeclasses, except for the case where we want to select typeclasses in the where clause with type variables which are not the types of arguments.

Any way, none of that changes my claim that the general form of typeclasses should let me optionally select the implementation.

skaller commented 8 years ago

So @shelby3, you are arguing against type classes and for modules. I personally have no problem with that, they're easier to implement and more powerful: because modules can contain functions, functors with module parameters have both types and functions as parameters.

The only downside is that to pass them you have to name them, and you have to pass them explicitly every time. First class modules are powerful. With a JS implementation you may even be able to construct first class functors.

The main issue would be notation. Once you use modules, you have to refer to their functions with the module name prefixed, which makes infix operators, etc, difficult.

Take your pick ... or, if you want a language even more complex than Felix, do both :)

If you want implicit selection, you obviously get only one choice (type classes). If you want more than one choice you have to explicitly choose which one (modules).

shelby3 commented 8 years ago

@skaller wrote:

The only downside is that to pass them you have to name them, and you have to pass them explicitly every time

No I wrote "optional", which means when they aren't explicit then the dictionary will be selected by the compiler.

If you want implicit selection, you obviously get only one choice (type classes). If you want more than one choice you have to explicitly choose which one (modules).

There is another choice, which is explicit when you want to be and implicit when you don't want to be explicit. The import scoping and extends scoping helps you (see my example for using extends).