keean / zenscript

A trait based language that compiles to JavaScript
MIT License

OCaml's functors vs. typeclasses #22

Open shelby3 opened 7 years ago

shelby3 commented 7 years ago

@skaller wrote:

[There is an unrelated question to the syntax: do you REALLY want type classes? Ocaml style modules are more powerful but harder to use. Of course you can have both. Type classes are probably OK for scripting languages, that's why I chose them for Felix, but they have limitations.]

I wrote about this 6 days ago, and I think you are referring to the fact that OCaml recently added implicit selection to its unique first-class functors, so combined with modules this can apparently emulate typeclasses and more.

Could you explain about the limitations?

skaller commented 7 years ago

Actually no, I'm referring to something simpler: Consider a TotalOrder type class. So Integer is totally ordered right?

Which way? Up or down?

You see? In Ocaml it is no problem because you have

module IntegerUpOrder = ...
module IntegerDownOrder = ...

but in Felix or Haskell you're screwed. You only get ONE bite at the cherry. Type classes are strictly weaker. They fix the semantics of some notation permanently and irrevocably. Another example from Felix which is very annoying: I had to define AdditiveGroup and MultiplicativeGroup. TWO type classes. One uses + and one uses * operator. But they're both groups with the same structure.

In Ocaml, no problem. One functor. Two instances. Because the instances are nominally typed.
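For concreteness, a minimal OCaml sketch of that point (the TOTAL_ORDER signature and the module bodies are illustrative, not skaller's elided originals):

module type TOTAL_ORDER = sig
  type t
  val less : t -> t -> bool
end

module IntegerUpOrder : TOTAL_ORDER with type t = int = struct
  type t = int
  let less x y = x < y
end

module IntegerDownOrder : TOTAL_ORDER with type t = int = struct
  type t = int
  let less x y = x > y
end

(* both orderings coexist; a caller selects one by its module name *)
let ascending  = IntegerUpOrder.less 1 2    (* true  *)
let descending = IntegerDownOrder.less 1 2  (* false *)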

Of course the nominal typing is a serious PITA which makes Ocaml functors really hard to use compared to say C++ templates or Felix type classes.

Roughly, a type class is a nominal type which provides structural typing for instances.

BTW: both mechanisms are pretty bad in the sense that neither supports composition very well. You just try to define a Ring out of two Groups in Ocaml!

How does maths do it? With context sensitive language. That is, DSSLs. You can't read any maths papers without first understanding the symbolic language used in that domain, and, often, in that particular paper!

shelby3 commented 7 years ago

@skaller wrote:

Haskell you're screwed. You only get ONE bite at the cherry. Type classes are strictly weaker. They fix the semantics of some notation permanently and irrevocably.

I had written about this before (citing Robert Harper from CMU) that in Haskell a typeclass can only be implemented in one way, but afaik this is because of Haskell's global type inference. Yet with local selection of implementations, it is possible to have multiple implementation variants of a typeclass for a particular data type. Are you referring to something else? I don't see how this limitation is inherent to typeclasses, nor how OCaml has improved on it.

I had to define AdditiveGroup and MultiplicativeGroup. TWO type classes. One uses + and one uses * operator. But they're both groups with the same structure.

Why not define a Group typeclass that takes a type parameter for its group operation?

Of course the nominal typing is a serious PITA which makes Ocaml functors really hard to use compared to say C++ templates or Felix type classes.

Roughly, a type class is a nominal type which provides structural typing for instances.

BTW: both mechanisms are pretty bad in the sense that neither supports composition very well. You just try to define a Ring out of two Groups in Ocaml!

I'm too lazy to try to map out what you mean. Could you elaborate a bit or show example?

skaller commented 7 years ago

FYI, AFAIK Haskell doesn't have type inference at all (it has kind inference though). In any case even if the instance bindings are local, you still have a problem if you want two distinct bindings for the same types in the same context.

The reason I can't define a Group typeclass that takes a type parameter for the group operation is that types are not functions. The group operation is a function. Felix typeclasses only allow type parameters, not function or value parameters.

Of course I could add function parameters. But guess what? That's just an Ocaml functor which takes a module argument, where modules include both types AND functions.

As to composition, it's well understood that neither type classes nor modules are easy to combine. I once asked on the Ocaml list how to make a Ring out of an AddGroup and a MulGroup and it took the foremost experts a couple of days to get it right. The reason is that the types of each are distinct as written, and making them the same type requires tricky syntax, which turns out to be ad-hoc and limited in what it can do.
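To make the difficulty concrete, here is a rough OCaml sketch of that kind of composition: a RING signature built from an additive group and a multiplicative monoid over a shared carrier (a monoid rather than a second group, since ring multiplication has no inverses). All signature and module names are illustrative, and the `with type t = t` sharing constraints are exactly the "tricky syntax" referred to above.

module type GROUP = sig
  type t
  val op : t -> t -> t
  val unit : t
  val inv : t -> t
end

module type MONOID = sig
  type t
  val op : t -> t -> t
  val unit : t
end

(* without the sharing constraints, Add.t and Mul.t would be distinct abstract types *)
module type RING = sig
  type t
  module Add : GROUP  with type t = t
  module Mul : MONOID with type t = t
end

module IntRing : RING with type t = int = struct
  type t = int
  module Add = struct type t = int let op = ( + ) let unit = 0 let inv x = -x end
  module Mul = struct type t = int let op = ( * ) let unit = 1 end
end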

In Felix (and I guess Haskell) you can derive a type class from one or more others, but such derivation is not combinatorial. What we want is combinators to systematically add features together and this is really only possible for type classes that have a single type, and functions that accept such a type but return an invariant type. In other words, it's just OO at the compiler level, with the same limitations (covariance problem).

In principle, we "know" how to combine things because category theory tells us. In practice the modelling is so complex no one can understand it. That's basically why people use dynamics, because the correct static modelling is just too hard.

So actually, the way to do this is part of the design goals of Zen: the compiler's type system has to be a full scale purely functional programming language. That doesn't tell you how to implement the required combinators but it does at least ensure that when you figure it out, you have the tools to do it.

shelby3 commented 7 years ago

@skaller wrote:

Haskell doesn't have type inference at all (it has kind inference though)

Ah, I guess we just used different terminology, where you mean ground types aren't inferred but type constructors are. I would need to think that out a bit...

even if the instance bindings are local, you still have a problem if you want two distinct bindings for the same types in the same context.

Ah so overloading, but how do you select the desired instance within the context? Do you annotate somehow or is the proper overload inferred from context somehow?

Excuse my myopia.

That's just an Ocaml functor which takes a module argument, where modules include both types AND functions.

Seems we are back full circle to the point of what I wrote 6 days ago (as linked from my OP).

In Felix (and I guess Haskell) you can derive a type class from one or more others, but such derivation is not combinatorial. What we want is combinators to systematically add features together and this is really only possible for type classes that have a single type, and functions that accept such a type but return an invariant type. In other words, it's just OO at the compiler level, with the same limitations (covariance problem).

A code example might help me understand better.

shelby3 commented 7 years ago

@shelby3 wrote:

@skaller wrote:

even if the instance bindings are local, you still have a problem if you want two distinct bindings for the same types in the same context.

Ah so overloading, but how do you select the desired instance within the context? Do you annotate somehow or is the proper overload inferred from context somehow?

I think what you mean is that you don't want to subsume the implementations of the typeclass to the typeclass, but rather track the intersection type of the two available implementations and then monomorphize that statically at the call site based on which implementation the call site requires?

So I think a typeclass that takes the group operation as a type parameter suffices, as the function definition can require a specific group operation instance for the typeclass.

The key insight appears to be my solution to the Expression Problem, so you don't lose track of the type and this delayed binding to an instance is possible.

skaller commented 7 years ago

"Ah so overloading, but how do you select the desired instance within the context? Do you annotate somehow or is the proper overload inferred from context somehow?"

You're hitting the point backwards. The point is you CANNOT select the proper instance. Just think of i < j, where you have Up integer order and Down integer order. How can you possibly choose?

In Ocaml you choose because functor applications have nominal types: you have to name them:

UpInteger.less (i,j) or DownInteger.less (i,j).

Since these are modules you can also do:

open UpInteger ... less (i,j) ..

to avoid explicit qualification everywhere. In Haskell and Felix you CANNOT do that because the instance doesn't have a name. It is selected entirely by the type.

A typical work around is like this (in Felix code):

union DownInt = Down of int;
instance TotalOrder[DownInt] { ... }

and now

.. Down i < Down j ..

to get the unusual reversed total ordering. The problem is, you cannot add two DownInt. You have to define addition (in Felix code again):

fun + (x: DownInt, y: DownInt) => match x,y with | Down i, Down j => Down (i+j) endmatch;

So really, DownInt's are not integers until you equip them with all the usual operations. Usually you don't; you just use the Down constructor where you need it to force a distinct typeclass instance, but you cannot do that if you are trying to use an existing algorithm that requires Down ordering plus 20 other functions defined on ordinary integers but not downwards ones.

So the workaround is only a local workaround. The only real solution is to define another function such as "reverse_less" in the total order type class, and then you have to parametrise ALL your algorithms that require a total order to say which order to use.

keean commented 7 years ago

So the workaround is only a local workaround. The only real solution is to define another function such as "reverse_less" in the total order type class, and then you have to parametrise ALL your algorithms that require a total order to say which order to use.

The answer is type classes are used to model algebras.

Order is an operation, so you pass a relation as an argument. The 'relation' would be a type class.

See Stepanov's "Elements of Programming"; it's all explained in there.

skaller commented 7 years ago

Sorry I don't have it (Stepanov) and it isn't online so I can't read it (and I'm not rich enough to buy a book with examples in C++). But you have missed the point. With Haskell (or Felix) type classes you can have an algorithm and pass the ordering in as a type class, but you can only have ONE ordering for a given type such as integer. What I explained was that there is a local workaround where you use instead a variant "Down of int" which is of a distinct type so you can construct a different order for it and pass that in. Unfortunately, you can't add those integers. So you have to make a complete copy of the whole model of an integer for Down integers.

Which is entirely nontrivial and in fact probably completely impossible if you have foreign libraries which use integers directly instead of using a type class. The simple fact is most uses of integers are using integers, not a type class based model of integers with instances other than integer.

Unfortunately many types have more or less unique properties and modelling everything with type classes is going too far. In fact one could argue that types are so specialised that to use classes, be they OO classes or type classes, is a waste of time since to get a refined enough view, you would only have one function per class. At that level, composition is isomorphic to a soup of functions: you gain no benefit because you have not abstracted anything.

shelby3 commented 7 years ago

@skaller I have difficulty understanding your explanation. I am unable to be sure if I can guess what you are describing in words, and a complete example might be helpful.

My best guess is you want to compose over functions that employ the less operation, and you want the caller to select which ordering is employed for the entire tree of call hierarchy under a call site.

One solution appears to be what @keean suggests: having all such functions take the operation function as an input, separately from the data type which is being input. But I am thinking I don't agree, @keean, that this can't and shouldn't be done with typeclasses. Edit: I see @skaller has made a similar point:

At that level, composition is isomorphic to a soup of functions: you gain no benefit because you have not abstracted anything.

Another way appears to be to select into scope only the implementation instance of the TotalOrder typeclass that the caller wants to employ at the call site, so the compiler will monomorphise the call accordingly. Again as I understand it, Haskell won't let you scope which implementations are available to the compiler since it is a global inference process, but I might be mistaken about that.

So I guess I am failing to see why the intended goal can't be handled with typeclasses?

skaller commented 7 years ago

OK, sorry. Here is a simple example, from the Felix library:

fun sort[T with Tord[T]](x:list[T])=> ....

That sorts a list. Now consider something that builds, say, a N-M search tree or something really complicated and messy. A Zen compiler maybe .. :)

Now what can you do if the order is the reverse of what you need? You cannot provide a second type class using the other order for the same type. You're only allowed one, so you can only sort in ascending order.

The problem is you wrote the code making assumptions, and didn't explicitly pass in the ordering function. You use a type class instead and now you're very sorry!

Any single example I give will admit a modification, but the point here is that if you wrote the nasty complicated algorithm using an Ocaml module, you could reverse the order without changing the nasty complicated algorithm. You can't fix it with type classes, you have to either BYPASS type classes to fix it, or, you have to construct a new type, an integer with the usual ordering reversed, and use that.
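A minimal OCaml sketch of what "written as an Ocaml module" buys you here: the algorithm is a functor over the ordering, so the caller reverses the order by applying the same functor to a different module, without touching the algorithm. The names are illustrative and List.sort stands in for the "nasty complicated algorithm".

module type TOTAL_ORDER = sig
  type t
  val less : t -> t -> bool
end

module MakeSort (O : TOTAL_ORDER) = struct
  let sort xs =
    List.sort (fun a b -> if O.less a b then -1 else if O.less b a then 1 else 0) xs
end

module UpSort   = MakeSort (struct type t = int let less = ( < ) end)
module DownSort = MakeSort (struct type t = int let less = ( > ) end)

let ascending  = UpSort.sort   [3; 1; 2]   (* [1; 2; 3] *)
let descending = DownSort.sort [3; 1; 2]   (* [3; 2; 1] *)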

The latter solution is not workable in more general cases, because you often have simple data types modelling complicated things.

The problem is that tying the algebra to the types is just wrong. Algebras have to be modelled entirely with functional relations using categories (modules in Ocaml) and various constructions including functors (functors in Ocaml). Type classes just don't work. Basing structure (algebra) on types is wrong.

Technically, two types have ONLY one property of any significance: they're either the same type or not. That is the end of it. Types have no other utility than providing identity.

Just for example consider a product like int * string, with value (42, "Hitchhiker"). You may think this type has structure. WRONG. A product is NOT a type. It's a pair of projections!

What I mean is that a product is characterised by the projections. The actual type has no utility other than the fact it is different to other types. That difference determines which projections can act on it. It's the projections that have the universal properties that define what a product is.
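As a small concretisation of that universal-property claim (a sketch only, in OCaml; `pair` is a made-up name), the pair type is used entirely through its projections and the pairing combinator that mediates them:

(* fst (pair f g x) = f x  and  snd (pair f g x) = g x *)
let pair f g x = (f x, g x)

let example : int * string = pair (fun n -> n) string_of_int 42
(* fst example = 42, snd example = "42" *)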

So again: type classes are wrong. Because they base a fixed set of related functions on some specific types, as if the types have some significance.

But this doesn't mean not to put them in Zen. I have them in Felix. Haskell has them. Just know that the model is deficient. It's a good compromise for a scripting language.

shelby3 commented 7 years ago

@skaller wrote:

Now what can you do if the order is the reverse of what you need?

Here is where you lose me. Who is 'you' in the code? The caller of sort?

shelby3 commented 7 years ago

@skaller wrote:

So again: type classes are wrong. Because they base a fixed set of related functions on some specific types, as if the types have some significance.

Afaics, you are not factoring in that the selection of the implementation of a typeclass is not solely determined on the data type being implemented for. As I pointed out before, it is also possible to control the selection at the call site (although apparently not for Haskell).

How else could you possibly do it other than to select a set of operations to pass into the call? What does OCaml do which is better than that?

keean commented 7 years ago

@skaller wrote:

The problem is that tying the algebra to the types is just wrong.

Having written several large programs using both module and typeclass systems, I feel able to comment:

Modules produce a proliferation of boilerplate. In effect, if you parameterise a function on modules you have to provide all the modules explicitly. Modules have type parameters, and you rapidly run into the situation where you need to provide the same type parameter to several modules. When you do this you introduce redundancy into the signature, which invites mistakes at worst, or is needless repetition at best.
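For illustration, a small OCaml sketch of the redundancy being described: each module parameter must repeat a sharing constraint to keep the value types aligned. The READABLE/WRITABLE signatures and MakeCopy are invented for the example.

module type READABLE = sig
  type t
  type value
  val read : t -> value option
end

module type WRITABLE = sig
  type t
  type value
  val write : t -> value -> unit
end

(* the `with type value = R.value` constraint must be restated for every
   extra module parameter that shares the same element type *)
module MakeCopy (R : READABLE) (W : WRITABLE with type value = R.value) = struct
  let copy src dst =
    match R.read src with
    | Some v -> W.write dst v
    | None -> ()
end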

The advantage of type classes is precisely that they are implicit. When I call '+' on a type, the type provides all the information to know what implementation of plus to choose. We can correctly have a type class for '>' or '<'.

So I am not sure what your problem with type classes is? Sorting is an algorithm, not an algebraic operator.

You have this very problem in English, if I say "sort this class of people by height" what order will I get them back in?

We can only say something is in an ordering, or partial ordering, with respect to a particular relation operator.

keean commented 7 years ago

Afaics, you are not factoring in that the selection of the implementation of a typeclass is not solely determined on the data type being implemented for. As I pointed out before, it is also possible to control the selection at the call site (although apparently not for Haskell).

How else could you possibly do it other than to select a set of operations to pass into the call? What does OCaml do which is better than that?

Actually he is right. Type classes select which implementation to run purely on the type of the arguments (not their values), hence the name type-classes.

Different implementations can get despatched from the call site only because the types of the arguments are different.

skaller commented 7 years ago

I agree with your analysis: type classes are weaker but easier to use. The problem is simple, for a class X there is only one instance for T = A. You cannot pass a function parameter to a type class, the index is exclusively a (list of) type(s). Ocaml functors allow you to also pass functions as parameters.

The theory is simple in principle: a functor is a mapping which takes a subcategory of TYPE as an argument and produces a new sub-category. Anything else is weaker. Type classes are therefore weaker because they only take types to construct a new subcategory. The arrows are missing as parameters.

It's a tradeoff between ease of use and power. Functors are correct, because category theory is correct. It is the only known working model of abstraction and structure. Exactly how to represent them in a programming language is another issue though :)

TYPE is the name I give to the category of types and functions, Haskell calls it asterisk. This is the category we write code for. Denotational semantics is nothing but a functor from TYPE to SET, the category of sets and functions. In Felix you can actually define functors:

typedef fun swap (x:TYPE, y:TYPE): TYPE => y,x;

which is a bifunctor from TYPE * TYPE to TYPE. The effect on the functions is left out, but can be captured with the Functor type class which has method "fmap" in Haskell (it should be called map but they messed up by already defining it :)

shelby3 commented 7 years ago

@keean wrote:

Different implementations can get despatched from the call site only because the types of the arguments are different.

Disagree. I can control which implementation gets dispatched for each type by choosing which implementations I import into the scope of the call site. Which is what I already wrote two times in this thread.

Actually he is right. Type classes select which implemenration to run purely on the type of the arguments (not their values) hence the name type-classes.

Values? Are we asking for dependent typing?

Why does he want to dispatch based on the values?

Did you intend to write polymorphic type?

shelby3 commented 7 years ago

@skaller wrote:

The problem is simple, for a class X there is only one instance for T = A.

I request you please take the time to write more coherently. I can't follow your descriptions. What the hell is T = A above?

Please take the time to explain so that readers can understand what you are talking about. From my perspective you are largely writing incomprehensible gibberish. Perhaps @keean can read between the lines, but I can't. Please be more explicit and develop your descriptions with complete code examples.

skaller commented 7 years ago

Shelby, how do you "control which implementations get imported into the call site"?

You cannot do that in Haskell. You cannot do that in Felix. And you cannot do it (legally) in C++ either.

skaller commented 7 years ago

Re T=A, I'm sorry, I didn't proof read, and the idiot markup crud removed stuff. It should have read

class X<T> .. with T = A

meaning the instance of X with T set to concrete type A. The less and greater signs got gobbled.

shelby3 commented 7 years ago

@skaller wrote:

Shelby, how do you "control which implementations get imported into the call site"?

You cannot do that in Haskell. You cannot do that in Felix. And you cannot do it (legally) in C++ either.

Only what you import is visible to the compiler in the file or module you are compiling. C++ has an include.

keean commented 7 years ago

Disagree. I can control which implementation gets dispatched for each type by choosing which implementations I import into the scope of the call site. Which is what I already wrote two times in this thread.

This is true if overlapping instances are allowed. Normally instance sets are required to be coherent, so only one instance globally can match one type, this means changing what you import cannot change the meaning of the program.

We probably want to warn if instances are incoherent.

shelby3 commented 7 years ago

@keean we can't enforce globally and have true Git-style modularity. I explained this to you before and apparently you forgot already, because this concept is apparently foreign to the way you conceptualize. You keep thinking that you can have a runtime loader that enforces some global consistency, but that is the antithesis of decentralized development. You keep forgetting the Second Law of Thermodynamics. Just abandon global consistency and global soundness, as it does not exist. For if such total orders could exist, then we could not exist because the speed-of-light would not be quantifiable.

Why the heck would you want to make all instances of implementations of a data type the same? A per call-site decision isn't a problem afaics. For our TypeScript transpile hack we need to assume they are, but after we write our own JavaScript compiler, I don't see why they need to be coherent.

EDIT Jan. 2018: A linker can ensure a total ordering by implementing typeclasses only one way for each data type. My thought above was for hot pluggable code over the unbounded Internet, which would require some sort of cryptographic hash id in order to enforce said total ordering in a decentralized context.

keean commented 7 years ago

@shelby3 instances need to be coherent to avoid non-local definitions changing local behaviour.

If instances do not have global coherence then importing a module or runtime loading a DLL could change the behaviour of the program in unexpected ways.

You need global coherence of instances so that you can reason locally about code, without remote changes affecting it.

keean commented 7 years ago

@skaller The point of type-classes is to enable generic programming. When we write a function we want to be able to set various requirements on the types passed in. For example:

copy_bounded(fi : I, li : I, fo : O, lo : O)
    requires Readable<I>, Iterator<I>, Writable<O>, Iterator<O>,
    I.ValueType == O.ValueType

So here we can see the copy should be passed two Readable Iterators and two Writable Iterators, where the ValueType of both iterators is the same.

The things we are passing to the function are Iterator objects, and we want to constrain the types to have certain properties.

So you can understand type-classes as constraints on types.

I don't really see the relevance of your criticisms of type classes.

skaller commented 7 years ago

Ok, so in Felix,

(1) Each file is separately parsed, then the parse trees are concatenated based on include directives. Then we have (2) desugaring of terms to more basic ones. Then we have (3) binding/lookup/type checking; during binding, all type classes and instances are collected. Then we have (4) monomorphisation (this is new), (5) optimisation, and (6) code generation (C++).

During monomorphisation polymorphic bindings to type class functions are mapped to instance functions. An error is possible if the instance function does not exist. Instances can be defined anywhere at all. They do not have to be visible from anywhere at all. The typeclass must be visible to the instance.

Felix allows overlapping instance definitions, and they can be polymorphic. If there are several instances definitions providing an instance definition of a type class function, then the one defined in the most specialised typeclass is selected if it exists, otherwise it is an error. Note that in Felix an instance does not have to provide all of the type class functions, this is not an error unless you try to use one you didn't define.

Thus in Felix, if you include a file F containing an instance for a type class, Felix uses that; if instead you include G containing a different instance, Felix uses that; and if you include both it chooses the most specialised. However file inclusion in Felix is transitive, has no impact on parsing, and is independent of the location of the include directive: all ASTs of included files are concatenated.

If you read that carefully you will see you cannot include a file in any scope other than the global scope, and inclusion can impact both binding and choice of type class instance, but I consider this a design fault in the system, precisely because it allows what is called hijacking, of both ordinary overloading, and also of the kind of overloading type classes provide.

In Felix, type classes do not cause any run time overhead.

In Haskell, a function accepts a dictionary containing an instance at run time. This is necessary because Haskell has separate compilation. Haskell, however, does not allow overlapping instance definitions. Thus, type classes introduce a performance overhead in Haskell.

In Fortress, the choice of the most specialised method is done the same way as in Felix I think, however the choice is made at run time (late binding). Its a very sophisticated OO system with multi-methods based on run time type class stuff.

In Ocaml, module instances are always visible at the point of use but abstraction introduces indirection and a performance overhead (unless the definition happens to be local or the trans-compilation unit inliner fixes it up). However Ocaml classes work differently, they perform run time (delayed) binding of methods. It uses lookup, but hash coding (like polymorphic variants) makes the lookup reasonably fast.

shelby3 commented 7 years ago

@keean wrote:

instances need to be coherent to avoid non-local definitions changing local behaviour.

Can't happen. The locality is in control of what it imports.

If instances do not have global coherence then importing a module or runtime loading a DLL could change the behaviour of the program in unexpected ways.

The locality can import the implementations it desires. If there is a conflict with other imports, the compiler should give an error.

You need global coherence of instances so that you can reason locally about code, without remote changes affecting it.

I don't see why you think so.

keean commented 7 years ago

Can't happen. The locality is in control of what it imports.

If you have a type class with a generic implementation (ie implements a function for a polymorphic type) and you use that in a module, and you import another module that does something useful (maybe written by someone else) that implements a more specific instance of the type-class for some type you are using, then the code will swap to using the new instance instead of the generic one. If the new instance is written in a different way it can change the behaviour.

shelby3 commented 7 years ago

@skaller I am thinking what you want is to be able to pass in a data type and an interface to that data type orthogonally, so that the binding of other implementations for other typeclasses is delayed and only the ones you have specified are selected early.

f(x: A) where A: Int+TotalOrder =>

@keean I think we should consider adding this functionality combined with the ability to select different implementations for the same data type at the call site. It seems to be very powerful and will work with the unions concept I created to solve the Expression Problem.

keean commented 7 years ago

@shelby3 as I pointed out elsewhere a total order typeclass is underspecified because it does not specify the relation operator that the order is based on.

shelby3 commented 7 years ago

@keean wrote:

and you import another module

Locality is in control of what it imports.

Shouldn't the compiler warn that there is a less specific option conflicting? If not, how will you assure coherence anyway?

shelby3 commented 7 years ago

@keean wrote:

as I pointed out elsewhere a total order typeclass is underspecified because it does not specify the relation operator that the order is based on.

The relation operator is set in the implementation of the typeclass. That is the entire point.

keean commented 7 years ago

The relation operator is set in the implementation of the typeclass. That is the entire point.

Actually a total-order is not a typeclass at all, it is a precondition (or postcondition).

To test whether a collection is a total order requires knowing what operation the order is based on, and then actually testing the collection with it.

skaller commented 7 years ago

@keean : I do know what type classes do since I use them in Felix :) However they do not introduce constraints on the types at all. This is a wrong idea. What they actually do is introduce signatures into the function scope so the function can be bound. There is a constraint: you have to implement the required instance methods somewhere, but this is a constraint on functions not types.

Let us please separate two things: the operation of type classes, and the categorical view of the world in which types are meaningless points. The latter issue is very high level, and there is no doubt whatsoever it is correct and all models based on anything else are either incorrect or deficient. Categories are the only known modelling tool that can handle abstraction correctly. There ain't no other way. OO certainly fails.

My comments on type class weakness are unrelated to that issue: I merely point out that you can really only have one instance for a given type, whereas with Ocaml modules this is not so.

Type classes provide no functionality. They merely allow the programmer to use a systematic notational convention which is hard to provide with ordinary C++ style overloading. It's important, but it's just lookup.

keean commented 7 years ago

There is a constraint: you have to implement the required instance methods somewhere, but this is a constraint on functions not types.

Type classes are definitely a constraint on types.

shelby3 commented 7 years ago

@keean regardless, you are entirely going away from the salient point, which is to bind some implementations early and some delayed, by not discarding the data type when passing in a type class dictionary for that type. In other words, the typeclass dictionaries should be first-class, like a functor in OCaml. That is the point. This abstraction does introduce a performance penalty though.

This is different than a typeclass object, in that we haven't discarded the data type statically.
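For reference, a minimal sketch of what "first-class dictionaries, like a functor in OCaml" can look like using OCaml's first-class modules: the ordering module is passed as an ordinary value, so the binding is delayed to the call site while the element type is still tracked statically. The names (TOTAL_ORDER, up, down) are illustrative.

module type TOTAL_ORDER = sig
  type t
  val less : t -> t -> bool
end

(* the dictionary is an ordinary argument; the element type stays visible *)
let sort (type a) (module O : TOTAL_ORDER with type t = a) (xs : a list) =
  List.sort (fun x y -> if O.less x y then -1 else if O.less y x then 1 else 0) xs

let up   = (module struct type t = int let less = ( < ) end : TOTAL_ORDER with type t = int)
let down = (module struct type t = int let less = ( > ) end : TOTAL_ORDER with type t = int)

let ascending  = sort up   [3; 1; 2]   (* [1; 2; 3] *)
let descending = sort down [3; 1; 2]   (* [3; 2; 1] *)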

Type classes are definitely a constraint on types.

Nope. They are requesting a dictionary for a type. It so happens we conflated the passing of the data type and the passing of the dictionary, so it acts like a constraint in that respect.

f(x: A) where A: Int+TotalOrder =>

keean commented 7 years ago

@skaller type classes are equivalent to C++ concepts which are constraints on types.

Consider:

f(x : A) : A requires Show<A>

The function f is constrained such that it only type checks if the type A has an implementation of Show; this can be checked statically at compile time, and therefore the program is proved correct, because any value can be passed at runtime and the type checker has already proved that the type of those values implements Show.

keean commented 7 years ago

@shelby3 wrote:

In other words, the typeclass dictionaries should be first-class, like a functor in OCaml. That is the point. This abstraction does introduce a performance penalty though.

I disagree for most cases because it introduces lots of boilerplate. You are better off using first class functions to pass the relation operator into sort.

Records already provide this functionality, a record can have a value dependent implementation of a function, and you can pass in the record first class to provide the operators you want.
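Concretely, in OCaml terms (a sketch; the 'a ord record type and names are illustrative), the record field plays the role of the type-class method and is passed explicitly:

type 'a ord = { less : 'a -> 'a -> bool }

let int_up   = { less = ( < ) }
let int_down = { less = ( > ) }

(* the dictionary is just a value argument *)
let sort ord xs =
  List.sort (fun a b -> if ord.less a b then -1 else if ord.less b a then 1 else 0) xs

let ascending  = sort int_up   [3; 1; 2]
let descending = sort int_down [3; 1; 2]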

shelby3 commented 7 years ago

@keean wrote:

You are better off using first class functions to pass the relation operator into sort.

Nope, because that can't interoperate with typeclass bounds downstream. Must make it first-class as I have pointed out. Please take some time to think before replying.

keean commented 7 years ago

Nope. because that can't interopt with typeclass bounds downstream. Must make it first-class as I have pointed out. Please take some time to think before replying.

I don't understand your point. Please expand.

skaller commented 7 years ago

We could argue terminology forever. But I will tell you point blank, in a function with a given type class required it is not a constraint. It has no impact whatsoever on the type. It does in fact add functions to the scope in which the function is bound. That's precisely what a type class must do. It extends the lookup environment.

You can consider this a constraint only in the sense that a scope containing some functions f1, f2, etc, is extended to add extra functions such as g1, g2, etc. Technically, that is a constraint, but not on the types. Defining a function f: A -> B does not impose a constraint on A or B in my books. In fact you cannot add constraints to types in a categorical world view, it is utter nonsense. Types have no properties, they either exist or they don't, two types are the same or not, but technically that is the whole story.

Indeed type annotations do not specify ANYTHING about types at all. Because you cannot say anything about types. They don't have any properties. They do not have relations. They are just glue that joins functions or not (composition). You can ONLY talk about the relations between functions, never types.

Thus, a product type is a misnomer. There are no product types. A product type is actually a pair of functions called projections. For sum types, injections. In general, the term data type is rubbish. There is no possibility of any such thing. What we actually mean is a collection of methods. And the flaw in OO: the methods are not owned by any type. They're just functions A -> B and that means always there are two types.

keean commented 7 years ago

@skaller wrote:

We could argue terminology forever. But I will tell you point blank, in a function with a given type class required it is not a constraint. It has no impact whatsoever on the type. It does in fact add functions to the scope in which the function is bound. That's precisely what a type class must do. It extends the lookup environment.

I treat them exactly as constraints on types, and use a pure prolog logic engine to solve them. Types translate into logic as atoms, and type classes as rules and clauses, so the function above gets translated as:

show(A)

The logic engine resolves the type classes, using backtracking, as some type class matches can happen fairly deep, for example the rules for adding Peano numbers in the type system using type classes (in Haskell)

{-# LANGUAGE EmptyDataDecls, MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, UndecidableInstances #-}
data Z
data S x

class Add a b c | a b -> c
instance Add a Z a
instance (Add a b c) => Add a (S b) (S c)

This directly translates into the following Prolog program:

% data Z and data S x become the terms z and s(X)
% add/3 : (in, in, out)
add(A, z, A).
add(A, s(B), s(C)) :- add(A, B, C).

keean commented 7 years ago

@shelby3 here is the type signature for sort_n:

sort_n(f : I, n : DistanceType<I>, r : R)
    requires Mutable<I>, ForwardIterator<I>, Relation<R>,
    ValueType<I> == Domain<R>

So you pass a mutable iterator into the collection to be sorted, the length of the part of the collection to be sorted, and the relation the sort is defined on. Notice we constrain the function passed to be a relation (that is a binary function returning a bool with the domain being the type of the objects stored in the collection).

This is exactly how I think sort should be declared. Algorithms should have "one best most generic" way of implementing them. I imagine in time there will be peer-review of improvements to algorithms like sort, with a repository you would access them from.

keean commented 7 years ago

@shelby3

Continuing with the 'records' thing, we can create a record with a function in:

Note I am trying to use ellipsis and indenting instead of '{}' for records as it seems more in keeping with the rest of the syntax... but maybe this won't work?

data Ord<A> = Ord ...
    relation(A, A) : Bool

Now we can define a value:

let gt_int = Ord ...
    relation(x : Int, y : Int) : Bool => x > y

You now can pass this as a first class dictionary to a function:

sort(collection, 10, gt_int)

Inside sort you can refer to the relation:

sort(c, n, ord) =>
   ...
   if ord.relation(x, y) ...

The equivalent to a functor on a module is just an ordinary function on a record.

skaller commented 7 years ago

Right. And Prolog is just unification. But here is the reality, Felix notation sorry:

//1
fun f[T] (x:T, y:T) => add (x,y); 

Fails because add is not in scope.

//2
fun add[T] (x:T, y:T) => ...
fun f[T] (x:T, y:T) => add (x,y);

Works because now add is in scope.

//3
class Addable[T] { virtual fun add: T * T -> T; }
fun f[T with Addable[T]] (x:T,y:T) => add (x,y);

Binds correctly because it is exactly the same as the previous example, except that the function add is added to the scope of the function f, not the containing scope.

//4
class Addable[T] { virtual fun add: T * T -> T; }
open Addable[T];
fun f[T] (x:T, y:T) => add (x,y);

Binds correctly because this is exactly the same as example 2: here the open directive injects the name add into the scope containing f, exactly as if it had been written there.

This localises the scope better:

//5
class Addable[T] { virtual fun add: T * T -> T; }
fun f[T] (x:T,y:T) = { open Addable[T]; return add (x,y); }

So far there are clearly no constraints on any types. All we have done is add functions to the scope.

//6
fun f[T] (x:T, y:T) (add: T * T -> T) => add (x,y);

Works fine, this is just another way to add the function add to the scope of f. It's more general than using a type class because the particular function used can have any name at all, and can vary with each call to f. Irrespective of that, the function f works because add is in scope.

So far, there are obviously no constraints on any types. Now:

//7
class Addable[T] { virtual fun add: T * T -> T; }
fun f[T with Addable[T]] (x:T,y:T) => add (x,y);
var x = f (1,2); 

This binds correctly. It uses the unification engine (actually specialisation), which leads to the MGU which sets T->int. This imposes a constraint alright: but not on the binding. The constraint it imposes is that you have to define Addable[int]::add. So the program above fails, while this one works:

//8
class Addable[T] { virtual fun add: T * T -> T; }
fun f[T with Addable[T]] (x:T,y:T) => add (x,y);
var x = f (1,2); 
instance Addable[int] { fun add (x:int, y:int) => x + y; }

Yes, there is a constraint. But it is NOT on any type. The constraint is that you have to provide an instance function add: int * int -> int. The constraint is checked during monomorphisation, which uses specialisation (unification, which is the same as Prolog). Again, it is not a constraint on types. It is a constraint on existence of functions. It is a constraint on instances of the type class, not any type. And the constraint is NOT imposed by the type class. It is NOT imposed by using the type class in the function f either. It is imposed by applying f to a pair of values of type int * int.

If you do not apply f to such a value, you do not have to provide the requisite instance. Therefore the constraint is not imposed by the type class, and it is not imposed by the definition of f either, it is imposed by using f on a particular type, and the constraint is not on that type or any other, the constraint is on YOU to provide the instance definition. YOU generated the requirement by using f on a pair of ints.

Note also that the idea there is a constraint "on a type" is clearly nonsense since a type class can involve two types. You would have to say it is a constraint on a pair of types. In fact, as noted, it is not a constraint on types at all. It is an existential constraint that requires a function.

You see, it is nonsense to say we have imposed a constraint on type int. There is no such constraint, the constraint is that you must define add: int * int -> int. If you call that a constraint on int then of course I would have to agree with you. Technically the existence of such a function does provide structure (hence a constraint), but it is not the type int which has the structure but the category of functions to which we added the function add.

For example, we can say a category "has products", and that is indeed a constraint, because not all categories have products. We can say a category "has coexponentials" and that is a constraint because Set does not have coexponentials (for example). But the constraint is on the category. It is not a constraint on a type, which is just an object in a category. It is the WHOLE category we must deal with, and its structure is entirely specified in terms of relations and properties of the arrows (functions). Its nonsense to speak of properties of types. They cannot have properties.

You have to change your world view. Only functions matter. Type checking is just a way to say checking functions can be composed. If you have f: A -> B and g: C -> D then the composition f . g is well defined if, and only if, B = C. That is a constraint on types, and it is the ONLY constraint allowed on types (equality). And in pure categories, there are no types ONLY functions and a set of composition rules. So types, literally, are utter nonsense. They're convenient labels for describing which functions can be composed and which can't, and nothing more.
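As a tiny OCaml rendering of that composition rule (a sketch; `compose` is a made-up name), the only thing that must line up is the middle type:

(* compose g f is defined exactly when f : 'a -> 'b and g : 'b -> 'c share 'b *)
let compose (g : 'b -> 'c) (f : 'a -> 'b) : 'a -> 'c = fun x -> g (f x)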

keean commented 7 years ago

constraint is checked during monomorphisation

And if the type system uses parametric polymorphism (instead of universal quantification) then all types monomorphise, hence it is enforced by the type system.

keean commented 7 years ago

@skaller

The constraint is not on the type Int, the constraint is on the type-variable and is part of the type-signature. Constraints only make sense on polymorphic functions and limit the types the polymorphic functions will accept as type parameters.

Actually if you view the type of a program as a theorem, then the program is the proof. You can have constraints in the theorem, and those constraints are part of the theorem, hence are constraints on type-variables. They are not part of the proof.

The program is the proof, so if an instance for Show exists for A it proves the theorem f(x : A) requires Show<A>.

So the constraint is on the type-variable, it is part of the theorem.

skaller commented 7 years ago

@shelby3 : you have an example with ForwardIterator<I>. This is C++ nonsense. The correct formulation must be ForwardIterator<I,V> where V is the value type. The type class requires both types or the dereference method cannot be specified.

keean commented 7 years ago

@skaller wrote:

you have an example with ForwardIterator. This is C++ nonsense. The correct formulation must be ForwardIterator<I,V> where V is the value type. The type class requires both types or the dereference method cannot be specified.

The value type is an associated type (effectively a type-family tied to the type-class). It is an output type, whereas the type parameters are input types. In other words the type parameter of the iterator type-class uniquely determines the value type.
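One way to picture that "output type" in ML terms (a sketch only; FORWARD_ITERATOR and ListIter are invented names) is as a type member of the iterator's module, determined by the module rather than supplied as a second signature parameter:

module type FORWARD_ITERATOR = sig
  type t              (* the iterator: an input type *)
  type value          (* the associated value type: an output, fixed by t *)
  val get  : t -> value          (* partial: assumes a non-empty position *)
  val next : t -> t option
end

module ListIter : FORWARD_ITERATOR with type t = int list and type value = int = struct
  type t = int list
  type value = int
  let get = List.hd
  let next = function [] | [ _ ] -> None | _ :: tl -> Some tl
end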

shelby3 commented 7 years ago

Let me re-explain what I was thinking last night. Please let's respond with code examples, so that we can see what the heck each other talking about.

So in one module we have:

module First {
   typeclass LessThan<A>
      less(A, A): Bool
   typeclass Sort<T<A>> extends LessThan<A>
      sort(T<A>): T<A>
   List<A> implements Sort<List<A>>
      sort(list) => ...

   g(x: T<A>): T<A> where Sort<List<A>> => x.sort()  // don't write `where Sort<List<A>> + LessThan<A>`
}

And in other module:

import First._

Int implements LessThan<Int>
   less(x, y) => x <= y      // forward ordering

f(x: T<A>) where A: Int, LessThan<A> => g(x)  // Note change of syntax so that `+` is only needed to join multiple bounds within the comma-delimited `where` list

And in another module:

import First._

Int implements LessThan<Int>
   less(x, y) => x >= y      // reversed ordering

f(x: T<A>) where A: Int, LessThan<A> => g(x)

Note how the selection of the implementation of Sort is separate from the LessThan. I think this is what @skaller wants in terms of orthogonality of composition.

Essentially afaics what the above is doing is combining a dynamic dispatch typeclass object for the LessThan with a delayed monomorphized binding on the Sort.


Instead of:

module First {
   typeclass LessThan<A>
      less(A, A): Bool
   typeclass Sort<T<A>> extends LessThan<A>
      sort(T<A>): T<A>

   g(x: T<A>): T<A> where Sort<List<A>> => x.sort()
}
import First._

List<A> implements Sort<List<A>>
   sort(list) => ...
Int implements LessThan<Int>
   less(x, y) => x <= y      // forward ordering

f(x: T<A>) where Sort<T<A>> => g(x)

Or instead of:

module First {
   typeclass Sort
      sort(): This

   g(x: T): T where T: Sort => x.sort()
}
import First._

List<A> implements Sort
   sort() => ...

f(x: T) where T: Sort => g(x)

Obviously all of those forms except the last, do not allow my preferred OOP style.

shelby3 commented 7 years ago

Let me catch up on your comments which followed while I was sleeping, and relate my vague thoughts from last night (dumped above) to what you all have written since.

@skaller wrote:

You can consider this a constraint only in the sense that a scope containing some functions f1, f2, etc, is extended to add extra functions such as g1, g2, etc. Technically, that is a constraint, but not on the types. Defining a function f: A -> B does not impose a constraint on A or B in my books. In fact you cannot add constraints to types in a categorical world view, it is utter nonsense. Types have no properties, they either exist or they don't, two types are the same or not, but technically that is the whole story.

I think I probably agree with this from one perspective. The typeclass request isn't constraining what the data type is. The data type doesn't change based on the typeclass request. Rather it is constraining the set of dictionaries that must be in scope.

Edit: I see you explained the same point.

Indeed type annotations do not specify ANYTHING about types at all. Because you cannot say anything about types. They don't have any properties. They do not have relations. They are just glue that joins functions or not (composition). You can ONLY talk about the relations between functions, never types.

How about combining nominal and structural typing? The nominal type now has relevant members for structural typing. Is that just a projection also mapping the nominal to the structural projections?

Thus, a product type is a misnomer. There are no product types. A product type is actually a pair of functions called projections. For sum types, injections. In general, the term data type is rubbish. There is no possibility of any such thing. What we actually mean is a collection of methods. And the flaw in OO: the methods are not owned by any type. They're just functions A -> B and that means always there are two types.

Okay whether we refer to them as morphisms or types, does it matter to our type checking and inference engine?

My category theory math is basically nil, so I won't be of much use here.