Top and Bottom - Githubissues

felix-lang / felix

The Felix Programming Language

Other

806 stars 44 forks source link

Top and Bottom #113

Open skaller opened 6 years ago

skaller commented 6 years ago

With the introduction of subtyping, natural representations of Top and Bottom exist: type any, given by X as X, is Top, whilst type void is Bottom.

Unfortunately, since Bottom is a subtype of everything, this would allow a procedure call to be used in an expression, in fact, worse, as the argument in the position of a parameter of any type at all, if subtyping were extended to these values. It seems more correct that procedures have type Top, which would prevent calls to them being passed to a parameter of any type other than Top.

Since we do not have any information about a variable of type Top, that variable cannot be used other than to copy it, allowing the variable to be elided. A call of a function returning Bottom, on the other hand, can be passed to any parameter, and this can only be safe if the function never returns.

Unfortunately, these interpretations belong in a functional type system. Felix is not just a functional language. In Felix, functions are not allowed side-effects, and we used a trick to prevent procedure calls being passed to functions: they return void.

On the other hand non-returning routines like exit() have been types as any. It seems backwards, they should be typed void, and returning procedures typed any. However this still allows functions to take applications of non-returning functions and applications of procedures as arguments, that is, to use them in expressions.

Generators already allow this anyhow.

The problem is that expression notation just doesn't generalise. We expect referential transparency, totality, and purity. Both totality and purity are not required at present. Non-side effecting doesn't apply with generators either, although the semantics are now limited (in my mind anyhow) to code that only modifies internal state (that is, which can only be observed in the result of an application).

In my mind, however, void (0) is natural for something which returns, but returns no value, whereas something which is an infinite loop looks very non-returning to me.

The upshot, in some ways, is that the functional model is utterly corrupt. Even subroutines are cheats. Continuation passing seems like a better basis, whether or not convenient notation exists.

Already, uniq types fight with the functional notation: what is naturally a copy is hacked by the type system to be a move. These concepts aren't functional, move implies locations. Now throw in fibres and coroutines.

sighoya commented 6 years ago

@skaller

Why you want void to be bottom?

void is a singleton type containing only one instance. If you want, this instance could be bottom. A function returning bottom never returns as this would instantiate the bottom type which is illegal as it has no instances.

I wouldn't differentiate between procedures and functions, functions returning void are valid from set theoretic point of view. For procedures biasing the return value for the same args could be transformed into functions passing the return value as argument which is deception, but it is sufficient.

You could also provide side effecting sub functions as argument of your enclosing function such that you can overwrite effects but this messes up with up propagation for all nesting effects.

The gain of overwriting effects is questionable if you don't know the ingredients of a function and commercial developers wouldn't present that at all.

The gain of a full effect system would be to determine if two functions are really equal in their computation and their effects, but anyway you can't determine equality of two general functions.

In my eyes, I would gather effects in metadata such that the IDE can display what happens but I wouldn't penetrate users to encode them.

skaller commented 6 years ago

In Felix unit is a type with one value, void is an uninhabited type with no values. It may be different in C/C++ depending on how you look at it. Your second paragraph is contradictory: first you say void has one instance, and then say instantiating it is illegal because it has no instances.

Felix differentiates between functions and procedures for several reasons. For C/C++ compatibility functions put the return address on the machine stack. Procedures use a heap allocated linked list called a spahetti stack. So the representations are utterly different.

In addition, semantically, procedures have to have side effects, or they're useless because they don't have a return value to encapsulate a result, whereas functions do. In Felix, functions are not allowed to have side-effects. They don't have to be pure though. Also there are things called generators which may have internal side effects, that is, they can modify their own state so that they return a different value if called again. Iterators are generators, so is the rand() function.

For better or worse, generators have the same type as functions (at the moment). That's an issue. Technically Felix has a single type:

D -> [E] C

for functions, generators, and procedures, where D is the domain, C is the codomain, and E is for effects. However the E is not used anywhere (at the moment). Felix actually uses the notation:

D -> 0

for procedures. From a category theoretic view, function from void are also fine, you just can't apply them. Originally the idea was that "functions" returning void cannot be applied, because they don't return a value, but they can actually be called with a distinct syntax, because they do return control. The idea was to have referential transparency or something like it by isolating effects to distinct syntax: expressions had to be purely functional, no effects, statements can have effects.

But generators break that rule now. The reason is that the syntax otherwise requires

var x: int; // uninitialised variable!
call p &x 42;

instead of just

var x  = p 42;

and i haven't found a better syntax!

BTW: I don't believe any such thing as a full effect system can exist because "effects" is a grab bag of completely distinct things, in fact everything in computing other than pure functions. If you're only dealing with say mutations to store, a full effect system may be possible which can hopefully erase effects in a sufficiently large "region" (i.e. allow effects in "functions" provided there is some larger function from which they don't escape).

In effect, generators roughly fit your description: the type system doesn't know about effects they have. The rules (not enforced) say only internal effects, but it remains a fact that using generators destroys transparency, but the type being the same as a function also prevents the type system enforcing transparency of higher order functions (you can pass a generator where a parameter has function type, into say, a fold or map).

sighoya commented 6 years ago

Your second paragraph is contradictory: first you say void has one instance, and then say instantiating it is illegal because it has no instances.

No, instantiating bottom. But when you treat void as bottom, ok, but the I don't understand this:

Unfortunately, since Bottom is a subtype of everything, this would allow a procedure call to be used in an expression

Can you elaborate more to this?

It seems more correct that procedures have type Top, which would prevent calls to them being passed to a parameter of any type other than Top.

Such decisions have unforeseeable consequences for later subtyping relationships.

I could imagine the following scenario:

fun(f:Int->Int,a:Int)=f(a)
f:(never:Bottom)->Bottom= throw Exception("")
fun(f,2) //illegal because of contra variance
g:(a:Int)->Bottom=throw Exception("");
//or
h:(a:Int)->Bottom=while true;
fun(g,2) //legal
fun(h,3)//legal

So whats the problem?

If you're only dealing with say mutations to store, a full effect system may be possible which can hopefully erase effects in a sufficiently large "region" (i.e. allow effects in "functions" provided there is some larger function from which they don't escape).

It depends what you mean with effect system. Does it include to overwrite effects or only to show effects in signature, the later is much more implementable.

Again, I don't see the gain in overwriting effects.

In Felix, functions are not allowed to have side-effects. They don't have to be pure though.

Means they have to be total as mathematical functions requires it, but they can mutate locally or globally. So they are allowed to produce effects but not to consume them. I think that is what I want, too. And optional declaring purity seems reasonable to me.

skaller commented 6 years ago

"Unfortunately, since Bottom is a subtype of everything, this would allow a procedure call to be used in an expression

Can you elaborate more to this?"

Sure. If a function accepts a value of type T, then it also accepts a value of any type U which is a subtype of T. Since void is a subtype of all types, and in Felix a procedure has type X -> void, a procedure application to a value of type X can be used as an argument expression for any function.

But the whole idea is that expressions shouldn't have side effects. For example:

proc p (x:int) { println$ "Hello"; } // type int -> 0
fun f (x:int) => x * x;
println$ f (p 42); // type checks!!

Quite apart from the expression f (p 42) having a side effect because p 42 does, the code will probably crash. The problem is that I stole the function notation for procedures and just made them return void (0). Felix does NOT have this problem at the moment because void isn't recognised as bottom (in fact there is no bottom).

The thing is, returning no value is NOT the same as not returning control at all. Procedures return no value but they do return control. An non-return routine doesn't return control (and of course therefore no value).

Felix uses type any for a non-returning routine which is defined as:

typedef any = any;

In fact this type is TOP! In Ocaml, they use a type variable (because it unifies with anything). So does any since all types are subtypes of top (every type satisfies the equation recursive equation above in vaccuuo!)

The real problem is that theories developed for functional behaviour simply are not adequate for mixed imperative/functional programming. Procedures aren't functions.

sighoya commented 6 years ago

Since void is a subtype of all types, and in Felix a procedure has type X -> void, a procedure application to a value of type X can be used as an argument expression for any function.

This is the point where you need exceptions, if you don't like them you can abort the whole program.

and in Felix a procedure has type X -> void

This is the point I don't understand about Felix. Procedures do return, but they return of type Unit which makes them total and therefore capable to use them in expressions.

skaller commented 6 years ago

Procedures deliberately do not return unit because that is a design fault.

The problem is grafting imperative ideas such as effects onto a functional model. In Felix, functions returning unit are universally eliminated. Its guaranteed. Functions aren't allowed side effects, so the application of a function to an argument returning unit is replaced by a unit value:

fun f (x:int) => ();
fun g() => 99;
var x = f 42;
println$ g (f 42); // 99

The code is first optimised to this:

fun f (x:int) => ();
fun g() => 99;
var x = (); 
println$ g x; // 99

Now, f isn't used, so we can eliminate it:

fun g() => 99;
var x = (); 
println$ g x; // 99

Now, since x has type unit, we can replace it in an expression with just ():

fun g() => 99;
var x = (); 
println$ g (); // 99

Now x isn't used so we can eliminate it:

fun g() => 99;
println$ g (); // 99

We can inline g:

fun g() => 99;
println$ 99; // 99

and now g isn't used so we can eliminate it:

println$ 99; // 99

Of these transformations:

elimination of unused variables is guaranteed, including their initialisers
replacement of applications of type unit with unit value is guaranteed
the system is supposed to eliminate unused functions
inlining of direct application of functions marked inline (unless recursive) is guaranteed

Note that the initialiser of a variable can be or contain a generator and these too are guaranteed to be eliminated. This particular assurance is a critical design choice:

var x = thread_pool();

If you don't use the thread pool x, x is eliminated and so is the call to the thread pool constructor. Similarly:

var clock = alarm_clock();

If you don't use the clock, it is eliminated. This is important because alarm clocks use the asynchronous I/O subsystem which starts a monitoring pre-emptive thread that polls kqueue, epoll, etc for events. Async handling in Felix is loaded on demand. Its vital not to uselessly create that demand.

The point is expressions in Felix are supposed to be purely functional and referentially transparent. If you want imperative code that's fine, Felix has the best control flow management tools of any programming language in existence, but to use it you have to use the control flow framework which is represented syntactically by statements.

In reality there are some issues. First, functions not marked pure may depend on variables. Although the function cannot change the variable, so the value is immutable during evaluation of expressions, referential transparency is lost, because the function isn't pure: it depends on a variable which can take different value at a different time.

Secondly generators look like functions, they're not pure because they depend on an internal variables. In fact, you can write one that depends on an observable variable too.

So the idea that procedures return unit is a disaster. It is in Ocaml too. You can write this rubbish:

let f x = print_endline (string_of_int x);;
let g () = print_endline "Hello";;
g (f 42);;

This is plain garbage. You're calling a function which takes an application as an argument, which has a side effect. The right way to get the same result would be:

f 42;;
g ();;

because the sequencing is now explicit, not a side effect of the fact Ocaml uses eager evaluation. To stop this, the "result" of a procedure call must be such that no function can accept that result as an argument. Unit, clearly will NOT do.

What we actually want is TOP. If a procedure returns TOP, then an application of the procedure cannot be passed to any function, unless the function parameter has type TOP. Unfortunately just banning functions with parameter type TOP won't work because of polymorphism. A better method is to ensure there's no way to write "TOP".

Unfortunately, this is a specification of TOP:

typedef any = any;

Its a perfectly legal recursive type. The implementation is a pointer that points to itself. This is correct: if you try to do a recursion on it, your code goes into an infinite loop, which is exactly what we want, because it prevents returning a value. Values of type TOP are useless because, since we don't know anything about the type, we can't do anything with them.

What SHOULD happen in Felix is that, if a function has a parameter of type TOP, then since it cannot be used, it isn't used, and then, the parameter is eliminated. Similarly since the result of a function returning TOP is useless, the application is eliminated, along with any side-effects. This means, the only way to actually use a procedure is to call it with a special statement.

The problem is I chose BOTTOM (void, 0, empty set) instead of TOP, which is totally wrong because a function application of type BOTTOM can be used for ANY argument of a function because BOTTOM is a subtype of EVERY type.

Felix actually uses type "any" or TOP for functions that do not return at all (eg exit(), throw, etc). The idea there is that if you really want to throw an exception or otherwise exit a functional concept, you need a type that unifies with anything. Again, I got it backwards. I should have used BOTTOM (void) for that because it is a subtype of anything.

skaller commented 6 years ago

Just to be clear: if you make procedures return unit, then just as you say, they can be used in expressions. This is precisely what I want to disallow.

Secondly: Felix has no support for exceptions. This is deliberate. The C++ kind of dynamic, type based exception handling is a disaster.

Ocaml is slightly better. First, you cannot throw anything, only a single exception type which is an open variant for which you have to declare the constructors. This is still bad, but not quite so bad, because at least you can "grep" source code for the declarations.

Secondly, in recent Ocamls, you can do local exception declarations. The compiler can actually tell if a type escapes it context and that is disallowed. Therefore throwing an local exception is guaranteed statically to not escape the scope of the declaration, and hence, if you do throw such an exception, you are forced to actually catch it.

This is much better because it reduces the scope of the problems of exception handling, i.e. it localises it, but the idea is still completely and utterly wrong. This is NOT a well principled way to escape the current context.

If you want to throw an exception, generally it is NOT an error. It is a condition you have detected which you might consider abnormal in some sense, but the program code should just consider the handling of it the same way the normal result is handled. The machinery must be symmetric.

One way to do this is of course to return a variant. Often we use option types for that, but you can use a more precise type:

union open_result = 
 |  Filehandle of int
 | Failed of errorcode
;

This is how C handles it, although it doesn't have good support for variants. The problem with this method is you have to short cut evaluations and pass the result up what could be a deeply nested computation. Exceptions allow you to escape immediately so that the usual flow just handles the normal result, but as I said this model is plain wrong.

The right way to handle this situation is to recognise that a subroutine doesn't return a result, it passes its result to a continuation. Once we have this idea the answer is obvious: if you have two possible kinds of result to handle, you have to pass the routine two continuations.

In Felix, there is a high level way to do this using coroutines. You just provide the coroutine with two output channels, one for the normal result, and one for the abnormal result. Its only in your head that one is normal and the other not: the code just treats these channels symmetrically. Suspending on channels is done by placing the current continuation of the routine doing the I/O operation onto the channel, so in effect, channel I/O provides a high level way to do continuation passing.

However this is a very high level model. Several lower level models are also provided. One is to just pass a continuation for the abnormal case explicitly to a procedure. The normal continuation is also passed but that is done implicitly. The procedure can then just call the error continuation to exit if necessary. This method does require explicitly passing the error handling procedure down the call chain, but this is much MUCH better than exception handling because the type system enforces the passing down of the error handler.

A variant of this allows you to throw a continuation, then the scheduler itself invokes it after discarding the current context. Its not exception handling because there is only ONE type that you can throw, and the catch is implicitly right at the top of the stack, always. You cannot forget to catch the continuation, because, quite simply, you cannot do so anyhow!

None of these mechanisms are perfect, particularly as in Felix there is dual model at work: functions (using the machine stack) and procedures (using a spaghetti stack) and corresponding distinct and separated semantics.

But this is the design model of Felix. We definitely want purely functional code, but we also want imperative code with explicit control flow, mutation, etc. The problem is how to integrate the two models and the way forward is to get stuff that works but is a bit crude, and then try to formulate a symmetric algebra that integrates things better, and then implement that.

The real problem here is that set theory, type theory, and category theory to some extent support functional concepts well but there is very little known about imperative models and even less about how to integrate the two. My own belief is that the two MUST be handled SYMMETRICALLY.

In more restricted domains this has been done, but few recognise the importance of this work. For example read "Symmetric Lambda Calculus" in which we have values, functions AND continuations. The result is symmetric handling of both inductive and coinductive types.

In my view, there functional model of coinductive data types is plain wrong. The point is that coinductive types are NOT data types at all, they do not describe data, they describe EVENTS. A stream isn't an infinite list, it is an event sequence, and the ONLY way to handle a stream require a coroutine. You CANNOT do it with functions. The point is we have a program space with two dimensions: space (address space of memory) and time (clocks sequencing events). Any computation model that fails to use this dual coordinate system is wrong.

sighoya commented 6 years ago

At first, sorry for the long break.

Now to the content you provide:

So the idea that procedures return unit is a disaster. It is in Ocaml too. You can write this rubbish:
let f x = print_endline (string_of_int x);;
let g () = print_endline "Hello";;
g (f 42);;
This is plain garbage. You're calling a function which takes an application as an argument, which has a side effect.

Why? I don't know ocaml enough, but if these functions return unit or something else, then they are expressions. This is an example of causing effects. Either you would model this with an implicit effect parameter passed to print_endline or you ignore it such that the mathematical view of functions with causing effects ignores effects at all.

Tell me why you want to list effects at all, what does you gain from this, it really interests me. Effects tell that some state outside the scope will be changed or consumed, but this is unnecessary as mutable states are well defined mathematical entities. They are variables which get rebounded over time, each access of them occurs at a different time. So the question is if mutability exists at all.

What SHOULD happen in Felix is that, if a function has a parameter of type TOP, then since it cannot be used, it isn't used, and then, the parameter is eliminated. Similarly since the result of a function returning TOP is useless, the application is eliminated, along with any side-effects. This means, the only way to actually use a procedure is to call it with a special statement.

But this violates the purpose of Top as it is the supertype of every type. A parameter of type TOP should accept everything.

The problem is I chose BOTTOM (void, 0, empty set) instead of TOP, which is totally wrong because a function application of type BOTTOM can be used for ANY argument of a function because BOTTOM is a subtype of EVERY type.

Wrong, function arguments are contravariant and because bottom is the subtype of everything, no function accepts bottom typed arguments except a function taking bottom as argument which cannot be called however.

Just to be clear: if you make procedures return unit, then just as you say, they can be used in expressions.

Again, what is the problem with this?

This is much better because it reduces the scope of the problems of exception handling, i.e. it localises it, but the idea is still completely and utterly wrong. This is NOT a well principled way to escape the current context.

Of course it is, never returning is a property which bottom-up propagates, i.e. if a function is in error state, then the enclosing function as well as the enclosing enclosing function is and so on.

If you want to throw an exception, generally it is NOT an error. It is a condition you have detected which you might consider abnormal in some sense, but the program code should just consider the handling of it the same way the normal result is handled. The machinery must be symmetric.

This is one point. The other point is that errors shouldn't be considered as types at all in PLs, so it is wrong to return an error. But I know what you mean. Just to say it, don't regard exceptions as a violation of it, instead exceptions make error handling more convenient as they implicitly handling errors just as you described it but in difference to manual error handling they jump to a position of some enclosing function where the error is handled once.

The problem with manual error handling is bottom up propagation. Each superfunction must check the result and have to transmit the error upwards, this sucks and has bad performance because the return type is no a disjoint sum or union type and have to be type cased.

In more restricted domains this has been done, but few recognise the importance of this work. For example read "Symmetric Lambda Calculus" in which we have values, functions AND continuations. The result is symmetric handling of both inductive and coinductive types.

Ah, If I have more time I will look onto it. Are continuations better than exceptions?

In my view, there functional model of coinductive data types is plain wrong. The point is that coinductive types are NOT data types at all, they do not describe data, they describe EVENTS. A stream isn't an infinite list, it is an event sequence, and the ONLY way to handle a stream require a coroutine. You CANNOT do it with functions.

You talk from the low level point. But they are infinite from the abstraction point.

The point is we have a program space with two dimensions: space (address space of memory) and time (clocks sequencing events). Any computation model that fails to use this dual coordinate system is wrong.

Could you explain? With space and time you could turn every procedure into a function. Generators are then functions which take time and address and return a result of it even if they used other variables internally. So I don't see anymore a reason to consider non function constructs.

skaller commented 6 years ago

The reason one does not want effects in expressions is referential transparency. A lot of math fails if you have effects. This one was pointed out in an online lecture:

By definition a pair of two types A and B is a type P with two functions pi1:P -> A and pi2:P-> B called projections, such that for ANY two functions f1: X -> A and f2: X -> B, there is a unique function h: X -> P such that f1 = h . pi1 and f2 = h .p2 (where . is reverse composition).

Now suppose f1 has a side effect. Then since p1 does NOT have a side effect, h must have a side effect. But then if f2 does NOT have a side effect, it cannot equal h . p2 because h does have a side effect.

In other words, core algebraic constructions like products require functions to be side effect free. Therefore, referential transparency is mandatory.,

In Ocaml, which allows functions to have side effects, tuples are broken for precisely this reason. IN fact, they're broken anyhow because of eager evaluation (if one of the functions throws then the product never gets formed). Haskell has real products because the computation is not performed unless required.

So in Felix, functions are not allowed to have side effects, so expressions cannot have effects. However unfortunately, functions don't have to be pure (they can depend on variables).

So if you want effects, you have to use a procedure. Procedure MUST have effects because they don't return a value (a procedure without effects can be erased because it doesn't do anything).

So to cleanly separate functional and procedural code, procedures are called with statements, and cannot be used in expressions. At least that was the basic idea. However procedures in Felix use the function type X -> 0 to distinguish them from functions. And in a function nothing stops you doing an effect. The compiler assumes you don't, and optimises based on that assumption.

In particular a function returning a unit value can always be erased, since there is only one unit value and we know what it is, so the function is useless.

skaller commented 6 years ago

Top: I agree, a parameter of type Top should accept everything. However you cannot DO anything with such a parameter except copy it, so in principle you can delete the parameter and any argument in that parameters position.

I think you misread what I said:

"a function application of type BOTTOM can be used for ANY argument of a function because BOTTOM is a subtype of EVERY type."

In other words, suppose f: int -> BOTTOM, and g: int -> string. Then this is well typed:

   g ( f 1 )

because the application of f to 1 has type BOTTOM, which is a subtype of int, so g can be called on it.

The code types correctly, but I don't want it to. I should have used TOP for the return type of a procedure not BOTTOM.

Of course there are no values of type BOTTOM. As you say a function of type BOTTOM -> int is perfectly good, it just can't be called. In fact it is the characteristic function of the type int! Because there is exactly one function from BOTTOM to int. You can't call it, but you can pass it around, it might be useful to fix a type in a polymorphic function, dunno.

Any the problem is: what can I do??? Felix uses the wrong type, and not just in the compiler, in the library and all the documentation. At present the unification algorithm doesn't recognise void (0) as a subtype of anything (but it should!).

skaller commented 6 years ago

On error handling: passing back disjoint unions works but as you say is clumsy.

The problem is that a function, when returning, is actually CALLING the continuation of the master. If you want to do something else on some abnormal condition, you have to pass a second continuation which is called for that case.

So the problem is that the subroutine model of functions (and procedures) is broken. Throwing an exception is a HACK because catching it is not assured. That's in theory, in practice it is FAR FAR WORSE.

What you can do in Felix procedures is:

proc stuff (x:int) (e:string ->0) { ...
   if x == 0 do  
     e "zero error";
  else 
    println "Not zero";
    return;
  done
}

in other words pass in a routine to call if there's an error. Of course you do not need to do that if the error handler is in scope.

This is not symmetrical because the normal operation passes the current continuation implicitly and return statement calls it, but the error routine has to be called explicitly.

With coroutines, its symmetric:

  while true do 
    x =read input;
    if x =0 perform write (error, "x was zero");
   else perform write (normal, "x not zero);
 done

Coroutines don't return control AT ALL so there's no question of a normal return. Instead, we have two channels, one for the normal case and one for the error case. We connect those channels to other coroutines that handle the cases. Those handlers are suspended (on the channels) waiting for data, in other words, writing on a channel is calling the readers continuation.

The point is there are TWO continuations waiting here, at the other end of the channels.

sighoya commented 6 years ago

By definition a pair of two types A and B is a type P with two functions pi1:P -> A and pi2:P-> B called projections, such that for ANY two functions f1: X -> A and f2: X -> B, there is a unique function h: X -> P such that f1 = h . pi1 and f2 = h .p2 (where . is reverse composition).

Now suppose f1 has a side effect. Then since p1 does NOT have a side effect, h must have a side effect. But then if f2 does NOT have a side effect, it cannot equal h . p2 because h does have a side effect.

This is any way contradictory as we cannot always compare functions structurally and have to compare them extensional, i.e. each function becomes unique to all other functions and itself.

So to cleanly separate functional and procedural code, procedures are called with statements, and cannot be used in expressions. At least that was the basic idea. However procedures in Felix use the function type X -> 0 to distinguish them from functions. And in a function nothing stops you doing an effect. The compiler assumes you don't, and optimises based on that assumption.

This is was sucks me as it creates a bitheistic world. For that reason I prefer to cheat procedures as functions which get their return value as parameter. But this would require to always return values of non unit type.

I think you misread what I said:

"a function application of type BOTTOM can be used for ANY argument of a function because BOTTOM is a subtype of EVERY type."

Sorry, my fault.

Any the problem is: what can I do??? Felix uses the wrong type, and not just in the compiler, in the library and all the documentation. At present the unification algorithm doesn't recognise void (0) as a subtype of anything (but it should!).

Can't you change this as a language creator or is it not possible because of backward compatibility?

Coroutines don't return control AT ALL so there's no question of a normal return. Instead, we have two channels, one for the normal case and one for the error case. We connect those channels to other coroutines that handle the cases. Those handlers are suspended (on the channels) waiting for data, in other words, writing on a channel is calling the readers continuation.

The point is there are TWO continuations waiting here, at the other end of the channels.

Very interesting, so what you do here is quite similar to exceptions except that the use of coroutines are a more customizeable. So coroutines are a functional high level form of goto jumps.

skaller commented 6 years ago

"This is any way contradictory as we cannot always compare functions structurally and have to compare them extensional, i.e. each function becomes unique to all other functions and itself."

This isn't relevant. We're not doing a run time comparison of the functions, we're examining the mathematical definition of a product. In fact, the whole theory of abstraction, category theory, is an equational theory of function composition. Equational meaning, the only question one can ask about functions is whether they're equal.

In this case, we are actually examining the language mechanism to form a product. In Felix and many other languages including Ocaml, we use one or more comma operators:

1,"hello",42.1

to construct a tuple. This has the type

int * string * double

and in Felix at least the projections are:

proj 0 of (int * string * double)  // int * string * double -> int
proj 1 of (int * string * double) // int * string * double -> string
proj 2 of (int * string * double) // int * string * double -> double

The rule says that for any functions:

f0: X -> int
f1: X -> string
f2: X -> double

there is a unique function h such that

f0 (x) = proj 0 (h (x))
f1 (x) = proj 1 (h x))
f2 (x) = proj 2 (h x))

In other words, instead of applying f0 to x, you can apply h first, the apply the function h first, then apply projection 0 and get the same result.

In Felix h can be written like:

\langle f0,f1,f2 \rangle   // <f0,f1,f2>

In any case the rule crudely says "an integer in a tuple is the same integer as one outside a tuple".

The point is that to obey the rule is a constraint on the programming language. The designer (me in this case) has to make sure the implementation obeys the rule.

Felix tuples obey the rule provisionally: the function has to be pure, total, and side-effect free. Total just means that the actual domain of the function is described by the function type. Side effects aren't allowed. However functions don't have to be pure. Note the function cannot throw (Felix has no exceptions) and it can't fail unless its a programming error. Side-effect freedom is not enforced though.

skaller commented 6 years ago

"This is was sucks me as it creates a bitheistic world. For that reason I prefer to cheat procedures as functions which get their return value as parameter. But this would require to always return values of non unit type."

I agree completely that the bitheistic world sucks. I've been looking for a way to fix that.. for 30 years. I prefer not to cheat .. well not too much.

The problem with allowing effects in expressions/function is that the order and interaction of the effects is indeterminate, so it is a bad model for managing effects, whereas explicit control flow constructions are a good model. The indeterminism is ESSENTIAL because it is the only way the compiler can optimise the code to produce things that run fast.

Felix aims to run code at HYPERLIGHT SPEED. In other words FASTER THAN C. :-)

Anyhow, there is hope. There's good paper "Symmetric Lambda Calculus" which shows how to symmetrically integrate values, functions, and continuations.

The "bitheistic" world in Felix is not between functions and procedures, but functions and coroutines. And you can "map" between the models. Somehow the two models are dual, or perhaps adjoint. Coroutines are basically functions turned inside out. I call the control inversion.

skaller commented 6 years ago

Let me try to explain how continuation passing works.

Suppose we have a program:

set (x, 1);
print x;
print "bye";

I've written it with procedure calls. Now to translate that to continuation passing, we use functions which correspond to the procedure but take an extra argument, the continuation.

A continuation is defined as "the whole of the rest of the program". So here's the code again in Continuation Passing Style (CPS):

set (x,1) (
  print x (
    print "hello" (
      halt)
  )
)

Here's how to do a conditional: instead of

if c do t else f done;
print "done";

you write:

if c 
  (t (print "done" (halt))) 
  (f (print "done" (halt)))

Note this function takes TWO continuations not one.

We can simplify the code with a goto. This code:

goto label;
print "unreachable";

becomes just

goto () (label)

Here, goto has no arguments, and instead of writing the actual code out that is at label, we just write label. In other words a label is just the name of a continuation, it names the continuation starting where the label is written.

Note that goto is now purely functional.

So following the conditional example, code that would throw an exception is exactly the same as a conditional: it takes two continuations.

Note that with CPS, each "function" takes a continuation argument and when it is finished jumps to it. Of course to do the jump, we can use a tail recursive function call. We do that because some "functions" return values. In that case the continuation is a function that accepts the return value.

So we can fix "goto" now, and indeed all the "procedure calls" so that they "return" say "unit", and make all continuations accept an argument.

So now the type of a function previously denoted:

f: D -> C

is replaced by a function denoted:

f: D -> (K -> C -> ....)

where K is the continuation which accepts the return type C as an argument.

Now, if you want to turn your brain into spaghetti, given a sequence of procedure calls, to convert to CPS, with the first statement now we need the type of K. Since the K not only accepts an argument but ALSO accepts a continuation (which accepts its return value), we can elaborate K by looking at the next statement. But then, recursively, to find the type of its continuation we have to figure out ITS type.

The way you do continuation passing is BACKWARDS. You actually have to start at the END of the program, figure out all the ways you can get there, so you can pass the type of the final continuation (halt) up. Having figured those continuation types out, you ask how you got to them, and so on, until you reach the start of the program. All the paths MUST lead to the start of the program (or you couldn't have got there).

Normally writing CPS code you use type inference to figure it all out.

Now the point is, the model is unified, and it is very powerful. So when modelling a new construction in a language you define it in terms of CPS.

The reason exception handling is screwed is that it requires dynamic typing. A good construction simply has two continuations. Or more if you have more than one exit. Of course the error continuation is your exception handler, and it takes an argument, namely the type you would throw.

The difference to EH is that it cannot fail. You cannot fail to catch the exception you threw because you didn't throw an exception, you invoked a continuation, so you have to actually specify it.

Now, now one wants to write code in continuation passing style directly. So lets suppose we add some sugar. If we have a function accepting unit and calling a continuation that accepts unit, lets introduce a right associative binary operator semicolon ";"

lhs ; rhs -> lhs () (rhs ())

Note that rhs is "the rest of the program". So now we have defined that

set (x, 1);
print x;
print "bye";
halt

is actually written (with our sugar) in CPS. in other words procedural code is actually purely functional.

Haskell does this, with "do" notation. Because monadic bind operator is precisely continuation passing with an ugly constraint: the arguments of the functions bind chains together have to all be of the monad type. For "do" notation its the state monad.

So imperative code is already much better than Haskell and it is already purely functional. People writing that kind of code can write good code that runs much faster with less fuss than in Haskell, but their code is still purely functional and referentially transparent underneath after translating to CPS.

Anyhow the point again, you can add things like "call f x" as shorthand too, and in the function define "return" to call the implicitly passed in continuation. In that model if you have an error continuation you have to pass it in explicitly, and call it explicitly, but after desugaring the code it is symmetric: the error continuation is handled exactly the same as the normal one.

In other words the whole problem is one of finding nice syntax. There are no alternatives as to the underlying machinery: it all works by continuation passing, full stop.

sighoya commented 6 years ago

This isn't relevant. We're not doing a run time comparison of the functions, we're examining the mathematical definition of a product. In fact, the whole theory of abstraction, category theory, is an equational theory of function composition. Equational meaning, the only question one can ask about functions is whether they're equal.

It isn't irrelevant, higher order unification is undecidable, too.

Anyhow, there is hope. There's good paper "Symmetric Lambda Calculus" which shows how to symmetrically integrate values, functions, and continuations.

Is it this one?

The problem with allowing effects in expressions/function is that the order and interaction of the effects is indeterminate, so it is a bad model for managing effects, whereas explicit control flow constructions are a good model. The indeterminism is ESSENTIAL because it is the only way the compiler can optimise the code to produce things that run fast.

Felix aims to run code at HYPERLIGHT SPEED. In other words FASTER THAN C. :-)

This is interesting, but why tracking effects should be faster, is there any evidence for. For me, tracking effects and eventually overwriting them is a performance penalty.

Haskell does this, with "do" notation. Because monadic bind operator is precisely continuation passing with an ugly constraint: the arguments of the functions bind chains together have to all be of the monad type. For "do" notation its the state monad.

I don't like it, too. In my opinion it is even contradictory, as they use the '<-' also for list comprehension which is not a monad instance, right? Furthermore, monads do not compose and it sucks to instantiate typeclasses in order to get them work.

One question, does felix pass continuations at compile time?

skaller commented 6 years ago

Actually i haven't seen that paper. I'm referring to Filinski's Masters Thesis. Its a bit hard to find.

The point about categorical definitions is we have to use them as a model because they form a coherent algebra and theory of abstraction. If you do like C++ templates, you never know what the result will be .. although they work surprising well and in fact are much more powerful than Haskell or Ocaml or anything else. They're just not type safe ;( And template meta-programming is an arcane and ridiculous art form.

Not sure what you mean by passing continuations at compile time. The fibres (coroutines) are unknown to the compiler, its all done dynamically by the run time library.

Ideally, the compiler could optimise certain classes of configurations, but it was a major struggle to get enough ordinary stuff working to implement the coroutines. Ordinary functional and procedural code is heavily optimised by the Felix compiler, which generates C++, which is then heavily optimised by your system C++ compiler.

So it runs pretty fast, roughly the same as C or C++, but you get high level static typing and a few other goodies (like user defined grammar, coroutines, garbage collection).

Actually working on the kinding system at the moment.

skaller commented 6 years ago

Oh, if you mean does the compiler work by using Continuation Passing Style, the answer is no, its a conventional Algol like language, the internal representation is a high level machine with executable instructions that take expression arguments. Instructions are procedural, the expressions are functional. So we have both full on functional programming AND explicit control flow. The same as all real computers :-)

There is an essential separation of procedures and functions in the back end. This arises because Felix generates C++ and uses the C++ object model.

Functions use the machine stack because C++ does. It has to for iteroperability with C/C++ code which is a design requirement.

Procedures are all actually coroutines and use heap allocated "stack" frames, linked by pointers, which form a "spaghetti stack".

Each procedure in principle is a C++ class with a resume() method that allows a piece of the procedure to execute, then it yields to the scheduler by returning from the resume() method. The suspended procedure object is a continuation. To continue on, the scheduler calls resume() again.

This machinery works by returning control to the scheduler, and so procedure must be flat, that is, they cannot use the machine stack. Functions on the other hand must use the machine stack.

Note that Felix code does not translate one to one to this mode, the code is heavily optimised. Inlining is king. Also functions can be translated into procedures, and functions and procedures can use C++ class objects on the heap, on the stack, Felix can also optimise to C functions, and it can optimise them away entirely by inlining.

Also some of the big performance gains come from "indeterminate evaluation strategy", which is the default. This means function and procedure calls can be evaluated either eagerly or lazily, depending on what the compiler thinks is faster. Inlining is typically lazy evaluation. Closures are usually evaluated eagerly.

Felix follows the C++ philosophy: we like things to be safe, but if there is a choice between safe and fast, the system usually goes for fast, but provides ways for the programmer to choose.

Sometimes, type safe MEANS fast. For example, Felix static (compile time) arrays do not require bounds checking because the index is a unitsum type of the same size as the array, so if the code type checks, the array access is safe.

skaller commented 6 years ago

BTW: undecidability is an interesting issue that is not well understood.

First, although in general something is undecidable, it can often be made practically decidable.

For example: in general, it is not decidable whether a program terminates. This is irrelevant because, when I write a program that is intended to terminate it had dang well better be decidable or its a BUG. When writing the code I reason that this part or that part will in fact terminate and with the intention that it can, indeed, be decided, and what's more I sketch a proof where it isn't so easy to reason. In other words, programmers write a type of code where either termination is effectively proven or their code is bugged. This means in principle a mechanical prover can be used to verify termination, bound by a time limit so that it fails if the proof cannot be constructed. So now, we have a decision procedure. Its decidable.

We have not broken the maths here, we have changed the question from "does the program terminate" to "can it be proven to terminate with an effective, bounded procedure" and that question is decidable.

In the case of functions, this issue arises in programming languages. The famous case is in C++ where they have the "One Definition Rule". This is an effective procedure for deciding if two functions or types are equal. The rules says they're equal if, and only if, the definition is the same definition in a source file.

In Felix, I check if two types are equal by structural equality. Types in Felix can contain Lambdas. They're equal if and only if they're structurally equal, modulo alpha conversion. This rule is good enough because it handles the cases that occur in practice, namely where the type is physically copied by the compiler itself, or, where it is destructured by pattern matching (inside the compier I mean). In fact getting this to work is a bit tricky even with alpha-conversion because one must ensure both expressions are irreduceable, especially with respect to folding and unfolding recursions.

It is certainly possible the types have alternate representations that are not handled but reduction rules try to ensure a unique representation. In fact, after monomorphisation, we have uniqueness but not with polymorphic types, because of course a polymorphic type is a functor.

So in reality, the situation IS decidable conservatively: if you cannot decide, you decide false. This is always obtainable by putting resource bounds on the decision procedure (i.e time and memory limits).

One would note that, even in case where in principle the case is decidable, the resource limits case the wrong result to be chosen, namely false. So the fact other cases which are in principle undecidable return false as well is not really any different!