keean / zenscript

A trait based language that compiles to JavaScript
MIT License
42 stars 7 forks source link

Optionally declare functions `pure`? #4

Open shelby3 opened 7 years ago

shelby3 commented 7 years ago

@shelby3 wrote:

Also, the compiler can use a much simpler form of path analysis (independent of run-time state) to do that performance optimization without needing dependent typing (although without purity it becomes more difficult for the compiler to know what is safe to reorder).

Purity provides more opportunities for optimization.

keean commented 7 years ago

I would rather have monadic types for impurity. In this way we have a sound type system that can be expanded for full effects support. I would like to have inferred monadic side-effects. The inference means the average user does not need to bother or understand them, but you can use type classes to constrain the types to achieve things like this function must not do IO, or must not allocate memory etc.

If all the primitive operations have the correct mondaic types assigned to them, purity becomes constraining a type to have no side effects. I haven't really thought about the effect syntax, nor how a constraint on effects would look syntactically, but this achieves what you want with more fine-grained control about which side-effects to permit.

SimonMeskens commented 7 years ago

If operations are correctly typed and maybe with some syntactic spitshine, we can make monadic types look very simple, I'm all for just going with monadic types. We can probably make it so the user doesn't even realize he's using monadic types.

keean commented 7 years ago

Lets assume we have monadic types, and that all functions are in some monad (as we want to allow imperative style statements anywhere). Further lets assume we have a monadic effect system, with effect inference. As such effects will propagate from the function use site to the definition (the effects on a function being the union of all effects in functions used in the function definition).

As long as primitive functions get the correct types, it should all work without the user even knowing. They only need to know if they wish to constrain effects, probably with some kind of type class. In order to do this the type system will need to provide some a 'set' kind of type, but I think we need that for union types already.

Having to import functions from JavaScript and give them a type seems like some boilerplate, but I don't think there is any other way. For example a simple primitive 'print' type function would have a signature like:

print(x : String) : [write_stdout] ()

A primitive input function would type like this:

input() : [read_stdin] String

Now you can use them like this:

doit = (prompt) =>
    print(prompt)
    return input()

The type of doit can be inferred, however you can give optional type annotation like this: doit = (prompt : String) => doit = (prompt) : String => doit = (prompt : String) : String => doit = (prompt : String) : [write_stdout, read_stdin] String => All the above would be valid, but the type below would not: doit = (prompt : String) : [] String => The empty brackets would force the function to be pure, and of course the definition is not, so it should fail type-checking.

So the assumption is all functions are in the effect monad, and therefore we can omit the effect constructor '[]' from the type signatures of functions.

SimonMeskens commented 7 years ago

I like that. A lot.

shelby3 commented 7 years ago

My understanding of a monad in this context it provides composition of functions that mutate some state (i.e. input and output the state type) with those that don't. The advantage over compiler-assisted access to globals, is we can have different state objects (e.g. different implementation of print) for different code paths. And I believe we can even compose monads in some cases, but they are less composable than Applicatives. I like the idea of fine-grained access to resources, as this also has security applications, e.g. Android apps need to declare which resources they access.

But separately we would still need a way to declare a function's inputs arguments and constructed references (collectively or individually) deeply immutable (which is not the same as JavaScript's const which merely prevents reassignment of the reference).

keean commented 7 years ago

Technically in an imperative language where any statement can perform IO to any external service, every function is in a monad :-) The threaded state is the 'world' and return is the unit function, and the statement separator (newline or ;) is bind.

shelby3 commented 7 years ago

We need purity declarations, otherwise we model values separately from functions, which is a non-unified construction.

Note that the purity declaration for an immutable reference and a zero argument function can be unified to let.

keean commented 7 years ago

Then we get an arbitrary split, some functions are pure and some functions are impure. I can only call pure functions from pure functions but I can call either from impure functions. This is how Haskell works and it as a real problem for a lot of people to work with. You want to put a print statement in a pure function somewhere and all of a sudden you have to rewrite half the program.

If you want an easy to use language that Java programmers will be able to use, we cannot do this. This would make the language feel more like Haskell than any of the type system features I want.

I think we want all functions to be impure and in the IO monad. We can then use constraints to prevent side effects where we don't want them, but the functions will still be impure.

The separation between function reference and result value is very important.

shelby3 commented 7 years ago

@keean wrote:

You want to put a print statement in a pure function somewhere and all of a sudden you have to rewrite half the program.

I thought you were planning to offer treading global state through pure functions with an implicit/inferred monadic system. So IO would be in the monad. Same as Haskell does it.

Seems to me we can't do your monadic idea unless we do it every where. But am I correct it will only add overhead to calls to functions which access the monadic state or will the performance of every function call suffer?

I think we want all functions to be impure and in the IO monad. We can then use constraints to prevent side effects where we don't want them, but the functions will still be impure.

If we are forced to have your monadic system every where, then why should all functions be impure and in the IO monad by default?

The separation between function reference and result value is very important.

I don't understand what you do you mean?

shelby3 commented 7 years ago

If we will have pure annotations on functions, then we also need const annotations on input arguments for impure functions that only call pure functions on those const inputs.

shelby3 commented 7 years ago

@keean seems to me silly that for example a length function will ever need to call print. That is what we have debuggers for.

keean commented 7 years ago

Seems to me we can't do your monadic idea unless we do it every where. But am I correct it will only add overhead to calls to functions which access the monadic state or will the performance of every function call suffer?

There is zero implementation overhead, it is simply a type system change. I have pointed out before you can give 'C' functions a monadic type, do the type checking and then compile it using the normal C compilation method.

keean commented 7 years ago

If we are forced to have your monadic system every where, then why should all functions be impure and in the IO monad by default?

Because we may want to change the function to include IO without having to change all the module type signatures.

I don't understand what you do you mean?

A program is a bunch of symbols that is executed to produce a result. It is important we can pass around the unexecuted function and then execute when we want. I think it is important that the notation is consistent no matter how many arguments a function has. Making an exception for zero argument functions is introducing special case inconsistent notation.

seems to me silly that for example a length function will ever need to call print. That is what we have debuggers for.

Debuggers should not make up for inadequacies of the language. Programs are rarely written in their final form and the ability to easily refactor is important.

shelby3 commented 7 years ago

@keean wrote:

Because we may want to change the function to include IO without having to change all the module type signatures.

I don't understand. What do modules have to do with it? I probably need to see a code example.

A program is a bunch of symbols that is executed to produce a result. It is important we can pass around the unexecuted function and then execute when we want. I think it is important that the notation is consistent no matter how many arguments a function has. Making an exception for zero argument functions is introducing special case inconsistent notation.

Afaics, syntax of zero argument functions is orthogonal to a purity annotation decision and discussion. @skaller and I enumerated optimizations and benefits of purity. For example also the following also applies to non-zero argument pure functions:

@shelby3 wrote:

The following LOC would be a compiler error, because the result value of pure functions must be assigned:

size


Debuggers should not make up for inadequacies of the language.

What inadequacy? Purity annotation is an optional feature.

shelby3 commented 7 years ago

Readers please note that certain side-effects are not side-effects in our type system. More discussion about the relationship to IO monad.

You had disagreed with me about certain side-effects being inapplicable, but I (and @skaller) argued I think convincingly that the only side-effects that matter are those that our type system guarantees.

If we pass around the IO monad as external state, then every print(x) will be an "impure" function w.r.t. to every function that also accesses the IO monad but if a new copy of the IO monad is returned instead of being mutated, then those functions are still pure functions because they don't modify external state nor depend on any state not provided by the inputs. This why Haskell is still pure even though it has side-effects to IO monad, because this IO monad state is a state which is passed along the function hierachy and so the ordering is explicit. That effects are semantically imperative (meaning that changing something in one function could affect expected semantics in some function far away, i.e. the locality of semantics is desirably forsaken), but the functions are operationally (operational semantics) pure and can be optimized as pure.

I wrote:

We can’t model everything with types, thus a typing system is always inconsistent (or otherwise stated can’t model all forms of errors nor allow all algorithms).

Thus even in a statically typed program, we must have dynamically typed features either explicitly or adhoc. So to characterize a program as unityped doesn’t mean it has less expressive power, only that it has less statically expressive power.

shelby3 commented 7 years ago

@keean wrote:

I don't think we need purity anyway, just side-effect control.

What do we gain by marking a function pure?

@skaller and I already explained that. I also explained that even a non-zero argument pure function can't be invoked without assigned the result value, so the compiler can catch this bug:

The following LOC would be a compiler error, because the result value of pure functions must be assigned:

size

keean commented 7 years ago

@shelby3 okay well I think a zero argument function should be executed... it is not the same things as a value. Scala got this wrong :-)

shelby3 commented 7 years ago

@keean there is no reason to execute a pure zero argument function. That is nonsense.

And there is no reason to execute any arity pure function if the result value is not assigned.

keean commented 7 years ago

@shelby3 It may fail due to lack of memory. A value is a value like '4' or '7'. A function has a value which is the set of symbols for example "4 + 7". Before you execute it is is "4 + 7" when you execute it, it returns "11". If it is a const function the compiler may optimise by executing it at compile-time. (By the way @skaller was saying the same thing).

To me it seems you want to save typing two characters () and that you are willing to mess the semantics up to do it.

keean commented 7 years ago

refresh again.

keean commented 7 years ago

We can analyse this further. The only time a pure function is really pure is when all the inputs are known at compile time (in which case you can treat it as a const). If any of the inputs to the function depend on IO then the pure function has to be lifted into the IO monad.

To put it another way, when we treat a function as pure and replace it with its value it means the input cannot change. If the inputs to a function change, its results will change to. You can see this in Haskell:

f x y = x + y    {- pure function -}

g : IO Int   {- impure function does IO -}
g = do
    x <- readInt
    y <- readInt
    z <- return (f x y)     {- lift f into the IO monad -}
    return z

If we don't do this the program will go wrong, because the compiler will not know that the result of 'f' can change with time, and it will replace it with a single value that won't change even if we run g again and the user inputs different values, once f is replaced by a value it will always be the same value.

keean commented 7 years ago

Of course if a function has no side effects, and no arguments, it must be a const value (IE the compiler can replace the function with its value at compile time). If this is the case you can declare it as a value instead, just do:

let x = 3 * 5 + 8

It seems there is no need to ever have a pure function with no arguments. The only useful zero argument functions have side-effects :-)

shelby3 commented 7 years ago

@keean don't forget that (which I pointed out before) a function which takes a callback of a zero argument function, would also accept a pure such function. And this is another reason to treat const values and pure zero argument functions as equivalent and also pass them as callback for (pure and impure) zero argument functions.

shelby3 commented 7 years ago

@keean wrote:

It may fail due to lack of memory.

...

The only time a pure function is really pure is when all the inputs are known at compile time

I already linked to were @skaller and I refuted that entire line of reasoning. I am not going to repeat it.

To put it another way, when we treat a function as pure and replace it with its value it means the input cannot change. If the inputs to a function change, its results will change to. You can see this in Haskell

You are conflating two orthogonal concepts. The value of the pure function never changes and no side-effects occur (that are modeled in the type system). The inputs select which element of the (lazily memoized) set of the function's value is presented.

An impure function isn't guaranteed to have the same result for any fixed input, i.e. the set of results that is the function's value is not constant.

To me it seems you want to save typing two characters () and that you are willing to mess the semantics up to do it.

No I want a system that is complete, consistent, and not corner cases. It appears to me sometimes you seem to want consistency by not being complete, which is a form of obscured inconsistency.

skaller commented 7 years ago

To make progress I suggest implementing basic stuff, when in doubt leave it out until there are actual uses cases to consider. So, do not put "pure" yet. Do not have => for functions and -> for functions with IO monad, yet.

These forums are not the best for technical discussions IMHO. For a start, you can't reply to someone as you can with email. We need an email list which tracks commits. I can write some test cases, but i would need commit access and a fairly clear specification of what is implemented and what it should do. As soon as possible, I think an FFI is required, and then a place to put a library of bindings to Javascript. Most of the work you can do with the language can then be done almost immediately by adding the required bindings and writing Javascript to bind to.

No matter what you think the advantages of Zen-fu are the primary one is the ability to leverage existing code, but access it in a better structured way (eg with type checking).

One thing I learned on the C++ committee is that the library drives the language. You add stuff you need to make the library work. So for example Felix has a restricted kind of constrained polymorphism where you can define a type set (a finite list of types), and constrain a type variable to that list. Why? Because that's whats needed to define addition of all 20 types of integers C++/Posix require in one line. (Just an example).

keean commented 7 years ago

@shelby3 regarding monads and impurity: if we start with all functions being pure, then we need a way to manage impurity. If we consider an arrow is just a binary type constructor, we can take a type like this:

A  => B -> C

Which would be a pure function that returns an impure function, and rewrite this:

A -> B -> M<C>

Where an arrow followed by a special type constructor M represents the impure action. In the above we are using -> to represent pure functions, and we can see the impurity represented by the extra constructor M.

At the basic level that's all there is to it. There are some subtle details, like the monad, as it has no decinstructor enforces that impure can call pure (via lifting) but not the other way around.

In this basic categorisation we can have a single monad which encapsulates all impurity, which Haskell calls IO but we could call whatever we like. We can then get into making more fine grained distinctions, but let's see how we get on with the basic 'impure' monad.

shelby3 commented 7 years ago

@keean during my incomplete (and possibly some mistakes) research for the language Copute which I was working on in 2011, I did write down my understanding of a State monad:

http://www.coolpage.com/copute/docs/

See also the example in Traversable and the conceptual description of imperative programming in Event.

So it seems what you want to do is as I quoted below?

@shelby3 wrote:

Rationale

Makes the computation of state orthogonal to the state transitions of unlifted functions. Thus generally any kinds of orthogonal state can composed with any category of morphisms, without case-specific boilerplate.

The State (a Monad) eliminates the boilerplate of threading some state across functions that don't access that state so as to compose these with functions that do access that state. This allows reuse of functions without having to code separate functions for each permutation of state that one might one to compose with. The State monad is roughly analogous to a threadable scope for global variables, i.e. that state is encapsulated, so that state is only accessible to functions in the composition, and thus the functions are pure w.r.t. that state. Thus the state effects are transparent (the functions operate on that state via inputs and return value only), delimited, and composable.

Event.html.zip State.html.zip Traversable.html.zip

keean commented 7 years ago

So it seems to what you want to do is as I stated there?

A state monad only stores a specific bit of state. In the case of an impure function the monad represents all the impurity. Perhaps we could call it the World monad.

keean commented 7 years ago

Some further details:

Using the State monad you can construct a state machine. Here the state is in the monad, an pure state transformation functions link one state to another on the clock tick (event update). With the state monad if you have multiple state machines, they can each have their own state space that is separate. This manages the combinational complexity blow-up when combining state machines (as we know for certain that the private state of state-machine one cannot affect state-machine two etc).

This leads to a problem with monads if you want to write a program that is in two monads at once (IE links two state machines together). The direction Haskell took was to have a "World State" monad called IO that conceptually stores the state of the world. As such IO is reading and writing state from the World state. This allows multiple state elements to exist in the same monad (the world monad). This is initially what I am proposing here, that we have one "built-in" monad. It does not need special treatment by the type system (which is one of the reasons this is considered a function model of impurity).

This is all we need to get the same functionality as the pure/impure arrows but in a more general way, as we can go further with this concept. For example we can solve the problem with state monads using monad transformers (the way Haskell does it), but this is a poor solution in my opinion and lead me to waste much time wrangling with the type system to understand and implement various seemingly obvious things. A better solution is a set monad where we can 'or' together multiple monads a bit like a union. But first lets decide if we want to use this model of impurity. On the whole type inference will make this painless for example:

f(x : Int) => 1 + 2

Would have an inferred type of Int -> Int, however

f(x : Int) =>
    print(x)
    1 + 2

Would have an inferred type of Int -> World<Int>. If we decide to call our "world state" monad World.

shelby3 commented 7 years ago

I have found the following helpful for me to better understand the essence of a Monad:

http://stackoverflow.com/questions/44965/what-is-a-monad/10245311#answer-10245311 http://stackoverflow.com/questions/2704652/monad-in-plain-english-for-the-oop-programmer-with-no-fp-background/2704795#answer-2704795 https://blog.jcoglan.com/2011/03/05/translation-from-haskell-to-javascript-of-selected-portions-of-the-best-introduction-to-monads-ive-ever-read/ http://blog.sigfpe.com/2007/11/io-monad-for-people-who-simply-dont.html

Thus, I understand that one use-case of a Monad is to enable the composition of functions which operate on some state with those that don't, and the former can call the latter but not vice versa. This can enable us to tag pure functions which for example access certain genres of state, such as the input stream since these functions will input and output the said stream state. Every time we modify this state we must make a new copy of it, else our caller's copy would be modified making our function not pure. So really the Monad has nothing to do with what enables us to access the said state with a pure function. The Monad is only enabling the additional said composition.

Thus this seems to do nothing to enable general imperative programming. For example, it doesn't enable us to operate on arrays.

Sorry I think we will also need impure functions. I never intended to create a programming language that forced purity and immutability where it is more efficient to use imperativity and mutability. Pure functions are not always the optimum choice. I've read they have a log n factor slower performance in general, in addition to making some algorithms obtuse. Where purity fits, it can be useful, but it shouldn't be forced every where.

keean commented 7 years ago

Every time we modify this state we must make a new copy of it, else our caller's copy would be modified making our function not pure.

No this is not right. A monad is a type system concept. We can type check C code using monads, and then compile using the normal C compiler for absolutely zero performance cost :-)

keean commented 7 years ago

Thus this seems to do nothing to enable general imperative programming. For example, it doesn't enable us to operate on arrays.

This misses the point, using a monad to represent impurity can be seen a simply a syntactic operation on types. For example:

Starting with

(A) => (B) -> C

A pure function that returns an impure function (using the arrows as you prefer them), we can apply the following substitution:

(a) -> b  ==>   (a) => W<b>

Using lower case letters for the substitution "meta-variables". Applying this to the above we get:

(A) => (B) => W<C>

Everything about the things you can do in the impure part of the function remains the same as with the impure-arrow, except we get some nice additional properties:

So a monad is a closer model for impurity in functions than a different arrow, and there are no performance costs, because this only changes the types assigned to code, not the code itself.

Sorry I think we will also need impure functions. I never intended to create a programming language that forced purity and immutability where it is more efficient to use imperativity and mutability. Pure functions are not always the optimum choice. I've read they have a log n factor slower performance in general, in addition to making some algorithms obtuse. Where purity fits, it can be useful, but it shouldn't be forced every where.

Because the assumptions you made are wrong, the conclusion is also wrong, although your logic is impeccable. Unfortunately none of the references you linked to really seem to understand what a monad is at a deep level.

shelby3 commented 7 years ago

No this is not right.

Disagree. You will break the entire point of purity and immutability otherwise. A pure function which inputs a IO must return a copy of the instance as the output IO. This can be made fairly efficient, as Haskell has apparently done. For example, the inner workings of the IO monad can use an internal buffer for storing indices into its buffered input stream.

This misses the point, using a monad to represent impurity can be seen a simply a syntactic operation on types.

Sorry no. That isn't correct. You still have to maintain the rules of purity of the inputs and outputs of pure functions.

The only way it is correct is if you are allowing these to be impure procedures. Which means you can no longer call these procedures from other pure functions.

And in that case, we have impure procedures, which is my point that we will have both pure functions and impure procedures.

It is possible that our difference is merely a miscommunication (as it usually is because you don't seem to understand me and I don't seem to understand you, sigh). I also find that you don't explain things in a very intuitive way. I don't find you to be a clear communicator. You seem to talk in a private language of types. Talk to me in English please.

If seems you want to allow impure procedures. And you want to infer them?

keean commented 7 years ago

The only way it is correct is if you are allowing these to be impure procedures. Which means you can no longer call these procedures from other pure functions.

That is correct, you cannot call an impure function/procedure from a pure one, and this is the very thing the monad enforces in the type system.

keean commented 7 years ago

And in that case, we have impure procedures, which is my point that we will have both pure functions and impure procedures.

Yes this is true, from a certain point of view.

shelby3 commented 7 years ago

So please try to communicate your paradigm shift view to me in a way I can understand. Writing type signatures that don't mean anything to me, doesn't communicate anything to me.

shelby3 commented 7 years ago

If you can't explain it to me, you have no hope of explaining to most of the JavaScript programmers.

shelby3 commented 7 years ago

This comment of yours is gibberish to me. It means absolutely nothing to me. I know exactly nothing more than I knew before trying to read it.

keean commented 7 years ago

This comment of yours is gibberish to me. It means absolutely nothing to me. I know exactly nothing more than I knew before trying to read it.

What is the best way to help you understand? I know you know what a type-system is, and I also know you understand about pure and impure arrows.

Consider that an arrow is just an infix binary type constructor, we can rename => in prefix form as Arrow<A, B> so what is special about the arrow?

keean commented 7 years ago

refresh

shelby3 commented 7 years ago

Explain it using a Java example perhaps?

Explain it using English instead of types and arrows.

Eric Lippert seems to be able to explain Monads in a way that is much easier to understand.

See the 100s of upvotes on the comments at the bottom of this SO answer, to understand the attributes of why your elucidation will fly right over the head of most people who are not Haskell programmers.

keean commented 7 years ago

Explain it using a Java example perhaps?

Writing monads in Java is very different from typing Java using monads :-)

shelby3 commented 7 years ago

Explain it using a Java example perhaps?

Writing monads in Java is very different from typing Java using monads :-)

I understand, but you'll have to learn how to understand your art as well as Eric Lippert does.

keean commented 7 years ago

See the 100s of upvotes on the comments at the bottom of this SO answer,

This explains how to use monads, it does not explain what a monad is.

keean commented 7 years ago

I understand, but you'll have to learn how to understand your art as well as Eric Lippert does

At this point should I get angry, and start complaining that I cannot work with you if you cant respect that I understand these things?

shelby3 commented 7 years ago

I didn't say you didn't understand these things. My point is that you if you want to communicate with those who don't think about types the way you do, then you need some way of explaining in English, which exhibits a mastery of understanding of the concepts and what the reader can comprehend. Eric Lippert is a master. You may understand your private language, but you may not yet be a master communicator of what you know and think.

keean commented 7 years ago

You may understand your private language, but you may not yet be a master communicator of what you know and think.

Well, that is a criticism I can take :-P I have to eat now, but I will see if I can come up with a better explanation.

The important point is to understand that the same program may have different types in different type systems.

shelby3 commented 7 years ago

Consider that an arrow is just a binary type constructor, we can rename => as Arrow so what is special about the arrow?

The relevance of this does not register for me. Also I don't know what 'binary' means in this context.

keean commented 7 years ago

The relevance of this does not register for me.

But you do get that Arrow<A, B> is no different from Widget<A, B> right. I don't want to get into a long explanation if we don't agree on the basics.

shelby3 commented 7 years ago

I could probably figure out what you are trying to communicate to me if I'm willing to give myself a headache. But I am thinking you should be able to make it more obvious for me?