Impure procedures in Steme

johnwcowan commented 3 years ago

I want to explore the idea of Steme having a way of marking procedures as impure. A simple syntax would be to introduce impure-lambda and its counterpart impure-define. In impure procedures, most things that can be done in Scheme can also be done, though not all (see #16). This allows people to write ML-ish code with full type checking but Turing complete.

An impure procedure would be callable from a pure procedure if the programmer certifies it secure, just as a Scheme procedure has to be. However, it is not necessary to specify argument and result types for it. An obvious question: why even allow impure procedures to be certified pure?

An example that comes to mind is a procedure that takes a list of non-negative integers and an upper bound on their values, and returns a histogram of the occurrence of each integer from 0 to the bound. The simplest way to do this is to allocate a mutable vector, increment the appropriate element while processing the input list, and return the result of vector->list.

A less efficient alternative is to use a persistent vector. It is also possible for the compiler to be smart enough to figure out that the vector never escapes, and quietly substitute the O(1) implementation. But both of these seem difficult compared to marking the procedure as impure.

I have not fully read the Linear Haskell paper and may not be able to, but it is based on using a "linear arrow" alongside the normal function arrow. (By "arrow" I mean the -> used in type declarations, not Kleisli arrows.) This seems to align well with using a different function constructor like impure-lambda.

mnieper commented 3 years ago

In Haskell, you have unsafePerformIO to achieve exactly what you have in mind.

I agree very much that we should find a way so that for the histogram problem, Steme can actually use a non-persistent vector. Using a persistent vector would really be the wrong tool for the job.

How to achieve this, is less clear. Something like unsafePerformIO is the least attractive. Linear types, uniqueness types, monads, or algebraic effects would be much better solutions because we could still count on some guarantees of the type system. I have to read more, though, to make up my mind.

mnieper commented 3 years ago

PS Note the problems of unsafePerformIO here: http://hackage.haskell.org/package/base-4.14.0.0/docs/System-IO-Unsafe.html. We have to see whether a more limited approach doesn't have these problems.

mnieper commented 3 years ago

See also #19 for a possible (fine-grained) solution.

mnieper commented 3 years ago

An important thing we should keep in mind is how very basic things like division by zero is handled. There are a number of approaches, some touching impurity, which we are discussing here (comments are my personal opinion, of course):

Let 1/0 return some random number, say, 0, just to be silent. --- That's an extremely bad option. It doesn't make sense semantically and it can easily hide bugs.
Return a Maybe or an Either. --- Semantically a good option, but technically a bad one because it will like slow down general arithmetic code and make programming with the division operator hard. Moreover, how to propagate the Nothing case?
Silently raise an exception. --- This is Haskell's approach but this means that every type will be inhabited by Undefined making the (categorical) semantics of the language badly behaved. (http://math.andrej.com/2016/08/06/hask-is-not-a-category/). [*]
Raise an exception and thus allow most procedures (at least those including divisions) to be impure in the naive sense. --- This looks like the best approach to me once we have agreed upon how to catch and confine the impureness.

-- [*] Maybe this point is minor because we cannot exclude non-termination in the presence of letrec.

lassik commented 3 years ago

Good observations. Perhaps there should be two division operators: one that raises an exception, and one that returns a maybe/either.

lassik commented 3 years ago

There could even be a fast "divide, default to zero on error" procedure separately.

mnieper commented 3 years ago

There could even be a fast "divide, default to zero on error" procedure separately.

I would prefer not to include such mathematical nonsense in the spec unless absolutely needed.

mnieper commented 3 years ago

I have thought a bit more about non-terminating functions and how to reason about them. For this, I checked what others do. The approach of the functional language Koka seems to be interesting. There a function can be marked as total or as pure, for example. A total function is the best we can get and @johnwcowan thought about restricting to total functions in one thread. On the other hand, a pure function is one that is pure in the Haskell sense of the word: Only divergence (non-termination) and throwing of exceptions is allowed. Division would be pure but not total and so will be every function calling (directly or indirectly) division (unless the type system can proof that division by zero won't occur). Such a function could be made total again by wrapping it into an exception handler.

johnwcowan commented 3 years ago

Silently raise an exception. --- This is Haskell's approach but this means that every type will be inhabited by Undefined making the (categorical) semantics of the language badly behaved. (http://math.andrej.com/2016/08/06/hask-is-not-a-category/).

[*] Maybe this point is minor because we cannot exclude non-termination in the presence of letrec.

Or top-level define, which is equivalent to a giant top-level letrec.

Another example would be that the division pipeline is happening on one core and the result is wanted by another core, and suddenly the first core (or the connection between them) fails. "No procedure terminates if the computer catches on fire."

pre-srfi / static-scheme

Impure procedures in Steme #20