Getting it right: Procedures without interesting return values

mnieper commented 4 years ago

In R7RS Scheme, procedures that are evaluated solely for their side effects usually return an (undefined) value. This behavior is for compatibility with earlier standards but most code that actually relies on such return values is buggy in all probability.

Should such impure functions be incorporated into Scheme (e.g. through the effects framework of #19), I would like to impose that they return no values. We do have less compatibility conditions to take care of, and we want to enable the type system to catch misuses of impure functions (i.e. when we rely on what they return).

A special case is true continuations like raise (when viewed as an impure procedure), which do not return. In this case, the return type is not the unit type (corresponding to zero values) but the bottom type.

johnwcowan commented 4 years ago

+1

lassik commented 4 years ago

In R7RS Scheme, procedures that are evaluated solely for their side effects usually return an (undefined) value. This behavior is for compatibility with earlier standards but most code that actually relies on such return values is buggy in all probability.

Indeed, since there is no official "undefined" object in RnRS, there are sometimes bugs (or at least surprises) where different Scheme implementations return different values from standard functions that don't have a meaningful return value. A particularly fun gotcha is that a (begin) with an empty body can return different values, e.g. the last result in a VM register.

Should such impure functions be incorporated into Scheme (e.g. through the effects framework of #19), I would like to impose that they return no values.

Whether we have real multiple values, or use tuples as an analog, this "uninteresting" return value is naturally written as (values).

A special case is true continuations like raise (when viewed as an impure procedure), which do not return. In this case, the return type is not the unit type (corresponding to zero values) but the bottom type.

Koka has div (divergence) and exn (exception) types. The exception type doesn't specify which kinds of exceptions are involved. Harper has a rationale for why that is a good thing.

lassik commented 4 years ago

Even some Common Lisp programmers recommend using (values) for the uninteresting return value BTW.

mnieper commented 4 years ago

@lassik A begin with no expressions is at least an error by R7RS. The closest thing to "no expression" you can get is a one-armed if, e.g. (if #f #f).

In the static language we can ban the one-armed if because it cannot in general be meaningfully typed unless we add an "unspecified" value to every type. A when and or an unless can be made possible when we enforce the unit type (the terminal object, the empty product, zero values, name it as you want) as the return type of the enclosed expression. We would then have the macro expansion:

(when <condition> <action>) ==> (if <condition> <action> (values))

A special case is true continuations like raise (when viewed as an impure procedure), which do not return. In this case, the return type is not the unit type (corresponding to zero values) but the bottom type.

Koka has div (divergence) and exn (exception) types. The exception type doesn't specify which kinds of exceptions are involved. Harper has a rationale for why that is a good thing.

I think you are confusing two things here. The exception type by which a function arrow is annotated has, a priori, nothing to do with the domain and codomain of the function arrow. So the exception annotation comes on top of raise: A -> B.

PS: A classical paper that may be related to the point you raised may be Unchecked Exceptions can be Strictly MorePowerful than Call/CC by Mark Lillibridge.

johnwcowan commented 4 years ago

A begin with no expressions is at least an error by R7RS. The closest thing to "no expression" you can get is a one-armed if, e.g. (if #f #f).

Some Schemes support (begin) as an equivalent to the one-armed if; that is, they return a single unspecified value.

In the static language we can ban the one-armed if because it cannot in general be meaningfully typed unless we add an "unspecified" value to every type.

+1. Racket allows it only in a context where the continuation discards what is passed to it: asking the REPL to evaluate (if (pred x) 32) reports a syntax error.

A when and or an unless can be made possible when we enforce the unit type (the terminal object, the empty product, zero values, name it as you want) as the return type of the enclosed expression.

The when and unless constructions are inherently imperative. In R6RS they return an unspecified value when the guard is unsatisfied, and the last value when the guard is satisfied. But since the last value might be the same as the value that the implementation chooses as "unspecified" in this case, there is no point in examining the result, and so R7RS makes the result unspecified in all cases. Clearly R6RS was expecting a macro expansion of (when guard . exprs) to (if guard (begin . exprs)), which is precisely the one-armedif` we don't want.

In short, I propose flushing begin, when, until, and one-armed if from Steme.

johnwcowan commented 4 years ago

(when <condition> <action>) ==> (if <condition> <action> (values))

I'm not happy with the idea of procedures or procedure-like macros that are polymorphic in the number of their return values. No statically typed language is polymorphic in the size of tuples either, which is relevant if we adopt the "tuples are multiple values (reified when necessary as mv-boxes" assumption. Fortunately, as far as I know no procedure in R[567]RS or the SRFIs works like this unless it is one of those "returns the values that foo returns" where foo is some explicit or implicit argument.

raise-continuably is an example of such a procedure: it returns the values the handler returns. So all calls on it will have to use our local typing construct that allows a Scheme procedure to be called by Steme via fixing its types at the point of call. Currently this is (the Integer (foo (the Integer a) (the Integer b)), which has the advantage of keeping the argument types close to the arguments, as opposed to (the (-> (Integer Integer) (Integer) (foo a b)), though we can have both. Of course to call raise-continuably the programmer has a proof obligation of purity. ' The remaining case is procedures that return a number of values dictated by their arguments. If the values are all of the same typeSteme can call these and be told that the Steme type is (List a) for some a. In order to do this, we will need some general way to communicate fine-grained type information to Scheme when Steme brushes it under the rug. Another example is allowing characters to be a kind of String in Steme; we need some way, when specifying the type of a procedure expecting a character, that it gets one or signals an error; Steme would write a wrapper to do this. Formally, Char would be a synonym for String, but the Steme-to-Scheme calling mechanism would understand the difference.

lassik commented 4 years ago

In the static language we can ban the one-armed if because it cannot in general be meaningfully typed unless we add an "unspecified" value to every type. A when and or an unless can be made possible when we enforce the unit type (the terminal object, the empty product, zero values, name it as you want) as the return type of the enclosed expression. We would then have the macro expansion:

+1

I'm not happy with the idea of procedures or procedure-like macros that are polymorphic in the number of their return values. No statically typed language is polymorphic in the size of tuples either, which is relevant if we adopt the "tuples are multiple values (reified when necessary as mv-boxes" assumption.

+1. I'd be perfectly happy merging tuples and multiple values into the same concept.

mnieper commented 4 years ago

The when and unless constructions are inherently imperative. In R6RS they return an unspecified value when the guard is unsatisfied, and the last value when the guard is satisfied. But since the last value might be the same as the value that the implementation chooses as "unspecified" in this case, there is no point in examining the result, and so R7RS makes the result unspecified in all cases. Clearly R6RS was expecting a macro expansion of (when guard . exprs) to (if guard (begin . exprs)), which is precisely the one-armed if` we don't want.

I agree that relying on any value that when or unless returns is a mistake.

In short, I propose flushing begin, when, until, and one-armed if from Steme.

We should flush begin, when, and unless only when we have no side effects, including exceptions, at all. Otherwise, they should remain suitably adapted to the type system, I would say.

(when ) ==> (if (values))

I'm not happy with the idea of procedures or procedure-like macros that are polymorphic in the number of their return values. No statically typed language is polymorphic in the size of tuples either, which is relevant if we adopt the "tuples are multiple values (reified when necessary as mv-boxes" assumption. Fortunately, as far as I know no procedure in R[567]RS or the SRFIs works like this unless it is one of those "returns the values that foo returns" where foo is some explicit or implicit argument.

I neither would be happy with the idea. But that's not a problem of the above macro expansion. As the values branch is of type Unit, the other branch has to be as well in order to give the if expression a type. Thus, action has to be of type unit as well, which makes perfect sense for something that is solely evaluated for its side-effect.

raise-continuably is an example of such a procedure: it returns the values the handler returns. So all calls on it will have to use our local typing construct that allows a Scheme procedure to be called by Steme via fixing its types at the point of call. Currently this is (the Integer (foo (the Integer a) (the Integer b)), which has the advantage of keeping the argument types close to the arguments, as opposed to (the (-> (Integer Integer) (Integer) (foo a b)), though we can have both. Of course to call raise-continuably the programmer has a proof obligation of purity.

Yes, at each use-site, raise-continuable has to have a determined type. If we want the handling being done in Steme, this means that the handler either has to be polymorphic in the return type (hard to think of a meaningful handler here) or that we have handlers parameterized by the return (and the exception) type.

pre-srfi / static-scheme

Getting it right: Procedures without interesting return values #21