Closed acl33 closed 2 years ago
There is something very unsatisfactory about imposing extra rules about floating things out of conditionals, as you rightly identify here.
One possible approach is to be explicit about evidence. Instead of
if (x <= n) (...index arr x...) (...blah...)
we might think of
case (x <= n) of
No -> blah
Yes (ev :: Proof) -> ....(index ev arr x) ...
If the test on (x <= n)
is successful, it yields a "proof object", here ev
that witnesses that claim. We pass that proof to index
. So now we can't float index ev arr x
out of the branch of the conditional, because then ev
won't be in scope.
(One can imagine more fancy things, where instead of a uniform Proof
type, we have Proof (x<=n)
recording what it's a proof of. But that's very much a (fairly sophisticated) refinement.)
I wonder if something along these lines might be a more principled way of dealing with these seg-fault-y things?
Thanks Simon! @acl33 I'll update the top comment to include some "index" examples.
And then, let's try to work through the "evidence" plan. I wonder if something like this can help?
(index i a) -> (assert (lt i (size a)) (index$throws i a))
(assert p e)
<-> (if p e (assert False))
(if p (assert p e) f)
-> (if p e f)
if
unless marked $throws
?Notice that assert
is one of our few control-flow constructs (that's why it's explicitly in the AST, like if
).
So then we might see (artist's impression, probably wrong)
(let (n (size arr))
(if (x <= n) (index arr x) 7)
->
(let (n (size arr))
(if (x <= n) (assert (x < (size arr)) (index$throws arr x) 7)
->
(if (x <= (size arr)) (assert (x < (size arr)) (index$throws arr x) 7)
->
(if (x <= (size arr)) (index$throws arr x) 7)
So yes to the specialist rules and OTP; and yes to figuring out how we do evidence etc.
Notice that assert is one of our few control-flow constructs
I would prefer to turn it into a dataflow construct, by making the assert (and some conditionals) produce an evidence value (aka proof object, erased at runtime) that we can pass to index
and similar friends.
@simonpj perhaps the two approaches are not so different - suppose the assert
produces the evidence which index$throws
requires? As you say, that prevents lifting the index$throws
out of the assert
, yet without ever making the end-user-programmer write this evidence thing.
And, we've now defined that our index
is checked (raises an AssertionError, rather than a HardwareIJustLaunchedTheNukeError), too. In other compilers index$throws
is called index_unsafe
or similar (without such a thorough handling of evidence). The computations that can't be lifted are then a small set including assert
and any functions, like index
, that have rewrites to an assert+unsafe form. index_unsafe
can be lifted; and asserts are dealt with by four extra rules:
(if p (assert p t) f) <==> (if p t f) <==> (if p t (assert (not p) f))
I think both true and false arms of the if need to be given appropriate evidence? e.g.
(let (n (size x)) (if (gt i n) -1 (index i x)))
to return a default value of -1 for out-of-range. But we still need to define how the evidence is passed - is the body of assert and each arm of an if
now a lambda with an evidence parameter?
Also for
(tuple (index i v) (if p (index i v) f))
, i.e. where we don't necessarily know the inner index
is safe, merely that the exception would be thrown anyway - do we somehow need to make the inner index
use the same evidence as the outer index, rather than its own evidence? (or maybe we can CSE the two evidences themselves?)
suppose the assert produces the evidence which index$throws requires?
Yes, that's the idea. As you say, though, we need to say how the evidence is produced. Something like this, in Haskell
data BoolWithProof = TrueWith Proof | FalseWith Proof
eqInt :: (Int, Int) -> BoolWith Proof
inRange :: (Int,Int,Int) -> BoolWithProof -- inRange (lo, i, hi) returns (TrueWith p) iff lo <= i < hi
index :: (Int,Proof, Array a) -> a
Now we can say something like
case inRange (lo, i, hi) of
FalseWith _ -> throw-exception
TrueWith p -> .....(index (i, p, arr))....
Here, nothing stops us using the wrong Proof with the wrong array; and nothing stops us using the wrong array bounds. But we then can be sure that we won't lift failing operation outside its guard rails.
We don't yet have case-expressions, or indeed sum types, in ksc. But I think we are going to want them.
I have not been consistent over the naming here, and I should be: in some places "lift_bind" is used, and in others "lift_let".
(foo a (let (x e) body)) ===> (let (x e) (foo a body))
For me, it is quite clear that the "let" is lifted. i,e. "lift_let" is a good name.
Naming standardized on Lift_let / lift_if, not lift_bind.
We don't yet have case-expressions, or indeed sum types, in ksc. But I think we are going to want them.
Let's move to #972; it seems possible to do this without sum-types, and (according to taste) even without new expressions.
Do we want to abandon this in favour of an "if-with-evidence" approach (some discussion in #972), merge it, or something else?
I think if-with-evidence allows us to lift some functions out of ifs, if we have the evidence. But if we can't do that, the conservative position has to be not to allow lifting out of if, as here?
I think it's fine to merge the safe rules here, and then add the evidence-based ones when decided.
This addresses AB#19240, adding a bunch of rules for lifting let(-binds) and if(-conditions) and binds, roughly of the form
These are quite complex rules with numerous side conditions, for example let-variable escape (
(let (x e1) (let (y e2) body))
==>(let (y e2) (let (x e1) body))
only ifx
is not free ine2
).In particular we allow lifting a computation out of one arm of an If only when we know that evaluating the computation before/without testing the if condition, cannot raise an exception (that would change program behaviour). For now an extremely conservative check is used, and this contrasts with RLO which allowed lifting any computation, for example RLO would have allowed:
This raises some subtle cases, which I've tried to handle via extra rules "lift_let_over_if_both", "lift_let_over_if_true_dominated", and "lift_let_over_if_false_dominated", but this is only partially successful: there are still cases where RLO would have lifted an expression but ksc will not, where the lifting is safe (regardless of the expression) because the expression is guaranteed to execute anyway. Specifically:
The ksc rewrites here will only allow lifting in the case where the
......
is empty, but this is safe even when ..... is non-empty. As part of One Theorem Prover, some kind of "let-pushing-down" (let-dropping?) rule will be needed here, not in this PR.