Open jeisner opened 11 years ago
p.s. I really like this design and this presentation of it. I got close to it in inference2.tex, where I played around with a couple of variants. But this version feels like the One True Path.
Priority?
On Tue, Jul 2, 2013 at 12:04 AM, Jason Eisner notifications@github.comwrote:
p.s. I really like this design and this presentation of it. I got close to it in inference2.tex, where I played around with a couple of variants. But this version feels like the One True Path.
— Reply to this email directly or view it on GitHubhttps://github.com/nwf/dyna/issues/34#issuecomment-20325236 .
The right way to implement
||
is ...
Or, just define it as a built-in with non-standard behavior. We already have ?foo
as such a non-standard built-in: like foo || bar
, it doesn't necessarily fail on null arguments. Another non-standard built-in is expression-if, if x then y else z
(perhaps also expressible as x ? y : z
), which is equivalent to
temp = y if x.
temp = z if !x.
Priority?
How hard is it to implement? It would be nice for the tutorial, and for giving students more leeway to play around without getting into trouble the way I did. (We'd tell them about the lenient versions and not about the strict versions.) However, I am not sure it is crucial for anything in the LI class.
This is really not easy to fix. Our ANF has obliterated whatever structure we had, so "foo && bar" is now "F is foo, B is bar, R is F && B, R", and that of course turns into loops which bail out if either foo or bar is null. Null is not plumbed through the expression as a value in its own right and is not available to primitives. This has, AFAIK, always been our understanding of expressions -- null is an annihilator; thus ? and other "N+1-state operators" operators need to avoid evaluation within the current rule.
Thus, we would need to define &&/2 and ||/2 to quote their arguments (just as with ?/1, which we don't have either yet, for the same reason; see below) and then write (or inline)
X && Y := false.
X && Y := true for *X & *Y.
making use of indirect evaluation, which we cannot do right now. There's a reserved word in DOpAMine for it, but that's about as far as we've gotten down this path.
(?/1's definition is ?(X) := false. ?(X) := true for _ is *X.
Note that for _ is X
does not work, since variables never autoevaluate, so is
placing its rhs in such a context does nobody any good.)
(There's also something of a mixed bag here for our future type system -- presumably we are required to pass only terms which are statically known to be evaluable to such operators? Or maybe we just have to statically know whether the arguments are evaluable and can then inline the definition so that we do not have to actually do the evaluation? Dynamic dispatch seems like it would require some interesting encoding tricks in the run-time system.)
So let's wait until we are farther along on indirect evaluation (or special forms) until we try this.
The alternative, as I mentioned, is to transform ||
out of the program and replace it with a temporary variable aggregated with ||=
. But that would screw up the way that we report expressions involving ||
in queries.
I like your more principled approach better, namely for ||
to quote its arguments and then we can define
(X || Y) ||= *X.
(X || Y) ||= *Y.
(@nwf, it is ||
that is the hard one. Your translation of &&
is wrong. In fact &&
is identical to &
except for error handling.)
The if-then-else operator. The "branch not taken" is allowed to be null or $error
without corrupting the branch that is taken. I think we should pull the same trick as with ||
above, and define
(if X then Y else Z) = *Y for X.
(if X then Y else Z) = *Z for !X.
where the if-then-else operator suppresses autoevaluation of its Y and Z arguments. Note that this definition piggybacks on the comma definition below (via for
which is sugar for comma with args reversed).
Using the comma definition below implies that if X itself is $error
, then the result is always $error
unless Y and Z are both null. It is conceivable, however, that we should change this so that if Y and Z have the same value, then the result is that value even if X is in error.
Note that this apparently equivalent definition wouldn't work, because then (if $error then 0=1 else 123)
will give 123
instead of $error
:
(if X then Y else Z) := *Z.
(if X then Y else Z) := *Y for X.
Remark: The simpler (if X then Y)
(assuming we allow it) should be interpreted as just the first rule above, or equivalently as (if X then Y else 0=1)
. This makes it yet another synonym for X,Y
or Y for X
.
The = and := aggregators. Remember that if the winning aggregand is $null
then we say that the result is null. This is different from the other cases on this ticket, because it is about null output rather than null input.
Annihilations? We've considered the possibility that 0 * $error
should give 0
, that infinity max $error
should give infinity
, etc.
Certain aggregations.
:=
should allow error values to be overridden by later non-error values. *=
returns 0 if any of its aggregands are 0, etc.The comma operator. The comma operator is left-to-right lenient with respect to errors:
false,$error
just returns null because we never get to the error value. (For example, foo(X) = X > 0, 1/X
is explicitly guarding the division to avoid an error, and shouldn't be poisoned by the error that would result from actually doing the division.)
But conversely, $error,X
always returns $error
(because we "don't know" whether the test succeeded or not). ... except when X
is null in which case I think it returns null (so that if an aggregand X
doesn't even exist we don't have to worry about whether its guards evaluate sensibly).
So comma is not commutative because false,$error
is more lenient than $error,false
. In this way, comma differs from &&
(in which false
always beats $error
) as well as from &
(in which $error
always beats false
).
(In fact, it is a lot like left-to-right &&
in C, Perl, etc.)
A comma sequence A,B,C,D,E
should be parsed as (A,(B,(C,(D,E))))
. This turns out to be equivalent to (A && B && C && D), E
, which is good because it means that it treats null and false equivalently in the guards, as we expect. That is, it behaves like this:
false
, then return null$error
, then return $error
A,B,C,D
must be true since they are required to be boolean)So we see that comma is not associative either. We just saw that A,B,C
means A,(B,C)
which is equivalent to (A && B),C
which is not equivalent to (A,B),C
. Specifically, $error,false,123
is null, but ($error,false),123
gives $error
.
(What's happening with ($error,false),123
is that we "don't know" the value of the guard ($error,false)
-- it comes out as $error
which pollutes the whole thing. We've "lost track" of the fact that (X,false)
is always going to be false or null so that it will nix the 123 regardless -- once it's in error, it might as well be true for all we know. This does interfere with optimization, or a future fine-grained error system where if (X,false)
is tightly typed as having type :false
, then ($error,false)
gives an error value of type :false
which might be good enough to annihilate the whole expression ($error,false),123
.)
Remark: Some versions of LISP used basically the trick that we are using here: that some functions (defined with nlambda
instead of lambda
) suppress the usual automatic evaluation of their arguments, but are free to call eval on them.
Common Lisp instead decided to distinguish between functions and macros. It does have a few "special forms" -- lists (func arg1 arg2 ...)
that are not evaluated by the usual mechanism of evaluating all of arg1
, arg2
, etc. and calling the function named by func
on the list of results. These special forms are the cases where func
is one of the symbols quote
, if
, catch
, let
, declare
, and a number of others. All other special evaluation stuff in Common Lisp is done, often in terms of these special forms, via macros (which we arguably don't need in Dyna because we don't need to manage side effects or inlining). From the Common Lisp manual:
The set of special forms is fixed in Common Lisp; no way is provided for the user to define more. The user can create new syntactic constructs, however, by defining macros.
The set of special forms Common Lisp is purposely kept very small because any program-analyzing program must have special knowledge about every type of special form. Such a program needs no special knowledge about macros because it is simple to expand the macro and operate on the result expansion. It goes on to say that as a matter of implementation convenience or efficiency, compilers may want to change which forms are handled as special forms vs. macros, provided that other program transformers and analyzers can rely on the definition given here. This is discussed carefully.
|
and&
are strict boolean operators. (No, not bitwise integer operators, I don't think.) Like other normal operators, they return null if they have any null argument, and otherwise $error if they have any $error argument.But we also need lenient versions that will treat null arguments sort of like false (three-valued logic). I just fell into the trap myself by using a strict operator in
This returned only sentences S that contained both "who" and "what", because the
|
expression fails if either argument is null. I should have written the query asso that
null
values would not cause the expression to fail.||
and||=
behave like this:And
&&
and&&=
are the same but upside-down:However, nulls are never passed to aggregators (an aggregator receives a set of non-null values). Hence the test "if any arguments are null" cannot succeed with
&&=
, which only returns null if all of its arguments are null.Thus,
||=
is just like|=
except thattrue
beats$error
rather than vice-versa. And&&=
is just like&=
except thatfalse
beats$error
rather than vice-versa. The fact that errors can be masked is similar to the way that lazy||
and&&
are often used in C and its descendants, e.g.,except that our version doesn't care about the order of the arguments.
Note that
&&
is a normal operator in that null arguments cause it to fail and return null. However,||
is crucially not normal in this way! The right way to implement||
is to first define||=
and then say thatfoo || bar
is syntactic sugar for the temp item defined by[5/29/17: Or I think we could just say that
foo || bar
is the same as(||= foo; bar)
, couldn't we? Note again thatfoo && bar
can't be written in this way. These are however the same as(|= foo; bar)
andfoo & bar
except that those have stricter error handling in this design, and it's not totally clear that we ever really want that strict error handling.]