Open masak opened 4 years ago
Here's a general request to myself to do the same kind of analysis for best
. Would be interesting.
Also, the code from #136 here — specifically, can the ccc
be optimized away after enough inlining? (I think it can.)
I found another fun one yesterday: even
does not work on chars, because you get a cdr-on-atom
error thrown from c*
. But the fact that you will is statically knowable, because all that happens all the way from calling (even \x)
to getting the error in c*
could be inlined and abstractly evaluated.
Show this. :smile: Again, I'm making a request to my future self. Even better, have a goodenuf static "type checker" (or argument checker) that could flag up this kind of error.
Two more focused questions:
Which global definition is the first one that would benefit from interleaved inlining/specialization?
Could randlen
be specialized to the extent that it contains only the direct computation of the number, something like a repeated multiplication and addition with a (coin)
call in the middle? I.e. is it reasonable to expect that an abstract evaluator can specialize the code all the way down to that?
- Which global definition is the first one that would benefit from interleaved inlining/specialization?
I think the answer might be snoc
and list
, both of which would benefit from inlining and specializing an append2
function, an already specialized two-argument version of append
.
The only other earlier candidate is cons
, but that looks to me like a pure inlining, with no interleaved specialization.
Here comes the successive simplification of the board-movements
code.
We start with the following code (nicely annotated with comments):
;; This template defines a robot with coordinates (x, y) and direction
;; vector (dx, dy). The coordinate system has the positive y direction
;; pointing downwards, and (0, 0) is in the upper left corner.
;; The robot is not allowed to leave the rectangle spanned by (0, 0)
;; (inclusive) and (w, h) (exclusive). Note that the (w, h) coordinates
;; are not specified here; they are parameters to `run` and `runcmd`.
(tem robot x 0 y 0 dx 0 dy -1)
;; A common pattern in Bel is to define a global, initially an empty
;; list to act as a namespace for some given purpose...
(set actions nil)
;; ...and then to define a macro that will add things to this global.
;; Here, we want to define five actions that control the robot.
(mac defaction (cmd name parms . body)
`(push (cmd . (fn ,parms ,@body)) actions))
;; The first action is a bit of an exception. It aborts the run early
;; so that no more commands are executed. This is achieved through a
;; `throw` action, a non-local command which will find us back in
;; `run`, the top-level function. The list passed to `throw`
;; conveniently serves as a return value.
(defaction 0 quit (r)
(throw (list r!x r!y)))
;; This helper function makes it easy to talk about moving the robot
;; `r` forwards or backwards, as specified by the operator `op`.
;; Pass `+` and you move the robot forwards; pass `-` and you move it
;; backwards. This is then used exactly in this way in the two
;; subsequent actions.
(def move (op r)
(zap op r!x r!dx)
(zap op r!y r!dy))
(defaction 1 forward (r)
(move + r))
(defaction 2 backward (r)
(move - r))
;; Changing the robot's direction comes down to cycling through the
;; four directions (0, -1), (1, 0), (0, 1), and (-1, 0); forwards in
;; this list if we're going clockwise, otherwise backwards. There are
;; many ways to do this, but the below is perhaps the simplest -- the
;; `swap` makes sure to juggle the 1 between the x and y coordinates,
;; and the `zap inv` switches the sign at the appropriate moments.
(defaction 3 turncw (r)
(swap r!dx r!dy)
(zap inv r!dx))
(defaction 4 turnccw (r)
(swap r!dx r!dy)
(zap inv r!dy))
;; If the robot ever leaves its designated rectangle, we throw an
;; exception and the run aborts early so that no more commands are
;; executed. This is then checked after every command, although
;; technically it's only necessary to check after forward and
;; backwards moves.
(def checkbounds (v max)
(or (<= 0 v (- max 1))
(throw 'impossible-move)))
;; To run an individual command, we look up the `cmd` number in
;; our `actions` association list. If we find it there, we run its
;; associated action and check the bounds. If we don't, we throw
;; an exception and abort early so that no more commands are
;; executed.
(def runcmd (r w h cmd)
(aif (get cmd actions)
(let action (cdr it)
(action r)
(checkbounds r!x w)
(checkbounds r!y h))
(throw 'illegal-command)))
;; Given some dimensions (w, h), an initial robot position (x, y),
;; and zero or more commands (numbers 0..4), the `run` function
;; executes the commands one by one and returns the robot's final
;; coordinates. `catch` is the context that all three `throw`s
;; above bubble up to. `map` is the most idiomatic way to write
;; `foreach` in Bel.
(def run ((w h) (x y) cmds)
(let r (make robot x x y y)
(catch
(map [runcmd r w h _] cmds)
(list r!x r!y))))
I'm going to wish away the comments, just to make it easier now to do code transformations.
The first thing I see that we can do is to inline runcmd
completely. So let's do that.
@@ -30,16 +30,15 @@
(or (<= 0 v (- max 1))
(throw 'impossible-move)))
-(def runcmd (r w h cmd)
- (aif (get cmd actions)
- (let action (cdr it)
- (action r)
- (checkbounds r!x w)
- (checkbounds r!y h))
- (throw 'illegal-command)))
-
(def run ((w h) (x y) cmds)
(let r (make robot x x y y)
(catch
- (map [runcmd r w h _] cmds)
+ (map (fn (cmd)
+ (aif (get cmd actions)
+ (let action (cdr it)
+ (action r)
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ (throw 'illegal-command)))
+ cmds)
(list r!x r!y))))
We will never get rid of that map
, by the way. cmds
is a parameter to the (topmost) run
function, and unknown to us.
Next up we do a big inlining of the entire actions
data structure. (Actually, an intermediate step in this one would probably be to create the actions
list in code first.) This is a pretty big step, but let's say we can do it like this.
@@ -1,31 +1,9 @@
(tem robot x 0 y 0 dx 0 dy -1)
-(set actions nil)
-
-(mac defaction (cmd name parms . body)
- `(push (cmd . (fn ,parms ,@body)) actions))
-
-(defaction 0 quit (r)
- (throw (list r!x r!y)))
-
(def move (op r)
(zap op r!x r!dx)
(zap op r!y r!dy))
-(defaction 1 forward (r)
- (move + r))
-
-(defaction 2 backward (r)
- (move - r))
-
-(defaction 3 turncw (r)
- (swap r!dx r!dy)
- (zap inv r!dx))
-
-(defaction 4 turnccw (r)
- (swap r!dx r!dy)
- (zap inv r!dy))
-
(def checkbounds (v max)
(or (<= 0 v (- max 1))
(throw 'impossible-move)))
@@ -34,11 +12,29 @@
(let r (make robot x x y y)
(catch
(map (fn (cmd)
- (aif (get cmd actions)
- (let action (cdr it)
- (action r)
- (checkbounds r!x w)
- (checkbounds r!y h))
- (throw 'illegal-command)))
+ (case cmd
+ 0 (do
+ (throw (list r!x r!y))
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ 1 (do
+ (move + r)
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ 2 (do
+ (move - r)
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ 3 (do
+ (swap r!dx r!dy)
+ (zap inv r!dx)
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ 4 (do
+ (swap r!dx r!dy)
+ (zap inv r!dy)
+ (checkbounds r!x w)
+ (checkbounds r!y h))
+ (throw 'illegal-command)))
cmds)
(list r!x r!y))))
Inline move
.
@@ -1,9 +1,5 @@
(tem robot x 0 y 0 dx 0 dy -1)
-(def move (op r)
- (zap op r!x r!dx)
- (zap op r!y r!dy))
-
(def checkbounds (v max)
(or (<= 0 v (- max 1))
(throw 'impossible-move)))
@@ -18,11 +14,13 @@
(checkbounds r!x w)
(checkbounds r!y h))
1 (do
- (move + r)
+ (zap + r!x r!dx)
+ (zap + r!y r!dy)
(checkbounds r!x w)
(checkbounds r!y h))
2 (do
- (move - r)
+ (zap - r!x r!dx)
+ (zap - r!y r!dy)
(checkbounds r!x w)
(checkbounds r!y h))
3 (do
Inline checkbounds
.
@@ -1,9 +1,5 @@
(tem robot x 0 y 0 dx 0 dy -1)
-(def checkbounds (v max)
- (or (<= 0 v (- max 1))
- (throw 'impossible-move)))
-
(def run ((w h) (x y) cmds)
(let r (make robot x x y y)
(catch
@@ -11,28 +7,38 @@
(case cmd
0 (do
(throw (list r!x r!y))
- (checkbounds r!x w)
- (checkbounds r!y h))
+ (or (<= 0 r!x (- w 1))
+ (throw 'impossible-move))
+ (or (<= 0 r!y (- h 1))
+ (throw 'impossible-move)))
1 (do
(zap + r!x r!dx)
(zap + r!y r!dy)
- (checkbounds r!x w)
- (checkbounds r!y h))
+ (or (<= 0 r!x (- w 1))
+ (throw 'impossible-move))
+ (or (<= 0 r!y (- h 1))
+ (throw 'impossible-move)))
2 (do
(zap - r!x r!dx)
(zap - r!y r!dy)
- (checkbounds r!x w)
- (checkbounds r!y h))
+ (or (<= 0 r!x (- w 1))
+ (throw 'impossible-move))
+ (or (<= 0 r!y (- h 1))
+ (throw 'impossible-move)))
3 (do
(swap r!dx r!dy)
(zap inv r!dx)
- (checkbounds r!x w)
- (checkbounds r!y h))
+ (or (<= 0 r!x (- w 1))
+ (throw 'impossible-move))
+ (or (<= 0 r!y (- h 1))
+ (throw 'impossible-move)))
4 (do
(swap r!dx r!dy)
(zap inv r!dy)
- (checkbounds r!x w)
- (checkbounds r!y h))
+ (or (<= 0 r!x (- w 1))
+ (throw 'impossible-move))
+ (or (<= 0 r!y (- h 1))
+ (throw 'impossible-move)))
(throw 'illegal-command)))
cmds)
(list r!x r!y))))
All the throw
s are now lexically nested under the catch
(and we have some kind of vague guarantee that there aren't others hiding somewhere else). We are now in a position to eliminate them all. These would actually be replaced by bytecode goto
instructions, but let's try to express them through pseudo-Bel goto
and label
forms:
@@ -2,43 +2,56 @@
(def run ((w h) (x y) cmds)
(let r (make robot x x y y)
- (catch
+ (let return-value nil
(map (fn (cmd)
(case cmd
0 (do
- (throw (list r!x r!y))
+ (set return-value (list r!x r!y))
+ (goto 'abort)
(or (<= 0 r!x (- w 1))
- (throw 'impossible-move))
+ (do (set return-value 'impossible-move)
+ (goto 'abort)))
(or (<= 0 r!y (- h 1))
- (throw 'impossible-move)))
+ (do (set return-value 'impossible-move)
+ (goto 'abort))))
1 (do
(zap + r!x r!dx)
(zap + r!y r!dy)
(or (<= 0 r!x (- w 1))
- (throw 'impossible-move))
+ (do (set return-value 'impossible-move)
+ (goto 'abort)))
(or (<= 0 r!y (- h 1))
- (throw 'impossible-move)))
+ (do (set return-value 'impossible-move)
+ (goto 'abort))))
2 (do
(zap - r!x r!dx)
(zap - r!y r!dy)
(or (<= 0 r!x (- w 1))
- (throw 'impossible-move))
+ (do (set return-value 'impossible-move)
+ (goto 'abort)))
(or (<= 0 r!y (- h 1))
- (throw 'impossible-move)))
+ (do (set return-value 'impossible-move)
+ (goto 'abort))))
3 (do
(swap r!dx r!dy)
(zap inv r!dx)
(or (<= 0 r!x (- w 1))
- (throw 'impossible-move))
+ (do (set return-value 'impossible-move)
+ (goto 'abort)))
(or (<= 0 r!y (- h 1))
- (throw 'impossible-move)))
+ (do (set return-value 'impossible-move)
+ (goto 'abort))))
4 (do
(swap r!dx r!dy)
(zap inv r!dy)
(or (<= 0 r!x (- w 1))
- (throw 'impossible-move))
+ (do (set return-value 'impossible-move)
+ (goto 'abort)))
(or (<= 0 r!y (- h 1))
- (throw 'impossible-move)))
+ (do (set return-value 'impossible-move)
+ (goto 'abort))))
(throw 'illegal-command)))
cmds)
- (list r!x r!y))))
+ (set return-value (list r!x r!y))
+ (label 'abort)
+ return-value)))
Through some additional data and control flow analysis, we can eliminate some redundant checks.
@@ -7,49 +7,25 @@
(case cmd
0 (do
(set return-value (list r!x r!y))
- (goto 'abort)
- (or (<= 0 r!x (- w 1))
- (do (set return-value 'impossible-move)
- (goto 'abort)))
- (or (<= 0 r!y (- h 1))
- (do (set return-value 'impossible-move)
- (goto 'abort))))
+ (goto 'abort))
1 (do
(zap + r!x r!dx)
(zap + r!y r!dy)
- (or (<= 0 r!x (- w 1))
- (do (set return-value 'impossible-move)
- (goto 'abort)))
- (or (<= 0 r!y (- h 1))
+ (or (and (<= 0 r!x (- w 1)) (<= 0 r!y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
2 (do
(zap - r!x r!dx)
(zap - r!y r!dy)
- (or (<= 0 r!x (- w 1))
- (do (set return-value 'impossible-move)
- (goto 'abort)))
- (or (<= 0 r!y (- h 1))
+ (or (and (<= 0 r!x (- w 1)) (<= 0 r!y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
3 (do
(swap r!dx r!dy)
- (zap inv r!dx)
- (or (<= 0 r!x (- w 1))
- (do (set return-value 'impossible-move)
- (goto 'abort)))
- (or (<= 0 r!y (- h 1))
- (do (set return-value 'impossible-move)
- (goto 'abort))))
+ (zap inv r!dx))
4 (do
(swap r!dx r!dy)
- (zap inv r!dy)
- (or (<= 0 r!x (- w 1))
- (do (set return-value 'impossible-move)
- (goto 'abort)))
- (or (<= 0 r!y (- h 1))
- (do (set return-value 'impossible-move)
- (goto 'abort))))
+ (zap inv r!dy))
(throw 'illegal-command)))
cmds)
(set return-value (list r!x r!y))
We are now in a position to eliminate the robot itself, through scalar elimination. Goodbye, robot!
@@ -1,33 +1,31 @@
-(tem robot x 0 y 0 dx 0 dy -1)
-
(def run ((w h) (x y) cmds)
- (let r (make robot x x y y)
+ (let (dx dy) (list 0 -1)
(let return-value nil
(map (fn (cmd)
(case cmd
0 (do
- (set return-value (list r!x r!y))
+ (set return-value (list x y))
(goto 'abort))
1 (do
- (zap + r!x r!dx)
- (zap + r!y r!dy)
- (or (and (<= 0 r!x (- w 1)) (<= 0 r!y (- h 1)))
+ (zap + x dx)
+ (zap + y dy)
+ (or (and (<= 0 x (- w 1)) (<= 0 y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
2 (do
- (zap - r!x r!dx)
- (zap - r!y r!dy)
- (or (and (<= 0 r!x (- w 1)) (<= 0 r!y (- h 1)))
+ (zap - x dx)
+ (zap - y dy)
+ (or (and (<= 0 x (- w 1)) (<= 0 y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
3 (do
- (swap r!dx r!dy)
- (zap inv r!dx))
+ (swap dx dy)
+ (zap inv dx))
4 (do
- (swap r!dx r!dy)
- (zap inv r!dy))
+ (swap dx dy)
+ (zap inv dy))
(throw 'illegal-command)))
cmds)
- (set return-value (list r!x r!y))
+ (set return-value (list x y))
(label 'abort)
return-value)))
We're left with this quite cute minimized version:
(def run ((w h) (x y) cmds)
(let (dx dy) (list 0 -1)
(let return-value nil
(map (fn (cmd)
(case cmd
0 (do
(set return-value (list x y))
(goto 'abort))
1 (do
(zap + x dx)
(zap + y dy)
(or (and (<= 0 x (- w 1)) (<= 0 y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
2 (do
(zap - x dx)
(zap - y dy)
(or (and (<= 0 x (- w 1)) (<= 0 y (- h 1)))
(do (set return-value 'impossible-move)
(goto 'abort))))
3 (do
(swap dx dy)
(zap inv dx))
4 (do
(swap dx dy)
(zap inv dy))
(throw 'illegal-command)))
cmds)
(set return-value (list x y))
(label 'abort)
return-value)))
It's not so bad, actually. A lot less code duplication than I would have guessed. Fits on a screen. The biggest shortcoming is the label
/goto
fiction, which would make this difficult to run in practice.
Hm, could have also turned the case
into an if
along the way.
The map
is irreducible, but at least a compiler can turn it into a loop and eliminate the anonymous function. This is basically necessary, since we need to be able to do intra-procedural jumps from the loop body to the outside.
One more request: do dedup
, see if those rev
and part
calls can be eliminated altogether.
When I see this function definition:
...my first thought is that it's wasteful and extravagant to use
=
and notid
, and that this would be completely equivalent:Could a sufficiently smart compiler know that the former could be reduced to the latter? I say yes, and I will now prove that with a sequence of well-explained transformations (expressed as colorful diffs).
The first thing that we will evaluate is the subexpression
(type x)
, so we can extract it as a variable, and type-annotate it.By the way, every time we (the abstract evaluator) insert a
let
, think of it as using fresh/unique variables. We're using regular named/unhygienic variables here for ease of reading.We inline all of
=
:Turning our attention to the first condition in the
if
, we know statically that there's no way it can be true. So we eliminate it.Similarly, the next condition is true, as a number of steps of inlining and abstractly evaluating the
some
function call would show. We can therefore "bind" to that consequent and discard the rest. As a side effect, we've now discarded the=
recursion.Since
args
is a named list literal, we can abstractly evaluate(car args)
and(cdr args)
into their known values:...and eliminate the unused
let
:Now we inline
all
:Again, because of static knowledge of
xs
, we can eliminate the first condition of theif
entirely:We abstractly evaluate the destructurings on
xs
:And we can inline/expand the call to
f
:The recursive call to
all
hasnil
as its argument toxs
. Either we do enough small steps to realize that the result ist
, or we have a separate rule/pattern for that:Now the refcounts of the parameters bound to
all
have gone to zero:id
already gives backt
ornil
, so theif
is redundant:Finally, we inline
(type x)
again:I thought before writing this out that an abstract evaluator would be quite straightforward to write. Now, I'm even more convinced. Furthermore (I realize), this is quite well-suited for a test-driven approach.
Can start on something smaller, of course. But the above steps are fairly realistic to do by machine.