Open toelli-msft opened 4 years ago
You mean to the IR? It's quite a sugary change? We're trying to reduce sugar in the IR. If we want a lisp-like surface, we should just translate Scheme, Racket, or Clojure.
Ah I see. Yes, that would go against our wish to minimise our core IR.
This came up again in the context of #274 -- let's discuss more :)
From #274:
We can, without much trouble, just allow existing let
to pattern match:
(let ((a1 a2 a3 ... an) t) body)
And it would make some of the tupling/detupling in our AD look better.
This change has more impact across the rest of the codebase than I expected. To take a random handful of functions that will need updating to accommodate this change:
substEMayCapture
substExpr
optLetsE
occAnalE
Typechecking too! And optimisation. It's a far reaching change.
This is assuming we don't just do it as a preprocess? And that we can't just do that, because we need it done deeply for dup? So what's the time estimate? It feels of the order of removing vector sizes?
Yes, removing vector sizes seems like a good order-of-magnitude equivalent.
We could start by implementing it as a preprocess. That would take between a couple of hours and half a day. A preprocess wouldn't be sufficient to allow SUL-based AD to work though. SUL-based AD needs actual untupling in the AST.
Thinking of it as pre-processing step makes this only a programmer convenience, which is not a high priority. I would say not worth the bother.
The main point that I can see is that it allows us to define a single-use variant of ksc: a bound variable is used exactly once (or maybe at most once) in its scope. So we can say
let (x1, x2) = dup x in ...x1...x2...
If we didn't have pattern-matching let we couldn't do that. We'd have to say
let t = dup x in
let x1 = fst t in
let x2 = snd t in
...x1...x2....
and now t
is used twice, which defeats the whole object of the exercise.
TL;DR: I think there is only a point in putting a pattern matching let into ksc if we carry it right through, as part of the core language, understood by the optimiser, rewriter, code generator, etc. But let's keep its as simple as poss: just one level of pattern match.
We could then desugar away multi-arg functions
f (x, y) = blah
So the additional complexity does allow a small simplification too. is desugared to
f t = let (x,y) = t in blah
Ah yes, it's nice to think of this as giving us "Single Use Form" as opposed to "Single Use Language". If we have tuple unpacking we can embed the Single Use Language inside KS. I will look it more detail about what exactly needs to be done.
I believe this introduces a parse ambiguity:
(let (l r) body) ; single binding
(let ((l r) (f 3)) body) ; multiple bindings l=r, f=3, OR single tuple-binding (l, r) = f(3)
(let ((l1 r1) (l2 r2) (l3 r3)) body) ; multiple bindings
Potential fixes:
(let ((tuple l1 r1) (l2 r2)) body) ; tuple (l1 r1) bound to call (l2 r2)
(let (((tuple l1 r1) (l2 r2)) (x 3) ((tuple p q) t)) body) ; multiple bindings
2. Require even single bindings to be nested lists, i.e. replace
```lisp
(let (l r) body) ; single binding
with
(let ((l r)) body) ; single binding
(let ((a (f 1))
(b (f 2))
(c (f 3)))
body)
with
(let (a (f 1))
(let (b (f 2))
(let (c (f 3))
body)))
I kind of prefer this, as most autogenerated ks won't use the sugared form, and a profusion of let
s makes it easier to see bindings, and the pretty printer already left-aligns lets. It also removes some really unnecessary sugar in place of a needed construct (for SUF).
@toelli-msft @simonpj
I'm fine with requiring multiple lets for multiple bindings.
Interesting, I hadn't realised that. Yes it's ambiguous and what happens will be determined by the operational semantics of the Parsec parser. broken
does not work; the works...
functions do work. It seems like it tries to parse the first list as a list of binders and if it can then it assumes it's a tuple-unpacking let. If it can't parse the first list as a list of binders then it tries again to parse it, this time as a binder and as expression.
(def broken Float ((x : Float) (y : Float))
(let ((a x) (bar y))
(add a bar)))
(def works Float ((x : Float) (y : Float))
(let ((a z) (bar y))
(add a z)))
(def works_also Float ((x : Float) (y : Float))
(let ((a 1.0) (bar y))
(add a bar)))
(def works_also2 Float ((x : Float) (y : Float))
(let ((a (neg x)) (bar y))
(add a bar)))
Thanks Tom. Which of the above options 1-3 do you prefer? Simon and I seem to be plumping for 3.
I still prefer (let ((tuple x y) t) ...)
, firstly because it seems strange to me to require a special form (tuple
) for constructing tuples but not take advantage of it in the pattern match syntax, and secondly because (KINAUFL notwithstanding) I do hand-edit a lot of complex .ks
files (like gmm.ks
and adbench-lstm.ks
) and the sugar for multiple bindings saves a lot of closing parens that would otherwise be awkward.
So,
(tuple a b c)
constructs. But up here in syntax land, tuple is not distinguishable from any other "n-arg" function. From up here, it's a function call, not a keyword. There's also (vec 1 2 3)
, which is a function call.let
s,, but when pretty printing files like gmm and bert, I found I was preferring to see the chain of let
s, and the closing parens are in my view less onerous than typing out the let. Most LISP style guides say something like Place all trailing parentheses on a single line, which leads to code like this, which I find quite readable.
; This example calculates (the number of positive elements, and their product) in a vector.
(def fold_example (Tuple Integer Float) (v : Vec Float)
(fold (lam (s_vi : Tuple (Tuple Integer Float) Float)
(let ((s vi) s_vi)
(let ((s_num s_prod) s)
(if (gt vi 0.0)
(tuple (add s_num 1) (mul s_prod vi))
(tuple s_num s_prod)))))
(tuple 0 1.0)
v))
As it happens, the above is from syntax-primer.ks, because when updating the Python parser to take tuple-bound lets, I resolved the ambiguity using method 3.
We've discussed occasionally whether we should add tuple pattern matching. I think it's a good idea and would make a lot of our code neater. Finding that it is also supported in C++ 17 has convinced me that we should just go ahead and do it!
https://skebanga.github.io/structured-bindings/