Open simonpj opened 3 years ago
I'm fairly strongly opposed to this without additional rationale. It will churn the codebase in return for minimal benefit, as far as I can see.
Regarding the reasons to do this:
It's simpler
The proof of the simplification pudding will be in the PR eating so let's see how https://github.com/microsoft/knossos-ksc/pull/541 turns out. (I would prefer the PR to pass the tests before I review it.)
It's more beautiful (the intro/elim duality)
I'm puzzled by the duality here. To be completely symmetrical if we were to change def_pat
to take a TVar
data DefX p -- f x = e
= Def { def_fun :: Fun
, def_pat :: TVar <------------ exactly one arg
, def_res_ty :: TypeX
, def_rhs :: RhsX p }
wouldn't duality/beauty imply that App
[1] should take a Var
?
data ExprX p
= ...
| App (ExprX p) (Var p) <-------- exactly one arg
...
In fact the status quo feels more symmetrical to me than the proposal. An App
can be passed a variable as an argument; a Def
can have a variable pattern as an argument. An App
can be passed a literal tuple as an argument; a Def
can have a tuple pattern as an argument. The only lack of symmetry is that an App
can be passed an arbitrary expression as an argument but there is no "arbitrary expression pattern" (view pattern).
To me "symmetry" makes the case for either only allowing App
to be applied to a variable or adding view patterns. Removing tuple patterns doesn't seem to increase symmetry.
Perhaps there is some type theoretic aspect regarding "intro/elim" duality that I'm missing.
Currently we have to ensure that ... must generate the same C++
I think this falls under "simplicity" so as above let's observe https://github.com/microsoft/knossos-ksc/pull/541 to see how this plays out in practice
Regarding the reasons not to do this
It's work
Agreed
It doesn't fix a bug
It is true that it doesn't fix a bug (or add a new feature) but I believe that simplifying the codebase should also be a high priority activity, so I don't object on this ground.
We specifically added Pat to Def in this commit ...
If this functionality is no longer needed for its original purpose, and has not picked up a new specific purpose since, then I agree that removing it should not be prevented by it having previously been added for some purpose! However, see my comments about churn below.
My objection is mainly an objection to churn.
When we started, Def
s had a list of arguments and Call
s had a single argument. In March 2019 we changed Call
s to take a list of arguments (bd7fe43eaf75f0182b11e5b680bc8fef1ea147f6). Then in Feb 2020 we made everything have a single argument (specifically Def
s to take only a Var
) (e47f152b6b0f7d6de40d9ca3507f241c262697be). Then in May 2020 we allowed tuple patterns in Def
s as well as variable patterns (74ffe5481bdb574c3df8f1ccb6768e8e5fa035df). We're now proposing a fourth iteration!
None of the changes were necessary. All the changes were for convenience. In fact the only similar change that actually was necessary[2] was my suggestion for tuple pattern matching in let which took eight months to be accepted (5fe54363c24cbcda90c8f3ec486175f985c32cf1) (and is now touted as the reason that we don't need tuple patterns in Def
s).
If I really believed that this would be the final iteration then I would be more comfortable with this change. However, I think that whatever we choose we'll come across some rough edges, specifically because
I fear that we'll make this change, churn the codebase, incur the cost of reviewer time and effort, modify everyone's existing understanding of how things work pervasively across the codebase, and yet in six months we'll become irritated with the new status quo and want to change again!
Please let's not make another pervasive change for mere convenience. I am not aware that any aspect of ksc is hard to maintain without this change. On the other hand this change risks introducing subtle bugs and becoming inconvenient in as yet unanticipated ways.
In summary, churn has a big cost. In order to want to pay the cost I think we need positive answers to two questions:
[1] actually we use Call
in practice, but the same applies
[2] for SUF/BOG-AD
My own concern here is that we're talking about the internals of ksc, but I'm not sure we've actually pinned down the KSIR semantics. As @awf noted on PR 530, ksc treats (f x y)
as being interchangeable with (f (tuple x y))
, but (f x)
has a different meaning from (f (tuple x))
. I'm not sure whether we've actually discussed this and made it part of our spec. (In fact before this came up I wasn't sure whether 1-tuples were meant to be permitted at all.) Do our other components parse in the same way? I'm a bit worried that the change suggested here would bake in this way of doing things into ksc, before we've had a chance to decide whether it actually has the semantics that we want.
I'm not sure we've actually pinned down the KSIR semantics
Your question is about whether one-tuples are different from their singleton contents. We do have that pinned down! At the moment
It's possible that some parts of our codebase don't respect this semantics, but it's the status quo.
It would be quite possible to allow 1-tuples:
A tiny way in which 1-tuples would complicate matters is this. Currently we parse (f a b)
to (App f (Tuple [a,b]))
, but we parse (f a)
to just (App f a)
. I think that, if we had 1-tuples, we'd still want to parse (f a)
to (App f a)
; to get (App f (Tuple [a])
you'd have to write (f (tuple1 a))
. Which is fine.
But anyway, all this belongs on a separate ticket. It's a relatively unimportant choice, and one that is largely unrelated to this ticket I think. (Which is good -- we can make independent decisions.)
For the record, there is a technicality around singleton tuples, which is that most parts of ksc do actually support them (perhaps by accident). To the best of my knowledge the only ksc component that doesn't support them is linear map AD. In fact the $check
primitive depends on singleton tuples in an essential way (perhaps it ought not to). The parser supports them, the Expr
type supports them, the Type
type supports them, the backends support them etc..
For example, the following compiles fine if linear map AD is disabled:
(def singleton (Tuple Float) (xt : Tuple Integer)
(let ((x) xt)
(tuple (to_float x))))
(def use_singleton Float (ignored : Tuple)
(let ((x) (singleton (tuple 1)))
(add x x)))
I have a branch which statically forbids singleton tuples (toelli/no-singleton
) but I don't really think it's worth doing so.
In any case, as Simon says, the (f a)
/(f a b)
/(f (tuple a))
/(f (tuple a b))
business is a parsing issue only, I hope that is sufficiently well explained by the notes in Parse.hs
and Lang.hs
. If not then let's improve the notes.
For the record, there is a technicality around singleton tuples, which is that most parts of ksc do actually support them (perhaps by accident).
I noticed when reviewing the large polymorphism PR, that Cgen doesn't fully support this because functions which take a single singleton-tuple argument are called/mangled as if they took a non-tuple argument. So although these functions are treated as being different by ksc, they can't actually overload each other:
(edef f Integer (Float))
(edef f Integer ((Tuple Float)))
(def main Integer () (add (f 0.0) (f (tuple 1.0))))
ty$main main(ks::allocator * $alloc) {
int c$0 = f$af($alloc, 0.0);
int c$1 = f$af($alloc, 1.0);
int c$2 = add$aii($alloc, c$0, c$1);
return (c$2);
}
Ah, nice observation! (We could overcome this obstacle though, if we wanted singleton tuples.)
And this is a mangling issue -- arguably the second f should mangle to encode("f@<f>")
rather than encode("f@f")
- we don't depend on C++ overload resolution.
Is this closeable?
Simon came up with https://github.com/microsoft/knossos-ksc/pull/541 to address this issue. However, I would prefer that we do not address this issue at this time because we have a lot of other stuff on our plate.
Currently we have:
Tuples
tuple
let pat = rhs in body
Functions
def
App
inExpr
).Currently the intro/elim for functions is asymmetrical:
My proposal is to simplify
Def
toReasons to do this:
Pat
is eitherVarPat
orTuplePat
).must generate the same C++ code. More precisely, they must have the same calling convention, since a call
f1 e
andf2 e
look identical (a tuple-typed argument) so the caller will call a C++ function with two arguments. If we only had one form, we wouldn't need to check that it has the same calling convention as the other.Possible reasons not to do this:
spj-catlang
and may or may not be resuscitated.Concerning #538, when code-generating to C++ (Cgen), we want to unpack a tuple argument into multiple arguments to the C++ function -- and we might even want to unpack nested tuples. It may be convenient to do this as a pre-processing pass, just prior to Cgen; with the above change we can't have a
Def
with multiple args any more.Fine: don't pollute
Def
; instead define a local data type just for code-gen purposes. It's a local, tactical issue concerned with how to make a modular code generator, and should only have local consequences.