Open masak opened 8 years ago
This should happen at bin/007 startup, I think. Or some similarly early place. Builtins.pm should be considered a projection, and if it's blown away or made obsolete by a newer builtins.007, things should still work in the sense that a new, better Builtins.pm will just be written behind the scenes without any fuss. Of course, the lower the extra startup cost for just checking this, the better.
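A minimal sketch of that "regenerate the projection when it is missing or out of date" check, written in Python rather than Perl 6 to keep it self-contained. The names `is_stale` and `ensure_projection`, the `compile_builtins` callback, and the use of file modification times are all assumptions made for illustration; nothing here is taken from 007's code.

```python
from pathlib import Path
from typing import Callable


def is_stale(source: Path, target: Path) -> bool:
    """True if the generated target is missing or older than its source."""
    if not target.exists():
        return True
    return target.stat().st_mtime < source.stat().st_mtime


def ensure_projection(source: Path, target: Path,
                      compile_builtins: Callable[[str], str]) -> bool:
    """Rewrite target from source when needed; return True if it was rewritten."""
    if not is_stale(source, target):
        return False  # cheap path: two stat calls, no compilation
    target.write_text(compile_builtins(source.read_text()))
    return True
```

The only startup cost on the common path is the pair of `stat` calls, which is the part that would need to stay cheap.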
Or how about this: add a test that compiles the built-ins and compares the result against Builtins.pm. Source-control Builtins.pm. That way, there's zero startup cost, and any discrepancy between builtins.007 and Builtins.pm will be caught by prove most of the time and by Travis at the latest.
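The test idea can be sketched roughly as follows, again in Python for self-containment. `compile_builtins` here is a toy stand-in (it just sorts lines and adds a header), and `projection_diff` is a hypothetical helper name; the real test would run 007's own compiler over builtins.007 and compare against the checked-in Builtins.pm.

```python
import difflib
from typing import List


def compile_builtins(source_text: str) -> str:
    """Toy stand-in for the real compiler: emits a deterministic 'projection'."""
    lines = [ln.strip() for ln in source_text.splitlines() if ln.strip()]
    return "# GENERATED FILE, do not edit\n" + "\n".join(sorted(lines)) + "\n"


def projection_diff(source_text: str, checked_in: str) -> List[str]:
    """Unified diff between the checked-in file and a fresh compile.

    An empty list means the two agree; a test would assert exactly that.
    """
    fresh = compile_builtins(source_text)
    return list(difflib.unified_diff(
        checked_in.splitlines(), fresh.splitlines(),
        fromfile="Builtins.pm (checked in)",
        tofile="Builtins.pm (freshly compiled)",
        lineterm=""))
```

Returning the diff rather than a boolean means a failing test can show exactly what drifted, which is helpful when the failure first shows up in CI.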
Yes, I like that a lot better than compiling at startup. I keep forgetting that we have a good solution for mitigating the drawbacks of code duplication.
I'm also thinking we might leave the normal grammar alone, and subclass to shove in the extra rules. Feels like that'd also make abuse of the ⦃...⦄ parsing (or accidental contact with it in any way) a lot less likely.
> add a test that compiles the built-ins and compares the result against Builtins.pm.
This method also (I think; this kind of hurts to consider in depth) gets us out of a bootstrapping problem that I hadn't anticipated at all. Namely that, the more realistic builtins.007 gets, the more the Val and Q types will be declared purely in that file. But these types absolutely need to be in place when the parser gets going... so how would we parse builtins.007?
This is a non-issue if we load Builtins.pm from the parser, which can then parse builtins.007, which can then (among other things) generate a fresh Builtins.pm when needed. At most, we'll have to be a bit careful when introducing features that bootstrap; might have to carefully split up changes into several commits or such.
Having started down the path of this change in local code, I can now state that initiating the built-in opscope is
builtins-opscope
) also returning a fully-formed data structure.
The good news is that this looks very doable. Even better, it feels like a best-of-both-worlds kind of thing: we get to mix together these two concerns in the source file (builtins.007 or equivalent), but we get to have them separately in the target file (Builtins.pm). And tests will make sure they never diverge.
We can do this in stages:
self.declare-var
Here are some loose timings for how long it takes right now to populate the built-ins:
$ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-master-with-builtins
0.39727
$ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-master-no-builtins
0.36627
(8% time saving on the master branch, just from not loading the builtins at all.)
$ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-refactor-with-builtins
0.50946
$ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-refactor-no-builtins
0.38738
(24% time savings on the #242 branch. That's quite a lot; might account for at least some of the slowdown we're seeing on that branch.)
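For reference, the two percentages follow directly from the averages quoted above: the saving is the difference between the "with builtins" and "no builtins" averages, divided by the "with builtins" average. A quick check in Python (the function name `saving` is just for this sketch):

```python
def saving(with_builtins: float, without_builtins: float) -> float:
    """Fraction of run time attributable to loading the builtins."""
    return (with_builtins - without_builtins) / with_builtins


master = saving(0.39727, 0.36627)     # master branch averages
refactor = saving(0.50946, 0.38738)   # #242 branch averages

print(f"master:   {master:.1%}")    # master:   7.8%
print(f"refactor: {refactor:.1%}")  # refactor: 24.0%
```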
It strikes me that one perhaps simple way to reduce the startup cost for the builtins is simply to run all the builtin-building code at BEGIN time. Then it's actually not so critical anymore to try to inline and optimize that code, as it'll be a one-time compilation cost when building 007.
But I suspect that'll also run us straight into the rakudobug where subs don't survive well in constants. (RT #127089)
The builtins currently get installed in quite a clumsy way:
...by iterating through a list of pairs and running declare-var on each of them.
Experience with debugging that bit of 007 (which often happens when frames and scopes break in some unpleasant way) tells me that it's not at all useful that the builtins get manually declared in this way. To be honest, I'd prefer it if they were just miraculously available somehow.
(Also, declare-var contains some frame-displacing logic that we don't use at all.)
What we probably want is to simply replace the "setting pad" with the right stuff, all in one simple assignment. What's the "setting pad"? In _007::Runtime's BUILD, we create a new artificial block, and enter it:
Shortly after, we load the builtins in that painstaking way.
But .enter has created a frame for us, and that frame has a pad:
What we should be doing is simply put a Val::Object there which, as nearly as possible, already comes pre-filled with all the setting stuff.
Which tells me Builtins.pm's actual job should be to create this Val::Object and return it to the runtime. Not, as it currently does, return a list of pairs of things for the runtime to put together bit by bit, like IKEA furniture.
It would perhaps be fun to build such a version of Builtins.pm... once. But the fact is that we already have a component that can build pads for us: the compiler. All we need to do is create a 007 version of the builtins.
Some of the built-in subs (and methods) can probably even be expressed as 007 code!
say and prompt can't, but min and max certainly can (update: these are now gone), and probably a fair number of the operators. For those that can't, I propose using some special kind of block, like ⦃...⦄, to embed Perl 6 code nicely into builtins.007. (And then try really hard to make sure that this mechanism can only be used on that file.) Then we parse through everything, and finally extract the Val::Object that was generated, and serialize it to Builtins.pm.
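To make the setting-pad proposal above concrete, here is a rough Python model contrasting the two approaches: declaring built-ins pair by pair versus assigning one pre-filled pad. The `Frame` and `Runtime` classes and all their method names are simplified stand-ins invented for this sketch; they are not the actual API of _007::Runtime, and a plain dict stands in for Val::Object.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple


@dataclass
class Frame:
    # The pad maps names to values for one scope.
    pad: Dict[str, object] = field(default_factory=dict)


class Runtime:
    def __init__(self) -> None:
        # Model of "create an artificial block and enter it":
        # entering leaves one outermost frame with an empty pad.
        self.frames = [Frame()]

    def declare_var(self, name: str, value: object) -> None:
        self.frames[-1].pad[name] = value

    def load_builtins_pairwise(self, pairs: Iterable[Tuple[str, object]]) -> None:
        """Current approach: one declare_var call per built-in."""
        for name, value in pairs:
            self.declare_var(name, value)

    def install_setting_pad(self, pad: Dict[str, object]) -> None:
        """Proposed approach: a single assignment of a pre-built pad."""
        self.frames[-1].pad = pad
```

Both routes end with the same pad contents; the difference is that the second one leaves nothing for the runtime to assemble at startup.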