Turn the builtins into a .007 file; compile and cache it

masak commented 8 years ago

The builtins currently get installed in quite a clumsy way:

    method load-builtins {
        my $opscope = $!builtin-opscope;
        for builtins(:$.input, :$.output, :$opscope) -> Pair (:key($name), :$value) {
            my $identifier = Q::Identifier.new(
                :name(Val::Str.new(:value($name))),
                :frame(NONE));
            self.declare-var($identifier, $value);
        }
    }

...by iterating through a list of pairs and running declare-var on each of them.

Experience with debugging that bit of 007 (which often happens when frames and scopes break in some unpleasant way) tells me that it's not at all useful that the builtins get manually declared in this way. To be honest, I'd prefer it if they were just miraculously available somehow.

(Also, declare-var contains some frame-displacing logic that we don't use at all.)

What we probably want is to simply replace the "setting pad" with the right stuff, all in one simple assignment. What's the "setting pad"? In _007::Runtime's BUILD, we create a new artificial block, and enter it:

    my $setting = Val::Block.new(
        :parameterlist(Q::ParameterList.new),
        :statementlist(Q::StatementList.new),
        :static-lexpad(Val::Object.new),
        :outer-frame(NO_OUTER));
    self.enter($setting);

Shortly after, we load the builtins in that painstaking way.

But .enter has created a frame for us, and that frame has a pad:

    my $frame = Val::Object.new(:properties(:$block, :pad(Val::Object.new)));
    @!frames.push($frame);

What we should be doing is simply putting a Val::Object there which, as nearly as possible, already comes pre-filled with all the setting stuff.

Which tells me Builtins.pm's actual job should be to create this Val::Object and return it to the runtime. Not, as it currently does, return a list of pairs of things for the runtime to put together bit by bit, like IKEA furniture.
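
As a rough illustration of that shape, here is a minimal Perl 6 sketch. PadObject is a stand-in for Val::Object, builtins-pad is a hypothetical name, and the two entries are illustrative rather than the actual 007 builtins. The only point is that the runtime receives one finished object instead of a list of pairs to assemble.

```perl6
# Stand-in for 007's Val::Object (assumption: the real class differs).
class PadObject {
    has %.properties;
}

# Hypothetical replacement for the pair-by-pair loop: hand back the whole
# setting pad as a single, already-populated object.
sub builtins-pad(--> PadObject) {
    PadObject.new(:properties(%(
        'abs'  => sub ($n) { $n.abs },
        'join' => sub (@parts, $sep) { @parts.join($sep) },
    )));
}

# On the runtime side, creating the frame becomes one construction instead of
# a loop of declare-var calls. The string is a placeholder for the Val::Block.
my $setting-block = 'setting';
my $frame = PadObject.new(:properties(%(
    'block' => $setting-block,
    'pad'   => builtins-pad(),
)));

# List the names that are available in the setting pad.
say $frame.properties<pad>.properties.keys.sort;
```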

It would perhaps be fun to build such a version of Builtins.pm... once. But the fact is that we already have a component that can build pads for us: the compiler. All we need to do is create a 007 version of the builtins.

Some of the built-in subs (and methods) can probably even be expressed as 007 code! say and prompt can't, but min and max certainly can (update: these are now gone), and probably a fair number of the operators. For those that can't, I propose using some special kind of block, like ⦃...⦄, to embed Perl 6 code nicely into builtins.007. (And then try really hard to make sure that this mechanism can only be used on that file.) Then we parse through everything, and finally extract the Val::Object that was generated, and serialize it to Builtins.pm.

This should happen at bin/007 startup, I think. Or some similarly early place. Builtins.pm should be considered a projection, and if it's blown away or made obsolete by a newer builtins.007, things should still work in the sense that a new better Builtins.pm will just be written behind the scenes without any fuss. Of course, the lower the extra startup cost for just checking this, the better.

masak commented 8 years ago

> This should happen at bin/007 startup, I think. Or some similarly early place. Builtins.pm should be considered a projection, and if it's blown away or made obsolete by a newer builtins.007, things should still work in the sense that a new better Builtins.pm will just be written behind the scenes without any fuss. Of course, the lower the extra startup cost for just checking this, the better.

Or how about this: add a test that compiles the built-ins and compares the result against Builtins.pm. Source-control Builtins.pm. That way, there's zero startup cost, and any discrepancy between builtins.007 and Builtins.pm will be caught by prove most of the time and by Travis at the latest.

Yes, I like that a lot better than compiling at startup. I keep forgetting that we have a good solution for mitigating the drawbacks of code duplication.
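
The sync test could look roughly like the sketch below. generate-builtins-pm is a hypothetical stand-in for the real "compile builtins.007 and serialize the pad" step, and the files are throwaway copies in a temp directory rather than the real repo paths. Only the comparison logic is the point.

```perl6
use Test;

# Hypothetical generator: stands in for "parse builtins.007, extract the pad,
# and serialize it as Perl 6 source". Here each non-empty input line simply
# becomes one output line, so that the sketch can run on its own.
sub generate-builtins-pm(Str $source --> Str) {
    $source.lines.grep(*.chars).map({ "pad-entry: $_\n" }).join
}

# True when the checked-in file matches what the source would generate now.
sub builtins-in-sync(IO::Path $source, IO::Path $generated --> Bool) {
    generate-builtins-pm($source.slurp) eq $generated.slurp
}

# Throwaway files for the demo; a real test would point at the repo files.
my $dir = $*TMPDIR.add("builtins-sync-demo-{$*PID}");
$dir.mkdir;
my $source    = $dir.add('builtins.007');
my $generated = $dir.add('Builtins.pm');
$source.spurt("say\nprompt\n");
$generated.spurt(generate-builtins-pm($source.slurp));

ok builtins-in-sync($source, $generated), 'Builtins.pm matches builtins.007';

# Editing the source without regenerating is what the test should catch.
$source.spurt("say\nprompt\ntype\n");
nok builtins-in-sync($source, $generated), 'stale Builtins.pm is detected';

# Clean up the demo files.
.unlink for $source, $generated;
$dir.rmdir;
done-testing;
```

Run under prove, a failing comparison would show up as an ordinary test failure, which is what keeps the startup path free of any checking.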

masak commented 8 years ago

I'm also thinking we might leave the normal grammar alone, and subclass to shove in the extra rules. Feels like that'd also make abuse of the ⦃...⦄ parsing (or accidental contact with it in any way) a lot less likely.
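
A minimal sketch of the subclassing idea, using toy grammars rather than the real 007 ones. All names here (Toy::Syntax, Toy::SettingSyntax, the rule names) are made up for illustration. It assumes the base grammar exposes a proto token that the subclass can add a candidate to.

```perl6
# Toy stand-in for the regular 007 grammar; it has no rule for ⦃...⦄.
grammar Toy::Syntax {
    rule  TOP  { ^ <decl>* $ }
    rule  decl { 'sub' <name> '()' <body> }
    token name { <[a..z]>+ }

    # A proto token lets subclasses contribute further alternatives.
    proto token body {*}
    token body:sym<plain> { '{' <-[ \{ \} ]>* '}' }
}

# Only the setting compiler would use this subclass; ordinary programs keep
# going through Toy::Syntax and never see the extra rule.
grammar Toy::SettingSyntax is Toy::Syntax {
    token body:sym<embedded> { '⦃' <-[ ⦃ ⦄ ]>* '⦄' }
}

my $user-code    = 'sub greet() { say("hi") }';
my $setting-code = 'sub emit() ⦃ $.output.print($arg) ⦄';

say so Toy::Syntax.parse($user-code);
say so Toy::Syntax.parse($setting-code);
say so Toy::SettingSyntax.parse($setting-code);
```

With this split, the base grammar is expected to reject the ⦃...⦄ input outright, while the subclass accepts both forms.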

masak commented 8 years ago

add a test that compiles the built-ins and compares the result against Builtins.pm.

This method also (I think; this kind of hurts to consider in depth) gets us out of a bootstrapping problem that I hadn't anticipated at all. Namely, the more realistic builtins.007 gets, the more the Val and Q types will be declared purely in that file. But these types absolutely need to be in place when the parser gets going... so how would we parse builtins.007?

This is a non-issue if we load Builtins.pm from the parser, which can then parse builtins.007, which can then (among other things) generate a fresh Builtins.pm when needed. At most, we'll have to be a bit careful when introducing features that bootstrap; might have to carefully split up changes into several commits or such.

masak commented 8 years ago

Having started down the path of this change in local code, I can now state that initializing the built-in opscope is

- currently all tangled up with initializing the built-in lexical scope, but
- going to have to be a separate thing, probably a separate exported sub (builtins-opscope) also returning a fully-formed data structure.

The good news is that this looks very doable. Even better, it feels like a best-of-both-worlds kind of thing: we get to mix together these two concerns in the source file (builtins.007 or equivalent), but we get to have them separately in the target file (Builtins.pm). And tests will make sure they never diverge.
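
To make "mixed in the source, separate in the target" concrete, here is a small sketch. The module name, the table layout, and the shape of the opscope data are all assumptions for illustration; the real opscope structure in 007 is richer than a name-to-precedence map. builtins-pad is a hypothetical sibling of the builtins-opscope sub mentioned above.

```perl6
module Toy::Builtins {
    # One table mixing both concerns, the way builtins.007 would.
    my @source =
        %( name => 'infix:<+>', impl => sub ($l, $r) { $l + $r }, precedence => 'additive' ),
        %( name => 'abs',       impl => sub ($n) { $n.abs } );

    # Lexical-scope side: every builtin, keyed by name.
    sub builtins-pad() is export {
        %( @source.map(-> %e { %e<name> => %e<impl> }) )
    }

    # Operator side: only the entries that carry precedence information.
    sub builtins-opscope() is export {
        my @ops = @source.grep(-> %e { %e<precedence>:exists });
        %( @ops.map(-> %e { %e<name> => %e<precedence> }) )
    }
}
import Toy::Builtins;

say builtins-pad().keys.sort;
say builtins-opscope();
```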

masak commented 7 years ago

We can do this in stages:

[x] magically assign all the contents of the setting pad instead of going through self.declare-var
[ ] initialize the pad as much as possible in one go (possibly creating repetitive code in the process)
    - this can be done gradually, builtin by builtin
[ ] create something that can compile the builtins from 007 to the Perl 6 pad object

masak commented 7 years ago

Here are some loose timings for how long it takes right now to populate the built-ins:

    $ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-master-with-builtins
    0.39727
    $ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-master-no-builtins
    0.36627

(8% time saving on the master branch, just from not loading the builtins at all.)

    $ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-refactor-with-builtins
    0.50946
    $ perl6 -e'my $sum = [+] lines; say $sum / 100' timings-refactor-no-builtins
    0.38738

(24% time savings on the #242 branch. That's quite a lot; might account for at least some of the slowdown we're seeing on that branch.)
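
For reference, both percentages come out if the saving is taken relative to the with-builtins time, which is an assumption about how they were computed (the helper name is made up):

```perl6
# Saving from skipping the builtins, as a percentage of the with-builtins time.
sub saving-percent($with, $without) {
    100 * ($with - $without) / $with
}

printf "master:   %.1f%%\n", saving-percent(0.39727, 0.36627);
printf "refactor: %.1f%%\n", saving-percent(0.50946, 0.38738);
```

This gives about 7.8% for master and about 24.0% for the refactor branch, matching the rounded figures above.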

It strikes me that one possibly simple way to reduce the startup cost for the builtins is to run all the builtin-building code at BEGIN time. Then it's no longer so critical to inline and optimize that code, as it'll be a one-time compilation cost when building 007.

But I suspect that'll also run us straight into the rakudobug where subs don't survive well in constants. (RT #127089)
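
A minimal sketch of the compile-time idea, with a hypothetical build-pad standing in for the real builtin-building code. It deliberately stores plain values only, since storing subs is exactly where the bug mentioned above would be expected to bite.

```perl6
# Hypothetical stand-in for the expensive builtin-building code.
sub build-pad() {
    %( (1..5).map(-> $n { "builtin-$n" => $n * $n }) )
}

# The right-hand side of a constant is evaluated at compile time, so in a
# precompiled module the work would be done once, when the module is built.
my constant %PAD = build-pad();

say %PAD<builtin-3>;
```

In a plain script this still runs on every invocation; the saving only materializes once the code lives in a module that gets precompiled.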

masak / alma

Turn the builtins into a .007 file; compile and cache it #185