masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0

Implement a JavaScript backend #166

Open masak opened 8 years ago

masak commented 8 years ago

I've run into some situations (for example with #116) where the speed of Runtime.pm becomes a nuisance. (Programs running hours instead of seconds, for example.)

At first I thought "C backend", but that runs into a fairly interesting impedance mismatch when it comes to closures. As far as I can see, it'd take non-trivial work to make 007 code translate to C.

But JavaScript is fairly fast and JITted nowadays! It's also much much closer to 007's semantics. I can even see how emitting JavaScript code might be fun.

Making sure Int behaves well will be one small challenge. I think a reasonable thing to do is to decide that 007's Val::Int are signed 32-bit. And then just make sure that we cover all the cases of overflow in the JavaScript routines.
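
A minimal sketch of what such routines might look like, assuming the decision is that overflow wraps around (throwing on overflow would be an equally valid choice). The names `intAdd`, `intSub` and `intMul` are hypothetical, not part of 007:

```javascript
// Assumption: Val::Int is a signed 32-bit integer that wraps on overflow.
const INT_MIN = -(2 ** 31);
const INT_MAX = 2 ** 31 - 1;

// `| 0` truncates a JavaScript number to a signed 32-bit integer.
function intAdd(a, b) { return (a + b) | 0; }
function intSub(a, b) { return (a - b) | 0; }

// A plain `a * b` can exceed 2^53 and lose low bits before truncation,
// so multiplication goes through Math.imul instead.
function intMul(a, b) { return Math.imul(a, b); }

console.log(intAdd(INT_MAX, 1));    // -2147483648
console.log(intMul(65536, 65536));  // 0
```

Division and modulo would need the same treatment, plus a decision about rounding direction and division by zero.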

Oh! The JavaScript backend would need its own setting, of course.

Strings, arrays, objects: all these would behave similarly enough. There might be some differences in Unicode handling if one looks hard enough, I guess, since JavaScript strings are sequences of UTF-16 code units.

Perhaps the biggest piece of actual thinking/design might be what happens in the new process boundary that forms (between Perl 6 compilation and JavaScript runtime). Consider constants for example:

constant FOO = weird_and_complicated_computation_that_results_in_a_sub();

This will have to code-gen somehow into assigning a function into FOO in the appropriate scope. Might have to code-walk the Val::Sub that results... which we can do in the compiler, but it'd be a new thing to have to do.

If we do this, in #67 we can have "Run" buttons under the code snippets, and correctly brag that 007 is running in people's browsers! :joy:

masak commented 8 years ago

> Oh! The JavaScript backend would need its own setting, of course.

Oh oh! And because in 007 there's no indirect access, we can implement tree shaking without any adverse side effects.
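
Roughly, tree shaking here could be a plain reachability walk over which names each definition mentions. This is only a sketch: the `references` table and the `reachableFrom` helper are made up for illustration, and a real emitter would build the table from the Qtree.

```javascript
// Hypothetical input: for each top-level name, the names its body mentions.
const references = new Map([
    ["main", ["say", "double"]],
    ["double", ["infix:<*>"]],
    ["unused", ["say"]],
    ["say", []],
    ["infix:<*>", []],
]);

// Collect everything reachable from the entry point. Since names can only
// be reached by being mentioned, anything not in the result is safe to drop.
function reachableFrom(entry, references) {
    const seen = new Set();
    const stack = [entry];
    while (stack.length > 0) {
        const name = stack.pop();
        if (seen.has(name)) continue;
        seen.add(name);
        for (const dep of references.get(name) ?? []) stack.push(dep);
    }
    return seen;
}

const kept = reachableFrom("main", references);
console.log([...kept].sort());  // [ 'double', 'infix:<*>', 'main', 'say' ]
```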

vendethiel commented 8 years ago

Shifting gear onto an actual platform, huh? Nice :-)

masak commented 8 years ago

Hah, I wonder which year was the first one JavaScript could be referred to as "an actual platform". :smile:

In any case, please don't think of this as 007 shifting gears, necessarily. The overriding goal is still the same: be an excellent macro/slang workbench for Perl 6. It's just that some experiments and demonstrations aren't so easily done on a slow/interpreted backend. JavaScript is compatible enough to be a very reasonable compilation target, and V8 is fast enough to do some new things.

Specifically, I'd like for this script to run fast. How long does it take on your box? :stuck_out_tongue:

vendethiel commented 8 years ago

My computer is internetless for the moment, I'll have to report in a week.

masak commented 8 years ago

Identifiers will need munging a little bit, because 007 allows some identifiers that JavaScript doesn't. (For example, infix:<+>.)

My first thought for solving this was to escape the string into something family-friendly, like infix_x3A_x3C_x2B_x3E. (And then resolving the very rare collisions that might arise because someone chose that name in 007, etc.)

But I think it's conceptually a lot easier to just go with names like _gen_1, _gen_2, etc, for things that JavaScript can't represent. Better yet, keeping track of the lowest _gen_N number that isn't mentioned anywhere guarantees that the scheme is conflict-free.

It makes it a little bit harder to reconstruct that _gen_1 actually meant infix:<+> in the original source. But if that ends up being a concern at all, we could easily emit JS that looked like this:

let _gen_1;    // infix:<+>
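
The naming scheme above could be sketched like this. It is an illustration under assumptions: `makeMangler` and `mangle` are hypothetical names, and the identifier check is deliberately simplified (ASCII only, reserved words ignored).

```javascript
// Matches names that are already valid (ASCII-only) JavaScript identifiers.
const JS_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// namesInSource: every identifier mentioned in the 007 program, so that a
// generated name never collides with one the programmer chose.
function makeMangler(namesInSource) {
    const taken = new Set(namesInSource);
    const mapping = new Map();
    let counter = 0;

    return function mangle(name) {
        if (JS_IDENTIFIER.test(name)) return name;
        if (!mapping.has(name)) {
            let fresh;
            do {
                counter += 1;
                fresh = `_gen_${counter}`;
            } while (taken.has(fresh));
            taken.add(fresh);
            mapping.set(name, fresh);
        }
        return mapping.get(name);
    };
}

// Here the program itself already uses _gen_1, so the mangler skips it.
const mangle = makeMangler(["say", "_gen_1", "infix:<+>"]);
console.log(`let ${mangle("infix:<+>")};    // infix:<+>`);
// prints: let _gen_2;    // infix:<+>
```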

masak commented 8 years ago

A deeper issue occurs when we're actually messing with identifier scoping, as we're doing with macros and quasis.

Consider the following example code:

my c;
BEGIN {
    c = 0;
}

macro moo() {
    my x = (c = c + 1);
    return quasi {
        say(x);
    }
}

{
    moo();
    moo();
}

This works today in 007 and prints 1\n2\n.

What JavaScript code should be emitted for the morally equivalent behavior? The immediate problem is that the scope where x is declared and the scope where it is used are completely separate.

Luckily, global variables reach into all scopes, and none of the usual adverse effects of globals apply, since the emitter keeps track of everything. I think the JS code would be something like this:

let _gen_1 = 1;     // 'x' in moo() call #1
let _gen_2 = 2;     // 'x' in moo() call #2

let say = ...;      // built-in

(() => {            // IIFE
    say(_gen_1);
    say(_gen_2);
})();

Much as functions can close over variables in surrounding scopes, quasi blocks can do the same, but making it work for quasi blocks requires code-generating global variables.
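
To make the mechanism concrete, here is a toy emitter for the moo() example. Everything in it (`expandMoo`, `globals`, `body`) is hypothetical, and it assumes the captured value can be written out as a literal, which only holds for simple values like Ints:

```javascript
// Compile-time state of the toy emitter.
const globals = [];     // declarations to emit at the top level
const body = [];        // statements to emit inside the block
let genCounter = 0;
let c = 0;              // the 007 variable 'c', as initialized in BEGIN

// Stands in for running the macro body once per call site.
function expandMoo(callNumber) {
    const x = (c = c + 1);                  // evaluated at compile time
    const name = `_gen_${++genCounter}`;    // fresh global standing in for 'x'
    globals.push(`let ${name} = ${x};     // 'x' in moo() call #${callNumber}`);
    body.push(`    say(${name});`);         // the quasi body, with 'x' renamed
}

expandMoo(1);
expandMoo(2);

const emitted = [
    ...globals,
    "",
    "(() => {",
    ...body,
    "})();",
].join("\n");
console.log(emitted);
```

The printed program has the same shape as the hand-written one above, minus the `say` built-in, which would come from the setting.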

All this seems a little bit involved, but at least it's a solution, which is better than I feared.

masak commented 8 years ago

Another fun one: since JavaScript doesn't do arity checking, we'll have to code-gen it. I'm thinking we might want to code-gen it on the caller side, before the call. But we'll see what feels best.

masak commented 8 years ago

Oh wait. Can't do it on the caller side. Or rather, in all the cases when we can, we could also just fail at compile time with a nice error. So we need to do it on the callee side.

Possibly there are some cases where we can see that the callee doesn't escape, and it's always called with the right arity, and we can remove the check. But that's not necessary for a first go at this.
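
A callee-side check might look something like the following. This is a sketch only: `checkArity` is a hypothetical helper that would live in the JavaScript setting, and the error message is made up rather than taken from 007.

```javascript
// Hypothetical runtime helper, shared by all emitted functions.
function checkArity(name, expected, actual) {
    if (actual !== expected) {
        throw new Error(
            `${name} expects ${expected} argument(s), but got ${actual}`
        );
    }
}

// What the emitter might produce for a two-parameter 007 sub `add`.
// Rest parameters are used instead of `arguments` so that the same
// pattern also works if the emitter produces arrow functions.
function add(...args) {
    checkArity("add", 2, args.length);
    const [a, b] = args;
    return a + b;
}

console.log(add(1, 2));     // 3
try {
    add(1);
} catch (e) {
    console.log(e.message); // add expects 2 argument(s), but got 1
}
```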