Integrate sweet.js with ES6 modules

disnet commented 10 years ago

There are two papers on macros and modules that are probably relevant. Composable and Compilable Macros and Implicit Phasing for R6RS Libraries.

The basic idea of both papers is that you have to track the phase level any bit of syntax currently is in. The phase isn't as simple as compile-time vs. run-time because a case macro (compile-time) might need to import and use another macro (earlier compile-time) which might also need to use a macro (even earlier compile-time) and so on.

// bar.js
export macro bar {
    case {_ ()} => {
        // ...
    }
}

// foo.js
import { bar } from "bar"
export macro foo {
    case {_ () } => { 
        bar()
        // ... 
    }
}

// main.js
import { foo } from "foo"
foo()

The papers outline two approaches we could take. The first, following the "Composable and Compilable" paper and Racket, involves explicitly declaring what phase level an import is for.

import { $ } from "jquery"
import forSyntax { _ } from "underscore"

let m = macro {
    case { } => {
        $("#foo") // error: $ not bound
        _.map([1,2,3], function (x) { ... });
        // ...
    }
}
$("#foo") // works fine
_.map(...) // error: _ is not bound

This gets ugly though when you have a macro definition in your macro definition since you need to import up two phase levels:

import { $ } from "jquery"
import forSyntax { _ } from "underscore"
import forSyntax forSyntax { ld } from "lodash"
// or maybe following racket:
import meta 2 { ld } from "lodash"

let m = macro {
    case { } => {
        let n = macro {
            case {} => {
                ld.map(...)
                // ...
            }
        }
        $("#foo") // error: $ not bound
        _.map([1,2,3], function (x) { ... });
        // ...
    }
}
$("#foo") // works fine
_.map(...) // error: _ is not bound

The "Implicit Phasing" paper gives a technique for tracking all the phasing implicitly so you wouldn't need to do import forSyntax. The paper seems to suggest that there are no real trade-offs in doing this but I seem to remember @dherman mentioning there was a good reason Racket stayed with explicit phase labeling (something about the REPL maybe?).

I would prefer to keep the phasing implicit if possible since forcing macro authors to think about the details of module phasing seems not great.

natefaubion commented 10 years ago

One thing I've wondered is how we play well with the wealth of CommonJS modules, which aren't going to follow the ES6 import syntax, and don't need to be read/expanded/parsed/evaled, but are only needed for compile time utilities. I suppose we could just require within macros, but I just don't really like that, and would really like a way to seamlessly work with that ecosystem.

Edit: Since #227 has caching built in, this isn't a big deal I suppose, since it's just aesthetics.

natefaubion commented 10 years ago

I've been poking around, trying to get a feel for how we should move forward with modules. Unfortunately, there's no easy, clear, or even straightforward way. I'd like to expand on @disnet's original post and offer some more details with pros and cons.

There are really two axes for module systems in the current landscape. One is explicit vs. implicit phase declaration, and the other is single vs. multiple module instantiation. In systems with single module instantiation, modules are only ever instantiated once and reused throughout each phase. With multiple module instantiation, fresh instances of modules are created for each phase. In practice, module systems generally fall on opposite ends. Racket has explicit phasing with multiple instances, while Ikarus has implicit phasing with single instances.

Implicit Phasing

Implicit phasing generally has better ergonomics since you don't have to think about phases, but it has a few disadvantages. Primarily, the algorithm relies on lazily and progressively instantiating modules. This means that an import declared, but never used, is never instantiated. This is just not how any JS module system works (especially ES6 modules), so I think it would be very unexpected. The algorithm also relies on the absence of expansion-time side effects, which may be problematic for a lot of use cases.
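To make the lazy-instantiation concern concrete, here is a minimal sketch in plain JavaScript. It is not the algorithm from the paper; the factories registry, importEager, and importLazy are hypothetical names that only contrast "instantiate every declared import" with "instantiate on first use".

```javascript
// Hypothetical registry: module name -> factory that runs the module body.
const log = [];
const factories = {
  polyfill: () => { log.push("polyfill body ran"); return {}; },
  helpers: () => { log.push("helpers body ran"); return { id: (x) => x }; },
};

// Eager (ES6-like): every declared import is instantiated up front.
function importEager(names) {
  const instances = {};
  for (const name of names) instances[name] = factories[name]();
  return instances;
}

// Lazy (implicit-phasing-like): a module is instantiated on first use only.
function importLazy(names) {
  const cache = {};
  const instances = {};
  for (const name of names) {
    Object.defineProperty(instances, name, {
      get() {
        if (!(name in cache)) cache[name] = factories[name]();
        return cache[name];
      },
    });
  }
  return instances;
}

const eager = importEager(["polyfill", "helpers"]);
eager.helpers.id(1);
const eagerLog = log.splice(0); // take the entries and reset the log

const lazy = importLazy(["polyfill", "helpers"]);
lazy.helpers.id(1);
const lazyLog = log.splice(0);

console.log(eagerLog); // both module bodies ran
console.log(lazyLog);  // only the helpers body ran
```

With the lazy variant the polyfill body never runs, which is exactly the behaviour that would surprise someone used to imports being executed for their side effects.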

Explicit Phasing

Explicit phasing, frankly, is just hard for anything remotely complicated. It's not necessarily a bad thing since macros are hard in general, but it is hard. The big advantage is that you know exactly when and in what phase you need everything, which is good for us as module implementors. It also makes module instantiation a bit simpler since it doesn't exist in two progressive states like implicit phasing.

Single Module Instantiation

The main advantage here is speed and resources. The problem is that you can get into portability issues if we ever seriously move into letting you load and expand modules in the browser (i.e. not just a precompiler). By portability being an issue, I mean you can create two (potentially) very different expansions by precompiling vs. compiling and running immediately.

For example, take something like _.uniqueId() from underscore. This is a very common, stateful function that can be used throughout expansion and runtime. It's implemented as a simple numeric counter (so not truly unique, but that's neither here nor there). If we were to precompile, the counter would be reset once you run the application. But if you compiled and ran it immediately (reusing the same module instance) the counter would be a continuation of its state after expanding. This may not seem like a big deal, but you can (potentially) get obscure, hard-to-find bugs where your app will work in dev (by running immediately), and not work when precompiling for production! These are called cross-phase side effects.
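A rough sketch of that scenario, using a stand-in counter rather than underscore itself. The names instantiateIdModule, expand, and run are hypothetical, and calling the factory plays the role of instantiating the module.

```javascript
// Stand-in for a stateful library such as underscore's uniqueId; calling the
// factory plays the role of "instantiating the module".
function instantiateIdModule() {
  let counter = 0;
  return { uniqueId: (prefix) => prefix + (++counter) };
}

// Hypothetical expansion step: a macro calls uniqueId twice at compile time.
function expand(lib) {
  lib.uniqueId("tmp");
  lib.uniqueId("tmp");
}

// Hypothetical runtime: the application asks for its first id.
function run(lib) {
  return lib.uniqueId("el");
}

// Compile and run immediately, sharing one instance across phases.
const shared = instantiateIdModule();
expand(shared);
const devResult = run(shared);

// Precompile, then run later with a fresh instance (as a new process would).
expand(instantiateIdModule());
const prodResult = run(instantiateIdModule());

console.log(devResult);  // "el3"
console.log(prodResult); // "el1"
```

Any code that depends on the specific id (a selector, a cache key) would behave differently between the two setups.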

Multiple Module Instantiation

This solves the portability issue above. For each phase, an entirely new heap with fresh globals and module instances is created. It's like separating each compile(-compile(-compile))-time into its own little universe. You don't have to worry about cross-phase effects since phases are isolated from each other (except through macro expansion). It's (potentially) expensive, however, because you also need to expand each module for each phase (expansion time side effects!). Needless to say, our expander is not very fast so this concerns me a bit.

Another disadvantage of this is backwards compatibility with npm/CJS. For the time being, likely all macro utils are going to come from npm, and there's not a clear-cut way to create separate heaps while still hooking into builtin require type stuff. The only way I know of is to use vm.runInContext, but that means we will have to reimplement a lot of internal nodey stuff. Also, I'm not sure how we can deal with .node compiled modules since we have no way of running those in a new vm context.
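As a rough illustration of the vm route, the sketch below evaluates the same module source in a fresh context per phase using Node's vm.createContext and vm.runInContext. The moduleSource string and instantiateForPhase helper are hypothetical; a real loader would also have to supply require, module, and the other Node globals, which is the reimplementation cost mentioned above.

```javascript
const vm = require("vm");

// Hypothetical module source, written as an expression that evaluates to the
// module's exports. It keeps state in a closure.
const moduleSource = `
  (function () {
    var counter = 0;
    return { next: function () { return ++counter; } };
  })()
`;

// Instantiate the module in a brand new context, so each phase gets its own
// globals and its own copy of the module state.
function instantiateForPhase(source) {
  const context = vm.createContext({});
  return vm.runInContext(source, context);
}

const phase1 = instantiateForPhase(moduleSource); // compile-time instance
const phase0 = instantiateForPhase(moduleSource); // run-time instance

phase1.next();
phase1.next();
const phase1Value = phase1.next();
const phase0Value = phase0.next();

// A fresh context has no Node-style require unless the loader provides one.
const requireType = vm.runInContext("typeof require", vm.createContext({}));

console.log(phase1Value); // 3
console.log(phase0Value); // 1, unaffected by the phase 1 instance
console.log(requireType); // "undefined"
```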

So what do we do? Explicit phasing and single module instantiation are both the simplest implementation routes, but that leaves us with the worst of both worlds! You get (potentially) difficult declarations with cross-phase effects and portability problems. I also don't really see how implicit phasing (based off of this algorithm at least) is viable since it just does not gel with JS-land. That leaves us with explicit phasing and multiple module instances, which has its own set of difficulties regarding backwards compatibility.

FWIW, I think we can make the syntax for explicit phasing much better than highlighted above (not duplicating forSyntax). Borrowing from Racket, you could do something like:

import { foo, bar } from 'foo'; // phase 0
import { foo, bar } from 'foo' for macros; // phase 1
import { foo, bar } from 'foo' for templates; // phase -1
import { foo, bar } from 'foo' for meta 5; // arbitrary

But that's just surface-level stuff. Comments, suggestions, condolences are all welcome.
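Whatever the surface syntax ends up being, each import form reduces to a relative phase shift, and shifts add up along a chain of imports. The sketch below is only an illustration of that bookkeeping under assumed names (the imports graph and phasesNeeded are hypothetical, and the graph is assumed to be acyclic); it is not how Racket or sweet.js implements it.

```javascript
// Hypothetical import graph: each edge carries the phase shift of the import
// (0 = plain import, 1 = "for macros", -1 = "for templates").
const imports = {
  main: [{ from: "foo", shift: 1 }],
  foo: [{ from: "bar", shift: 1 }, { from: "util", shift: 0 }],
  bar: [],
  util: [],
};

// Collect every phase at which each module is needed, starting from an entry
// module at phase 0. Shifts add up along a chain of imports.
function phasesNeeded(graph, entry) {
  const needed = new Map(); // module name -> Set of phases
  const queue = [{ name: entry, phase: 0 }];
  while (queue.length > 0) {
    const { name, phase } = queue.shift();
    if (!needed.has(name)) needed.set(name, new Set());
    if (needed.get(name).has(phase)) continue; // already seen at this phase
    needed.get(name).add(phase);
    for (const edge of graph[name]) {
      queue.push({ name: edge.from, phase: phase + edge.shift });
    }
  }
  return needed;
}

const result = phasesNeeded(imports, "main");
for (const [name, phases] of result) {
  console.log(name, [...phases]);
}
```

Here bar ends up needed at phase 2 even though no single import says "meta 2", because main imports foo for macros and foo in turn imports bar for macros.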

jlongster commented 10 years ago

I need to read more into this, but I'm totally fine with having some kind of import for macros form. I'm going to read the papers linked in the original comment this week and I'd love to help out with this discussion (though many like @dherman are more qualified than me).

disnet commented 10 years ago

That definitely clarifies things. Agreed that explicit phasing looks like the way to go. Wish we could have avoided it but as your example shows it's not the worst thing in the world. I have no insight about multiple instantiation and backwards compatibility right now but we definitely need to figure that out.

jayphelps commented 10 years ago

What's the latest thinking on this? AFAIK, sweet inside ES6 module syntax breaks things since the export keyword is shared by the two, i.e. if you transpile first, you lose sweet exports; do it after and you lose ES6 module exports. Any workarounds?

natefaubion commented 10 years ago

What's the latest thinking on this? AFAIK, sweet inside ES6 module syntax breaks things since the export keyword is shared by the two, i.e. if you transpile first, you lose sweet exports; do it after and you lose ES6 module exports. Any workarounds?

One of the big motivating factors of declarative modules in ES6 is that it opens the possibility of things like macros. We know when expanding whether an export is a macro transformer or a runtime expression. There's no reason why they can't coexist. The macro imports/exports just disappear at runtime. Transpiling from ES6 to CommonJS/AMD is another issue entirely, however.

am11 commented 10 years ago

On a related topic, I have a few notes regarding V3 source maps, which should probably be considered while planning this feature:

If your final output is to combine results of all the included modules, the sources: [] should include each file in the import-chain. Also the encoded mappings[] chunks should point to the correct index of the source in the sources array.
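A minimal sketch of the index bookkeeping this implies, assuming each module already has its own map. The moduleMaps data and mergeSources helper are hypothetical. Re-encoding the Base64 VLQ mappings is left out; in V3 the source index is the second field of each segment and is stored relative to the previous segment's value, so a real merge has to decode, remap, and re-encode.

```javascript
// Hypothetical per-module maps; only the `sources` field matters here.
const moduleMaps = [
  { file: "main.js", sources: ["main.js"] },
  { file: "foo.js", sources: ["foo.js", "bar.js"] },
  { file: "bar.js", sources: ["bar.js"] },
];

// Build the combined `sources` array and, per module, a table translating
// each local source index into its index in the combined array.
function mergeSources(maps) {
  const sources = [];
  const remap = maps.map((map) =>
    map.sources.map((source) => {
      let index = sources.indexOf(source);
      if (index === -1) {
        index = sources.length;
        sources.push(source);
      }
      return index;
    })
  );
  return { sources, remap };
}

const merged = mergeSources(moduleMaps);
console.log(merged.sources); // ["main.js", "foo.js", "bar.js"]
console.log(merged.remap);   // [[0], [1, 2], [2]]
```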

Now for instance, in CoffeeScript, the parser itself is not dependency-aware; the perks of module-pattern importing I suppose? Therefore, the compiler runner has to take care of all the dependencies and path resolutions for output and source maps, and make sure the final source map's mappings are consistent (in case of --join, which is apparently subject to change in the future).

In LESS/SASS, their compiler cores have this dependency context, so they produce quality output quickly and reliable source-maps (especially LESS; which produces most feature-complete source-maps).

grncdr commented 10 years ago

Implicit phasing ... relies on lazily and progressively instantiating modules. This means that an import declared, but never used, is never instantiated. This is just not how any JS module system works (especially ES6 modules), so I think it would be very unexpected. The algorithm also relies on the absence of expansion-time side effects, which may be problematic for a lot of use cases.

@natefaubion I may be missing something, but the "unexpectedness" of lazy module instantiation seems overstated. Wouldn't the increased difficulty be more on the macro author's side than the macro consumer's? E.g. authors need to be sure that a macro doesn't depend on side effects caused by instantiating a separate module, and while that can be difficult, I don't think it's more difficult than authoring macros in general.

On the other hand, I am having a hard time imagining a situation where a macro user could be relying on compile-time side effects of import { someMacro } from 'some-module' without actually using someMacro in the code. As an aside: it should be very possible for sweet.js to warn them that the macro they've imported is unused, since that can be detected during compilation.

I also think we can make a useful distinction between cross-phase side effects during compilation, and cross-phase effects from compile phases to the final runtime phase. Unless I'm misunderstanding (and that's very possible) there will always be a "final phase" where the code gets passed off to either the JS runtime or written to a file on disk, and clearing all module instances at that point should be trivial.

natefaubion commented 10 years ago

ES6 supports import 'somemodule' syntax specifically for side effecting imports. We shouldn't arbitrarily change module semantics for compile time code.

grncdr commented 10 years ago

That's true, a special syntax is probably a good idea. I'm mostly concerned with avoiding explicit phasing if at all possible, as it pushes a lot more complexity onto macro consumers.

Given this property:

The macro imports/exports just disappear at runtime.

The syntax import macros ... clearly calls out that this is an "unusual" import, where defining special rules for macro imports should be less of an issue.

natefaubion commented 10 years ago

That's true, a special syntax is probably a good idea. I'm mostly concerned with avoiding explicit phasing if at all possible, as it pushes a lot more complexity onto macro consumers.

Explicit phasing only affects macro authors. If you are only using macros to generate run-time code (at any runtime phase) then it looks like any normal ES6 import. You only need to annotate a phase if you are 1) using a macro inside the macro code itself 2) referencing an import in a template.

import { fooMacro } from "foo" for macro;
import * as React from "react" for template;

macro component {
  case { ... } => {
    // ...
    // Here I'm using fooMacro in the macro code
    fooMacro { 

    };

    // ...
    // Here I'm exporting a template that references React
    return #{
      var $name = React.createClass({

      });
    }
  }
}

export component;

import { component } from "react-macros";

component Foo {

}

Expands to the runtime code:

import * as React from "react";

var Foo = React.createClass({

});

Phasing isn't that bad. The cognitive overhead is minimal for 98% of the use cases. You only get into higher meta stuff when you're doing crazy things like defining macros inside your macros that invoke other macros.

grncdr commented 10 years ago

Oh right, that's obvious in retrospect... thanks for being patient :)

m1sta commented 10 years ago

Should the discussion around a potential 'macro manager' and a bower-style sweet install {macro name} command be included here or in a separate issue?

andreypopp commented 10 years ago

@m1sta I think macros can be distributed via npm as a part of regular packages

disnet commented 10 years ago

Exactly. The plan is to cleanly integrate with npm. Both to install macros but also to import and use any existing npm library in case macros.

vendethiel commented 10 years ago

Do we also want to have a begin-for-syntax equivalent?

disnet commented 10 years ago

Yep we probably should include begin-for-syntax. Any ideas on good syntax for it?

beginForSyntax {
    var id = function(x) { return x }
}
macro m {
    case {} => {
        var x = id(42);
        return #{42}
    }
}

forMacros {
    var id = function(x) { return x }
}
for macros {
    var id = function(x) { return x }
}
syntax {
    var id = function(x) { return x }
}

Not super stoked about any of these.

vendethiel commented 10 years ago

Well, considering begin-for-syntax might want a specifier on its own "meta level", we could see something like

meta 5 {
  import { ld } from "lodash";

  var i = 5;
}

instead?

elibarzilay commented 8 years ago

Some general random comments:

One of the major surprises of implicit phases and how it leads to "lazy loading": a practical result of that is that you can't really use toplevel non-definition expressions for side-effects. That's because you can't just require some library to execute these expressions. IIRC, the Ikarus approach was to have an init function for such things. Or something.
This has been a very long debate in R6RS. Personally, I find no problem with the explicit approach (but I'm probably biased too), but when I realized the above that seemed like a major problem which made my preference much stronger. If ES6 modules are expected to be executed when required, then it makes choosing implicit phases much more impractical.
Re the multiple instances vs single ones: I can only say that Racket's module system seemed completely crazy to be doing that. But the results -- the ability to write any code at any level -- is something that you shouldn't underestimate. I didn't even realize how significant it was until I had a discussion with a random schemer who claimed that doing "real things" at the macro level, like running a full parser, or invoking an external compiler, or fetching some resource from the web, all of these sounded completely crazy to that person. (And obviously, no -- you cannot just be careful -- since state is there in many unexpected cases, like creating a new type.)
The problem with multiple instances -- cases where you want a single instance -- are all cases where you're talking about low-level stuff like buffered IO, a single DB handle, etc. That is, cases where you're talking to the external world and need to face inherent state phase leakages. In practice there are few of these, and they are better dealt with at the same level you're dealing with foreign code.
So my subjective conclusion from all of this is that Racket leans towards the choices it made because (a) it views all languages as equal and implementable, and therefore it never wants to confuse an if at one level with an if at another (I vaguely remember stuff like this turning extremely confusing with an implicit system); (b) there's generally a decision to allow any tools at the macro level, and therefore multiple instances make much more sense.

disnet commented 8 years ago

Yep, I was long ago convinced the explicit/multiple instances approach is the right way to go for Sweet. Currently hacking on it as a matter of fact :)

Thanks for the additional insight!

elibarzilay commented 8 years ago

@disnet: I'm happy to hear that it was preaching to the choir! It would be interesting to know how you'd implement multiple instances of the same library -- looking at some loaders I got the general impression that it could be easy since they're basically doing it in the form of one big function.

Also, having modules that provide/require syntax in JS would be amazing IMO. Getting macros is one thing, but getting how to do them with a module system is a whole new level -- and that's exactly the thing that Racket did earlier than all other Schemes, eventually making it a language for implementing PLs.

dead-claudia commented 8 years ago

What's the status of this?

disnet commented 8 years ago

@isiahmeadows we have basic support implemented (http://sweetjs.org/doc/1.0/tutorial.html#_sweet_modules) for importing a macro along with importing for syntax.

It's not full support by any means. It's only one phase level (no recursive imports) and only the import { named } form is supported.

I'm currently hacking on it though so full support will be coming Real Soon™️.

dead-claudia commented 8 years ago

Yay! :-)
