Possible new paths configuration scheme

jorendorff commented 11 years ago

Nobody's terribly fond of the ondemand table. What if instead:

System.ondemand = {"*": "*.js"};   // this is the default

System.ondemand = {
    "jquery": "/scripts/jquery-1.9.1.min.js",
    "dherman/*": "https://github.com/dherman/*.js",
    "ember/*": "/scripts/ember.js"
};

Here we've got three different kinds of entries:

If I import "jquery", it'll be fetched from the given URL as a module body.
If I import "dherman/task", it'll be fetched from https://github.com/dherman/task.js, again as a module body.
If I import "ember/cows", it'll be fetched from /scripts/ember.js, but because the LHS contains a * and the RHS doesn't, the loader expects that to be a script with multiple modules in it.

This doesn't support all possible ways you might want to bundle stuff, but you can do the more complicated stuff with a custom resolve hook, and it's not even hard.
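
A minimal sketch of how such a table could be applied to a module name. The helper name `resolveFromTable` is hypothetical and not part of any loader API, and the rule that the longest matching pattern wins is an assumption; the proposal above does not say how overlapping entries are ranked.

```javascript
// Hypothetical helper, not part of the Loader or System API.
// Maps a module name to an address using a table of "pattern": "address" entries.
function resolveFromTable(table, name) {
  let best = null;
  for (const pattern of Object.keys(table)) {
    const star = pattern.indexOf("*");
    let captured = null;
    if (star === -1) {
      // No wildcard: the entry applies to exactly one module name.
      if (pattern === name) captured = "";
    } else {
      // One wildcard: everything between the fixed prefix and suffix is captured.
      const prefix = pattern.slice(0, star);
      const suffix = pattern.slice(star + 1);
      if (name.length >= prefix.length + suffix.length &&
          name.startsWith(prefix) && name.endsWith(suffix)) {
        captured = name.slice(prefix.length, name.length - suffix.length);
      }
    }
    // Assumption: when several patterns match, the longest pattern wins.
    if (captured !== null && (best === null || pattern.length > best.pattern.length)) {
      best = { pattern, captured };
    }
  }
  if (best === null) return null;
  // If the right-hand side has no "*", the address is returned unchanged,
  // which is the "many modules in one script" case described above.
  return table[best.pattern].replace("*", () => best.captured);
}

const exampleTable = {
  "jquery": "/scripts/jquery-1.9.1.min.js",
  "dherman/*": "https://github.com/dherman/*.js",
  "ember/*": "/scripts/ember.js"
};
console.log(resolveFromTable(exampleTable, "dherman/task"));
// prints "https://github.com/dherman/task.js"
```

With the default table `{"*": "*.js"}`, the same function turns `"foo/bar"` into `"foo/bar.js"`.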

samth commented 11 years ago

Conversation with @dherman and @jorendorff confirms that we like this.

jorendorff commented 11 years ago

ISTR more recent discussion on this but can't find it.

So we're dropping bundle support from the Loader internals. Scripts located in bundles will now be addressed using special address strings, either in a forthcoming zip URL syntax (to be determined) or something effectively equivalent but only for the browser System loader.

I think that makes the wildcards even more attractive:

    "ember/*": "/scripts/ember.zip%!*.js"

Awesome. Separately, I think @dherman likes calling this System.paths rather than .ondemand.

guybedford commented 11 years ago

I like the concepts here a lot as well. I think it makes a lot of sense to allow bulk path mappings like this.

My worry is that "*" feels like a lot of people will expect more detailed glob patterns out of this.

Perhaps one could let the trailing slash itself indicate that we are mapping all subpaths, while a lack of a trailing slash would indicate just this module ID only.

Then the initial examples become:

System.paths = {
    "jquery": "/scripts/jquery-1.9.1.min.js",
    "dherman/": "https://github.com/dherman/",
    "ember/": "/scripts/ember.zip%!"
};

(using a paths name instead of ondemand)

The only case this doesn't work with is the other example of having all ember subpaths map to a single file. But given the consideration of archive-based URLs replacing module syntax, this example would no longer make sense, so the above would work out quite nicely.
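
A rough sketch of the trailing-slash variant, for comparison. The function name `resolveTrailingSlash` is made up for illustration, the longest-key-wins rule is an assumption, and no file extension is appended because the example entries above do not include one.

```javascript
// Hypothetical helper illustrating the trailing-slash idea; not a real loader API.
function resolveTrailingSlash(paths, name) {
  let bestKey = null;
  for (const key of Object.keys(paths)) {
    const matches = key.endsWith("/")
      ? name.startsWith(key) && name.length > key.length // key covers all subpaths
      : name === key;                                      // key covers one module ID only
    // Assumption: the longest matching key wins.
    if (matches && (bestKey === null || key.length > bestKey.length)) {
      bestKey = key;
    }
  }
  if (bestKey === null) return null;
  return bestKey.endsWith("/")
    ? paths[bestKey] + name.slice(bestKey.length)
    : paths[bestKey];
}

const slashPaths = {
  "jquery": "/scripts/jquery-1.9.1.min.js",
  "dherman/": "https://github.com/dherman/",
  "ember/": "/scripts/ember.zip%!"
};
console.log(resolveTrailingSlash(slashPaths, "ember/cows"));
// prints "/scripts/ember.zip%!cows"
```

Note that `"jquery/ui"` would not match the `"jquery"` entry here, since that key has no trailing slash.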

The concepts of archive-based URLs sound great to me as well. The closer bundling can be moved out to the network layer, the better.

Always happy to discuss further.

jorendorff commented 11 years ago

My worry is that "*" feels like a lot of people will expect more detailed glob patterns out of this.

@dherman was a little worried about that too. Can you give a specific example of the sort of thing that people might expect to work, that wouldn't work?

guybedford commented 11 years ago

It's difficult to know what people might think, so not sure how valid this is. But one example would be that it might not be immediately clear that

  "ember/*": "/scripts/ember.zip%!*.js"

will apply to all deep subpaths including say ember/nested/module. So it might be expected that this should be indicated by something like ember/**/* for example.
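
To make the two readings concrete, here is a small sketch that matches a single `*` either across path segments ("deep") or within one segment only (unix-glob style). `matchStar` and its `deep` flag are hypothetical names used only for this illustration.

```javascript
// Hypothetical matcher: returns the text captured by "*", or null if no match.
// deep = true  -> "*" may span "/" (the behaviour proposed in this thread)
// deep = false -> "*" stops at "/" (what unix glob users might expect)
function matchStar(pattern, name, deep) {
  const star = pattern.indexOf("*");
  if (star === -1) return null;
  const prefix = pattern.slice(0, star);
  const suffix = pattern.slice(star + 1);
  if (name.length < prefix.length + suffix.length) return null;
  if (!name.startsWith(prefix) || !name.endsWith(suffix)) return null;
  const captured = name.slice(prefix.length, name.length - suffix.length);
  if (!deep && captured.includes("/")) return null;
  return captured;
}

console.log(matchStar("ember/*", "ember/nested/module", true));  // prints "nested/module"
console.log(matchStar("ember/*", "ember/nested/module", false)); // prints null
```

Under the deep reading, substituting the capture into "/scripts/ember.zip%!*.js" gives /scripts/ember.zip%!nested/module.js.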

jorendorff commented 11 years ago

It's true, * only globs one path segment in unix. But I think programmers will just learn it. Not a huge obstacle.

I don't think people will write patterns like ember/**/* expecting them to work, and be disappointed. That doesn't work in most places where wildcards are used.

Maybe it would be better to pick a different character for the wildcard? I don't think so because using (say) % or @ won't make that distinction any more intuitively obvious.

Using a trailing slash might be better, but it doesn't allow you to add the .js extension. (And I don't think we want to "just add it for you"; much better for that to be present in the configuration.)

guybedford commented 11 years ago

Sure, % and @ are useful in URLs anyway. The trailing slash method could still work if the .js extension were added for trailing-slash paths only, and not for exact module name paths. That said, I do like the original syntax and would be more than happy with this.

guybedford commented 11 years ago

The more I think about it the more I am really happy with this wildcard mapping system. The character seems like the only thing needing to be confirmed here. Wondering if you think it's worth implementing this in the polyfill yet, or are things still being discussed?

jorendorff commented 11 years ago

Things are still being discussed.

The choice you face is a tricky question any time one implements ES features “ahead of” the finished specification. I find it tricky as I try to get this stuff implemented in Firefox. Here are the facts:

Whatever you (or I) implement is certain to change somewhat. It is impossible to predict the magnitude of the changes.
Your choice has consequences. Experience using a feature may help prove it, increasing the probability that it “sticks” (i.e. gets committee consensus and becomes standardized). Or it may discover problems with it.

My best guess: implementing System.paths would be good for modules, good for JS, and good for your users in the short term (after all, you've got to configure the loader somehow). But users should be warned that it is likely to change.

jorendorff commented 11 years ago

You may want to wait until Monday or Tuesday though. There was a TC39 meeting last week. I wasn't there but I hear there were some discussions about making loader configuration easier. I will post here as soon as I know more.

unscriptable commented 11 years ago

Hey guys,

I know I'm coming in to the middle of a conversation, so I'm missing some of the context, here. Hope you don't mind if I mention/ask a few things. :)

IMHO, the loader does not belong in the ECMAScript spec at all. It's the run-time environment's job to find modules! If anything, we should be providing guidance to whatwg. (@dherman easily convinced me of this last week after one of the TC39 meetings!)

We can only guess what the environment will implement, so creating a meta-language for mapping ids to paths seems unproductive. Therefore, I was hoping we'd leave bundles out of the spec altogether.

From another angle:

@jorendorff's example of the default path lookup: System.ondemand = {"*": "*.js"}; is seductively simple (and I like it). However, it's fairly ambiguous and begs lots of questions. Furthermore, it looks like it is just describing the behavior of the built-in resolve step of the System loader's pipeline, which, iirc, just ensures a ".js" extension.

This made me realize that the best way to describe complex path mapping operations is not with a meta-language, it's with a JavaScript function. And we already have this function: Loader.prototype.resolve.
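
As an illustration of the function-based approach, here is a sketch that expresses the same mappings as plain functions. These are standalone functions with made-up names; the exact signature of Loader.prototype.resolve is not spelled out in this thread, so nothing here should be read as that hook's real interface.

```javascript
// Hypothetical stand-in for the default behaviour: just ensure a ".js" extension.
function defaultResolve(name) {
  return name.endsWith(".js") ? name : name + ".js";
}

// Hypothetical custom resolution written as ordinary code instead of a table.
function customResolve(name) {
  if (name === "jquery") {
    return "/scripts/jquery-1.9.1.min.js";
  }
  const prefix = "dherman/";
  if (name.startsWith(prefix)) {
    return "https://github.com/dherman/" + name.slice(prefix.length) + ".js";
  }
  // Anything else falls back to the default rule.
  return defaultResolve(name);
}

console.log(customResolve("dherman/task")); // prints "https://github.com/dherman/task.js"
console.log(customResolve("app/main"));     // prints "app/main.js"
```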

Lastly, how does this fit with node.js, vert.x, RingoJS, etc.? These environments have their own scheme for translating module ids to paths that doesn't seem (imho) to fit well with any meta-language that I can envision atm.

So why specify path/bundle mappings at all in the spec or reference implementations?

-- John

jrburke commented 11 years ago

@unscriptable for me, it seems fairly obvious, particularly with the work AMD loaders have done around common config, that there are some useful, declarative config forms. The resolve hook overriding can work for cases that want fancier code, but I would not want to see an outcome where front end projects need to include a loader.js file that sets up this stuff by default. Then we are back to script loader scripts again for common cases.

I have questions about adequately supporting the equivalent of package config with this type of config language, and for AMD loaders we opted not to use a string language like this to keep it simple, but I would rather have some declarative config in there than nothing. But it sounds like waiting for the most recent TC39 fallout would be useful.

unscriptable commented 11 years ago

with the work AMD loaders have done around common config, that there are some useful, declarative config forms...

The problem is that nobody agrees on the config. AMD has a core "common config", but there are extensions amongst implementations. Ember.js uses something different, iiuc, and browserify would be necessarily different, as well. There's no way we could come up with one declarative config form to rule them all. Therefore, we should just use a function.

[without a common config] we are back to script loader scripts again for common cases.

I disagree that we need loader scripts. Here's why:

Even the simplest web app will likely need something like this at a minimum:

<script>
var loader = new Loader({ baseUrl: '/client' });
loader.import('run');
</script>

With a single zip url, this could look like this:

<script src="myapp.zip"></script>
<script>
var loader = new Loader({ baseUrl: '/myapp.zip%!' });
loader.import('run');
</script>

If the dev is using cujoJS, ember, dojo, etc. then they would very likely place the framework-specific code (including framework ES6 loader overrides) inside their run module. Therefore, the code would still look like this:

<script src="myCujoApp.zip"></script>
<script>
var loader = new Loader({ baseUrl: '/myCujoApp.zip%!' });
loader.import('run');
</script>

I have questions about adequately supporting the equivalent of package config with this type of config language...

This sounds like an argument for leaving the configuration out of the spec to me. :)

jrburke commented 11 years ago

The problem is that nobody agrees on the config. AMD has a core "common config", but there are extensions amongst implementations. Ember.js uses something different, iiuc, and browserify would be necessarily different, as well. There's no way we could come up with one declarative config form to rule them all. Therefore, we should just use a function.

I see it differently. The fact that AMD loaders agreed on a common config indicates those concepts are commonly useful. Of course some people will want to do more, and there is a resolve hook for that. But many cases are adequately satisfied with the existing declarative common config concepts.

browserify is not the best example in this discussion: it does not need ID-to-path resolution, it bundles modules by ID and retrieves them by those IDs. For build time bundling, it uses Node's resolution logic, which due to its multiple, nested IO lookups, would use a resolve hook. They can still do that and any declarative config supported natively in the loader would not interfere.

If the dev is using cujoJS, ember, dojo, etc. then they would very likely place the framework-specific code (including framework ES6 loader overrides) inside their run module. Therefore, the code would still look like this:

For me, if every one of those frameworks has that common config, then that is extra code delivered in all cases, and it indicates something the platform should handle. Bundling in a zip or as part of a built JS file does not seem like much of a mitigation. It still means needing a third party library assist to get into ES module loading.

I have questions about adequately supporting the equivalent of package config with this type of config language... This sounds like an argument for leaving the configuration out of the spec to me. :)

I was trying to express: if the string config in this ticket could not account for what is possible in package config (main module ID working slightly differently than sub-IDs), then the string config needs an adjustment, since we found, via agreeing on common config for AMD loaders, that it is useful.
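
A sketch of the distinction being described, loosely modelled on AMD-style package config where each package has a name, a location and a main module. The function name, the config shape used here and the appended ".js" are assumptions for illustration only.

```javascript
// Hypothetical package resolution: the bare package ID maps to its main module,
// while sub-IDs map to files under the package location.
function resolvePackage(packages, name) {
  for (const pkg of packages) {
    if (name === pkg.name) {
      return pkg.location + "/" + pkg.main + ".js";
    }
    if (name.startsWith(pkg.name + "/")) {
      return pkg.location + name.slice(pkg.name.length) + ".js";
    }
  }
  return null;
}

const examplePackages = [
  { name: "ember", location: "/scripts/ember", main: "main" }
];
console.log(resolvePackage(examplePackages, "ember"));      // prints "/scripts/ember/main.js"
console.log(resolvePackage(examplePackages, "ember/cows")); // prints "/scripts/ember/cows.js"
```

A single wildcard entry such as "ember/*" covers the second case but not the bare "ember" ID, so a string config would need a second, exact entry to express the same thing.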

The main overriding point:

We have already done some field testing via AMD loaders for some declarative config. We found some common concepts that were useful to express. Those are likely to also be applicable in the ES module world, particularly since AMD has been designed to work well in a networked environment like the browser, where multiple file IO lookups for an ID are not a good idea. It is the harder environment to work in.

Any declarative config also does not take anything away from anyone wanting fancier config via a resolve override. It only helps avoid code for many common project layouts.

guybedford commented 11 years ago

@unscriptable this paths implementation is part of the "System" browser loader, which is not part of the ES6 Loader spec, but the separate browser loader spec. So it is in a separate specification as far as I am aware (or at least in the process of being).

The benefit of having a System loader that works well "out of the box" is that users can get going with modules without needing to be in a world of custom loaders. The more the spec can do the better, as @jrburke says, we have enough real world knowledge now to know the common config needs (map, paths, packages, shim).

Just because we can't agree on "all" config, doesn't mean we don't agree on a good amount of standard config. Personally I think "map" and "paths" should both be in the System loader.

The baseURL for a loader can be set without needing to create a new loader instance:

  System.baseURL = '/myapp.zip%!';
  System.import('run');

guybedford commented 10 years ago

I'd be interested to hear if this is still being considered? Would quite like to start prototyping this to test it out if it is.

samth commented 10 years ago

Yes, I believe this is still the plan, in the browser loader implementation.

unscriptable commented 10 years ago

Hey guys,

I can see how we could have a config that maps module ids to urls. However, I still don't see how we can have a config that maps module ids to bundles. Last I heard, a bundle implies "script" semantics since there is no way to include several inline modules together. Is this still true? Does anybody have a link to the proposal for bundles / bundle format?

Thanks!

-- John

guybedford commented 10 years ago

@unscriptable sorry to hijack, but I'm interested to debate the original syntax here. I'm still very interested to hear the response to your question, I hope someone can answer this.

@jorendorff I'm still not sure I understand the added benefit of the wildcard syntax.

Rather, the same functionality can be captured without wildcards quite easily:

  // Note: urlRegEx, resolve and baseURL are assumed to be defined elsewhere.
  System.locate = function(load) {
    // a URL already -> just use
    if (load.name.match(urlRegEx))
      return load.name;

    // now substitute paths prefixes
    for (var p in System.paths) {
      if (load.name.substr(0, p.length) == p) {
        var nextChar = load.name.charAt(p.length);
        // part of a path -> substitute then treat just like baseURL resolution
        if (nextChar == '/')
          return System.paths[p] + load.name.substr(p.length) + '.js';
        // whole path match -> use the exact path given
        else if (nextChar == '')
          return System.paths[p];
      }
    }

    // otherwise just use baseURL (allow backtracking below baseURL too)
    return resolve(baseURL, load.name) + '.js';
  };

The above use cases all become:

System.paths = {
    "jquery": "/scripts/jquery-1.9.1.min.js",
    "dherman": "https://github.com/dherman",
    "ember": "/scripts/ember.zip%!"
};

Surely if there are two ways to do the same thing, one with a more advanced syntax, and another with a simpler syntax, the simpler option should be used?

What use case am I missing here that wildcards allow over such a system as above?

guybedford commented 10 years ago

Two suggestions from the discussion in https://github.com/jorendorff/js-loaders/issues/100#issuecomment-32390302:

It is possible to entirely replace baseURL by paths. We then resolve paths themselves as URLs relative to the current page if not absolute.
Leaving out the .js in paths makes them much simpler to read and write, without losing any flexibility.

Example standard configuration:

<script>
  System.paths['*'] = 'lib/*';
  System.paths['app/*'] = 'app/*';
</script>
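
A minimal sketch of how these two suggestions might combine: wildcard paths with no baseURL, ".js" appended automatically, and the result resolved against the page URL. The function name `locateViaPaths`, the longest-pattern-wins rule and the example.com page URL are all assumptions for illustration; only the standard `URL` constructor is a real API.

```javascript
// Hypothetical helper: resolves a module name through a paths table, appends ".js",
// then resolves the result relative to the current page unless it is already absolute.
function locateViaPaths(paths, name, pageURL) {
  let best = null;
  for (const pattern of Object.keys(paths)) {
    const star = pattern.indexOf("*");
    let captured = null;
    if (star === -1) {
      if (pattern === name) captured = "";
    } else {
      const prefix = pattern.slice(0, star);
      const suffix = pattern.slice(star + 1);
      if (name.length >= prefix.length + suffix.length &&
          name.startsWith(prefix) && name.endsWith(suffix)) {
        captured = name.slice(prefix.length, name.length - suffix.length);
      }
    }
    // Assumption: the longest matching pattern wins, so "app/*" beats "*".
    if (captured !== null && (best === null || pattern.length > best.pattern.length)) {
      best = { pattern, captured };
    }
  }
  if (best === null) return null;
  const path = paths[best.pattern].replace("*", () => best.captured) + ".js";
  return new URL(path, pageURL).href;
}

const standardPaths = { "*": "lib/*", "app/*": "app/*" };
const page = "https://example.com/site/index.html"; // placeholder page URL
console.log(locateViaPaths(standardPaths, "jquery", page));
// prints "https://example.com/site/lib/jquery.js"
console.log(locateViaPaths(standardPaths, "app/main", page));
// prints "https://example.com/site/app/main.js"
```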

jorendorff / js-loaders

Possible new paths configuration scheme #25