tc39 / ecma262

Status, process, and documents for ECMA-262
https://tc39.es/ecma262/

Built-in Modules #395

Open bterlson opened 8 years ago

bterlson commented 8 years ago

Built-in modules come up repeatedly around the various proposals. I am making this issue to centralize the discussion such that champions of the eventual proposal have a good central location for information. I will keep this up-to-date if there is further discussion in this issue.

Existing Discussion

This option entails establishing a naming convention for built-in modules. The convention has to be compatible with existing ecosystems; e.g., we cannot clobber an npm or standard Node package name.

Strawman: *-sigil

import "*SIMD";
import {float32x4} from "*SIMD";
import SIMD from "*SIMD";

Strawman: URL scheme

import { float32x4 } from "std:SIMD";

Distinct syntax for built-in module imports

This option necessitates additional reflective capabilities in the Loader API to request built-in modules by name as opposed to going through the normal module resolution process.

Strawman: IdentifierName instead of StringLiteral

import SIMD;
import {float32x4} from SIMD;
import SIMD from SIMD;

Semantics Requirements

Must defer to loader to resolve built-in modules (important for polyfillability). Loader may see either a special string or possibly an options record indicating that a built-in module is requested.

dherman commented 8 years ago

Just to gather a few constraints:

Edit: I see @bterlson already mentioned the exposing-to-the-loader constraint, apologies for the dup.

zenparsing commented 8 years ago

@bterlson Thanks for starting this discussion. I think it's very important for TC39 to set some kind of precedent here since platforms (e.g. DOM) will also likely want to start putting things into their own "standard" modules.

To throw another strawman out there, could we not also use a URI scheme for this purpose?

import { float32x4 } from "std:SIMD";

I'm concerned about this form:

import SIMD;

since it would presumably have non-local effects (by adding properties to the global object). If the bindings are local instead, I would object to it on the same grounds as import * from.

bterlson commented 8 years ago

(Updated OP with new information)

I'm concerned about this form: import SIMD;

The way I see it this form is simply a shortcut for import SIMD from SIMD. There are no properties added to the global, SIMD is bound in the module environment record as normal imports are.

ljharb commented 8 years ago

To clarify the polyfill scenario: I must be able to write code that can mutate, overwrite, create, or freeze, a built-in module, just like I can do right now with a built-in global. Mutation/overwriting is for when engines inevitably ship bugs, and shims want to fix them - freezing is for things like SES that want guarantees that nobody can maliciously mutate/overwrite builtin modules later - creating is to provide new modules in older environments.

I also agree that whatever precedent we set should pave a cowpath for engines to add non-language-builtin builtin modules in a non-colliding way, while ensuring the same capabilities I mentioned in the previous paragraph.

zenparsing commented 8 years ago

Also, the IdentifierName variant

import {float32x4} from SIMD;

would be future-hostile to lexical modules, FWIW. And in general, I think it's "lexically surprising" to see an identifier in that position referring to something that's not in scope.

The way I see it this form is simply a shortcut for import SIMD from SIMD

For consistency, users need to write that as:

import * as SIMD from "<whatever>";

We should try not to special-case or give special meaning to forms which import built-ins.

zenparsing commented 8 years ago

Sorry, or

import SIMD from "<whatever>";

if it exports a default.

domenic commented 8 years ago

And in general, I think it's "lexically surprising" to see an identifier in that position referring to something that's not in scope.

I agree with this.

Of the positions so far I like a "std:" prefix the most. It reuses an existing part of the module resolution space (absolute URLs) in a way that can't conflict (std: is not a valid URL scheme today).

caridy commented 8 years ago

@ljharb

To clarify the polyfill scenario: I must be able to write code that can mutate, overwrite, or freeze, a built-in module, just like I can do right now with a built-in global. Mutation/overwriting is for when engines inevitably ship bugs, and shims want to fix them - freezing is for things like SES that want guarantees that nobody can maliciously mutate/overwrite builtin modules later.

You can't really mutate a module or a named export you are importing. In the case of modules, it is likely to be just a shimming process via export and export-from syntax. We have all the pieces in place to support this use case today.

dherman commented 8 years ago

@caridy Maybe @ljharb just means replace the registry entry, which definitely is a requirement for polyfilling. (This is why it was important that even though modules cannot themselves be mutated from the outside, we still made it possible to mutate the registry.)

ljharb commented 8 years ago

Exactly that, yes ^ sorry that wasn't clear.

allenwb commented 8 years ago

Another problem with the IdentifierName syntax for module specifiers is that it is hostile to existing tools that already parse import statements. Keeping the string syntax means that tools that don't care about the actual semantics of the import don't need to change.

allenwb commented 8 years ago

@ljharb Regarding "freezing", any actual object values exported by a module can certainly be frozen using the usual techniques.

allenwb commented 8 years ago

From a core language perspective, I think very little would have to be said about the semantics of supporting built-in module shimming. Basically, an import of a module identified using a built-in module designator that is recognized by an implementation uses the built-in implementation unless the active module loader has explicitly registered an interest in handling that module. Unrecognized built-in modules and those that the loader has registered overrides for are simply passed on to the loader.

Loader APIs of course have to provide for registering built-in over-rides. But a simple implementation that only supports a single built-in loader doesn't need to even worry about that case.
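
A minimal sketch of the rule described above, under stated assumptions: `engineBuiltIns`, `SketchLoader`, `registerOverride`, and `resolveImport` are hypothetical names for illustration and do not come from any specification or real loader API, and the "std:" prefix is only the strawman from this thread.

```js
// Hypothetical sketch: none of these names exist in a spec or a real loader.
const BUILT_IN_PREFIX = "std:"; // assumed built-in module designator

// Stand-ins for the modules the engine itself provides.
const engineBuiltIns = new Map([
  ["std:SIMD", { source: "engine", name: "SIMD" }],
]);

class SketchLoader {
  constructor() {
    // Built-in specifiers this loader has registered an interest in handling.
    this.overrides = new Map();
  }

  // The polyfill hook: take over (or newly provide) a built-in specifier.
  registerOverride(specifier, moduleRecord) {
    this.overrides.set(specifier, moduleRecord);
  }

  // In this sketch the loader only knows about its registered overrides.
  load(specifier) {
    if (this.overrides.has(specifier)) {
      return this.overrides.get(specifier);
    }
    throw new Error(`Cannot resolve module "${specifier}"`);
  }
}

function resolveImport(specifier, loader) {
  const isBuiltInDesignator = specifier.startsWith(BUILT_IN_PREFIX);
  const isOverridden = loader.overrides.has(specifier);
  if (isBuiltInDesignator && engineBuiltIns.has(specifier) && !isOverridden) {
    // Recognized by the engine and not claimed by the loader.
    return engineBuiltIns.get(specifier);
  }
  // Unrecognized or overridden built-ins, and all other specifiers,
  // are passed on to the loader.
  return loader.load(specifier);
}
```

With no override registered, `resolveImport("std:SIMD", loader)` yields the engine's module; after `loader.registerOverride("std:SIMD", shim)` it yields the shim. An implementation with a single built-in loader could drop the override map entirely.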

allenwb commented 8 years ago

BTW, I also like: "std:SIMD" as long as we are confident that we can safely use "std:" without tripping over any other URI protocol.

I only threw out "*SIMD" (with an escapable *) as a strawman in anticipation that there might be concerns about conflicts with things like "std:".

ljharb commented 8 years ago

@allenwb i meant, freezing it in the registry so further changes to "what gets imported" are impossible, ie, not just Object.freeze on the export, but Module.freeze on the registry entry, or similar.
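
To make the distinction concrete, here is a rough sketch, assuming a registry keyed by specifier. `SketchRegistry` and its `set`, `get`, and `freeze` methods are hypothetical and only stand in for whatever "Module.freeze on the registry entry, or similar" might look like.

```js
// Hypothetical registry; the class and its methods are illustrative only.
class SketchRegistry {
  constructor() {
    this.entries = new Map(); // specifier -> stand-in for a module namespace
    this.frozenSpecifiers = new Set(); // entries that may no longer be replaced
  }

  set(specifier, moduleNamespace) {
    if (this.frozenSpecifiers.has(specifier)) {
      throw new TypeError(`Registry entry "${specifier}" is frozen`);
    }
    this.entries.set(specifier, moduleNamespace);
  }

  get(specifier) {
    return this.entries.get(specifier);
  }

  freeze(specifier) {
    this.frozenSpecifiers.add(specifier);
  }
}

const registry = new SketchRegistry();
const original = Object.freeze({ version: "engine" });
registry.set("std:SIMD", original);

// Object.freeze protects the exported object itself,
// but the entry can still be swapped for a different object.
registry.set("std:SIMD", { version: "shim" });

// Freezing the entry is what stops later replacement.
registry.freeze("std:SIMD");
let replacementBlocked = false;
try {
  registry.set("std:SIMD", { version: "malicious" });
} catch (err) {
  replacementBlocked = err instanceof TypeError;
}
console.log(registry.get("std:SIMD").version, replacementBlocked); // shim true
```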

tracker1 commented 8 years ago

I would only hope that, whatever the implementation, consumption/publication of CJS/Node and ES6 modules can interoperate... perhaps following browserify and webpack's logic in this regard.

Or at least some awareness...

allenwb commented 8 years ago

@ljharb

i meant, freezing it in the registry

ok, then it seems like an orthogonal issue in the design of the module loader API and really doesn't impact the idea of specifying a way to designate standard built-in modules.

ljharb commented 8 years ago

I would think that for SES purposes, the ability to lock down a builtin module from being replaced in the registry is a blocker - @erights?

allenwb commented 8 years ago

Let me try to be clearer. In the absence of a standardized or implementation provided module loader that exposes a module registry there is no way to replace (or lock down) a built-in module. So, from the perspective of the core semantics of import this is a non-issue and shouldn't stand in the way of specifying built-in modules.

Certainly browsers and most other significant implementations will provide such capabilities, so the module loader specification needs to address it. But it shouldn't be a blocker that forces us to avoid defining built-in modules.

bterlson commented 8 years ago

I also like the module specifier "std:SIMD". IANA doesn't have any "std" scheme registered which is a good sign but of course there could be unregistered usage out there.

nbdd0121 commented 8 years ago

What about using :SIMD instead, then we don't have to worry about "std" being a valid scheme.

erights commented 8 years ago

Just esthetically, the leading colon looks pleasing to me. And as @nbdd0121 observes, we don't need to worry about the empty string becoming a valid scheme name.

rossberg commented 8 years ago

Of course, the obvious scheme would be "js:" or "es:".

allenwb commented 8 years ago

Do any known file systems assign meaning to a leading ":" in file/path/device names?

Of course, if we are worried about running into that, we could do "::" escaping like I suggested for "*".

Regardless, I like the explicitness of intent we would get with "std:".

paultyng commented 8 years ago

Perhaps the scheme, whatever it was, could be optional for use in the case of resolving ambiguity?

ljharb commented 8 years ago

Please no - optional things cause ambiguity and would present a refactoring hazard if you suddenly added another import that made it ambiguous. Whatever format is decided, it should be always required.
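
One possible reading of the hazard, sketched under assumptions: `resolveWithOptionalPrefix` is a made-up resolver, and "a userland module with the same name appears" is my interpretation of how the ambiguity could arise, not something stated in the thread.

```js
// Hypothetical resolver in which the "std:" prefix is optional.
function resolveWithOptionalPrefix(specifier, userlandNames) {
  if (specifier.startsWith("std:")) {
    return { kind: "built-in", name: specifier.slice("std:".length) };
  }
  if (userlandNames.has(specifier)) {
    return { kind: "userland", name: specifier };
  }
  // No prefix and no userland match: silently fall back to the built-in.
  return { kind: "built-in", name: specifier };
}

// The same unprefixed import changes meaning once a userland "SIMD" exists.
const before = resolveWithOptionalPrefix("SIMD", new Set());
const after = resolveWithOptionalPrefix("SIMD", new Set(["SIMD"]));
// The always-prefixed form is unaffected.
const explicit = resolveWithOptionalPrefix("std:SIMD", new Set(["SIMD"]));
console.log(before.kind, after.kind, explicit.kind); // built-in userland built-in
```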

lars-t-hansen commented 8 years ago

Do any known file systems assign meaning to a leading ":" in file/path/device names?

Of course, if we are worried about running into that, we could do "::" escaping like I suggested for "*".

The old Mac OS uses ":" as the path name separator. Names are absolute by default, a leading ":" makes them relative, and multiple leading ":" walk up the file system tree, e.g., ":foo" is in the cwd, "::foo" is in the parent, etc.

More here, under Classic Mac OS

annevk commented 8 years ago

The problem with ":test" is that parsed against "https://example.com/" you'd get "https://example.com/:test". This is not a huge problem since we outlawed identifiers that do not start with "/", "./", or "../", but it's still somewhat surprising. If you'd decide to go down that route I'd vote for just using "test" since it's identical from a processing perspective and looks much better.
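
The parsing behavior described here can be checked with the standard `URL` constructor available in browsers and Node.js. This is only a quick illustration; the variable names are arbitrary.

```js
const base = "https://example.com/";

// A leading colon is not a scheme, so ":test" is treated as a relative path.
const withColon = new URL(":test", base).href;

// "test" goes through the same relative-path processing.
const plain = new URL("test", base).href;

console.log(withColon); // https://example.com/:test
console.log(plain); // https://example.com/test
```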

msegado commented 8 years ago

Given that much of ECMAScript's syntax is already C-like, what about just adopting the C/C++ preprocessor syntax to identify built-in modules?

Strawman: C++ style scheme:

import <SIMD>;
import {float32x4} from <SIMD>;
import SIMD from <SIMD>;

...For reference, here's the syntax used in C++:

#include <vector>  // a standard library header
#include <experimental/filesystem>  // an experimental standard library header
#include "mylibrary.h"  // a user-defined header file

nbdd0121 commented 8 years ago

@msegado I disagree. In C/C++, header name tokens are a special case for the preprocessor; they are part of tokenization. I believe introducing that syntax to ECMAScript would make the parser even more complicated.

jeffmo commented 8 years ago

@msegado: This was brought up somewhere deep in the nest of comments above, but one requirement we have is the ability for users to easily polyfill new built-in modules in the future.

Because of this requirement, we're left with the need to use the same syntactic space as userland modules -- so coming up with some conventional string pattern that fits neatly in with our loader-string specs is probably the best path forward.

msegado commented 8 years ago

@jeffmo My apologies, I should have read the comments in more detail! Yes, that makes perfect sense; it's probably not worth introducing new syntax if it needs to resolve to a string in the loader anyway, and complicating the loader with separate treatment for builtins doesn't seem worthwhile.

AlicanC commented 8 years ago

Is this for ES only? Should Node.js (or browsers when they decide to stop polluting global) put their own built-ins under std?

zenparsing commented 8 years ago

@AlicanC Platforms should definitely not put their built-ins under std (we don't want it to become the new global object namespace). Hopefully though, the naming convention that we choose for ES built-ins would be usable for platforms. Perhaps:

import * as FS from "node:fs";

AlicanC commented 8 years ago

If platforms are to figure out their own names, then more than "std" will be susceptible to collision with IANA listings in the future.

I think a whole concept of module namespaces should be introduced and the namespace splitter should be something that makes the ModuleSpecifier an invalid URI, not :. Then the spec can use the "std" namespace itself and probably reserve others for future use.

// Importing File
import MyComponent from './MyComponent.js';
import Q from 'https://mycdn.com/q.js';

// Importing Module from Global (Root?) Namespace 
import Q from 'q';

// Importing Module in a Sub-Namespace
import SIMD from 'std^SIMD';
// or
import SIMD from 'std::SIMD';

I say ^ or :: because they make the ModuleSpecifier an invalid URI (right?) and eliminate the risk of any collisions. So if a platform wants to have a namespace called "http", it can.

zenparsing commented 8 years ago

@AlicanC There's probably some benefit to using syntactically valid URLs (something that can be used with new URL(...) for instance).
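
As a rough illustration of that benefit, a loader could lean on `new URL(...)` to split a specifier into scheme and name. `classifySpecifier` is a hypothetical helper, and the "std" and "node" schemes are just the strawmen from this thread.

```js
// Hypothetical helper: classify a module specifier using the URL parser.
function classifySpecifier(specifier) {
  let url;
  try {
    url = new URL(specifier); // no base: only absolute URLs parse
  } catch {
    return { kind: "not-absolute", specifier };
  }
  return {
    kind: "absolute",
    scheme: url.protocol.slice(0, -1), // drop the trailing ":"
    path: url.pathname,
  };
}

console.log(classifySpecifier("std:SIMD")); // scheme "std", path "SIMD"
console.log(classifySpecifier("node:fs")); // scheme "node", path "fs"
console.log(classifySpecifier("https://mycdn.com/q.js")); // scheme "https", path "/q.js"
console.log(classifySpecifier("./MyComponent.js").kind); // not-absolute
console.log(classifySpecifier("q").kind); // not-absolute
```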

I agree that there's a theoretical problem with IANA scheme collision, but I'm not sure that it will be a problem in practice. It's certainly not a problem for Node. For the browser, maybe someone involved with HTML standards would like to offer an opinion? @domenic ?

AlicanC commented 8 years ago

@zenparsing @domenic I would really like to see the HTML spec define its own set of built-in modules and make the new features only available under those. (Just like the "new features only for https" thing.)


Even if we didn't have any collision concerns, I would still think that there should be a clear distinction between importing a path and a name. Is there really a good reason to make every ModuleSpecifier URL-parsable?


Also, would you like to standardize a way for specifying versions so we can actually make breaking changes and have opt-in modern APIs with specs that are not stuck in the early 90s?

import SIMD from 'std::simd';
import SIMD2 from 'std::simd@2';
import DOM6 from 'html::dom@6';

domenic commented 8 years ago

I agree that there's a theoretical problem with IANA scheme collision, but I'm not sure that it will be a problem in practice. It's certainly not a problem for Node. For the browser, maybe someone involved with HTML standards would like to offer an opinion? @domenic ?

There's no problem here. The schemes with actual behavior are well-defined by Fetch, and std is not one of them.

@zenparsing @domenic I would really like to see the HTML spec define its own set of built-in modules and make the new features only available under those. (Just like the "new features only for https" thing.)

There's very little motivation to do this. Globals have served the web platform well so far, and starting to make people go through extra hoops for new features doesn't really give us anything besides an inconsistent platform.

Also, would you like to standardize a way for specifying versions so we can actually make breaking changes and have opt-in modern APIs with specs that are not stuck in the early 90s?

This has been an antipattern on the web. Versioning specifiers like <!DOCTYPE html> or "use strict" cause engines to have to maintain two parallel separate mode implementations, which is a burden much worse than maintaining a compatible API. (That's why in other cases, e.g. <svg version="x">, the version specifier is completely ignored by the browser.)

erights commented 8 years ago

This has been an antipattern on the web. Versioning specifiers like <!DOCTYPE html> or "use strict" cause engines to have to maintain two parallel separate mode implementations, which is a burden much worse than maintaining a compatible API.

I agree with the general point you make here, but not as it applies to "use strict". The non-antipattern that has emerged on the web can only cope with growing and compatible standards. This is why, in the simplicity dimension, standards can generally only get worse over time. ES3 was a mess -- it didn't even have lexical scoping. Functions were not really encapsulated. I could go on. If we had to build the future of JavaScript on sloppy ES3 we would not have gotten very far. "use strict" is an amazing and rare thing: a successful subtractive effort by a standards body that may not break its customers' code.

I also agree with the literal point you make. The mode switch was a burden for engines. But the pain was worth it to rescue JavaScript from the ES3 mess.

domenic commented 8 years ago

While I understand your position in general, I think you'll find a variety of opinions on whether it was worth it.

erights commented 8 years ago

I am certain that there are a variety of opinions about this!

rossberg commented 8 years ago

With 1JS the value of strict mode was vastly diminished, perhaps even negated. One of them was a mistake, YMMV which one -- the net result is a high combinatorial complexity cost for fairly little benefit.

erights commented 8 years ago

Note that 1js stops at module and class boundaries, whose bodies are always necessarily strict. Both modules and classes are such attractive abstraction mechanisms that they may eventually dominate new code.

But yes, as you know, I agree (as I think you now do) that the 1js approach of introducing new features into both sloppy and strict was a mistake. Sloppy mode should have been kept to its original purpose -- an ES3 compatibility mode. We had no good reason to impose on ourselves the complexity burden of adapting the new features to somehow appear in sloppy code. At least we stopped this insanity at module and class boundaries.

nbdd0121 commented 8 years ago

domenic states that there's no problem using the std scheme, so are we going to use std:name, or are we still trying to use an invalid URL, such as std::name?

gibson042 commented 8 years ago

It looks like you're reinventing URNs. If that is the case, why not just use them?

zenparsing commented 8 years ago

@nbdd0121 I believe that foo::bar is a syntactically valid URL (where the path component is ":bar"). So I don't think the double colon helps anything.
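
This is easy to check with the standard `URL` constructor in browsers and Node.js. The `parsesAsAbsoluteUrl` helper below is made up for illustration, and the specifiers are the ones floated earlier in the thread.

```js
// "foo::bar" parses: the scheme is "foo" and the path is ":bar".
const doubleColon = new URL("foo::bar");
console.log(doubleColon.protocol, doubleColon.pathname); // foo: :bar

// Hypothetical helper: does the specifier parse as an absolute URL?
function parsesAsAbsoluteUrl(specifier) {
  try {
    new URL(specifier);
    return true;
  } catch {
    return false;
  }
}

console.log(parsesAsAbsoluteUrl("std::SIMD")); // true
console.log(parsesAsAbsoluteUrl("std^SIMD")); // false
```

So of the two separators suggested earlier, only "^" actually produces something the URL parser rejects when no base URL is supplied.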

graingert commented 7 years ago

what about

import {float32x4} from "https://www.ecma-international.org/simd";

ghost commented 7 years ago

@graingert I don't think it is particularly practical for a standard library to be so verbose. This might remind many of the verbosity of older DTDs, which most people just learned to copy-and-paste. I'm sure most people wouldn't want to go back to something like that.

If foo::bar or foo:bar are valid URLs already, a more attractive choice could be to consider other sigils than :.

graingert commented 7 years ago

@kdex it's not so bad for w3 apis:

import fetch, {Response, Request} from "https://w3.org/fetch";

graingert commented 6 years ago

You can't just squat schemes like that.