Open bterlson opened 8 years ago
Just to gather a few constraints:
Edit: I see @bterlson already mentioned the exposing-to-the-loader constraint, apologies for the dup.
@bterlson Thanks for starting this discussion. I think it's very important for TC39 to set some kind of precedent here since platforms (e.g. DOM) will also likely want to start putting things into their own "standard" modules.
To throw another strawman out there, could we not also use a URI scheme for this purpose?
import { float32x4 } from "std:SIMD";
I'm concerned about this form:
import SIMD;
since it would presumably have non-local effects (by adding properties to the global object). If the bindings are local instead, I would object to it on the same grounds as import * from.
(Updated OP with new information)
I'm concerned about this form:
import SIMD;
The way I see it this form is simply a shortcut for import SIMD from SIMD. There are no properties added to the global, SIMD is bound in the module environment record as normal imports are.
To clarify the polyfill scenario: I must be able to write code that can mutate, overwrite, create, or freeze, a built-in module, just like I can do right now with a built-in global. Mutation/overwriting is for when engines inevitably ship bugs, and shims want to fix them - freezing is for things like SES that want guarantees that nobody can maliciously mutate/overwrite builtin modules later - creating is to provide new modules in older environments.
I also agree that whatever precedent we set should pave a cowpath for engines to add non-language-builtin builtin modules in a non-colliding way, while ensuring the same capabilities I mentioned in the previous paragraph.
Also, the IdentifierName variant
import {float32x4} from SIMD;
would be future-hostile to lexical modules, FWIW. And in general, I think it's "lexically surprising" to see an identifier in that position referring to something that's not in scope.
The way I see it this form is simply a shortcut for import SIMD from SIMD
For consistency, users need to write that as:
import * as SIMD from "<whatever>";
We should try not to special-case or give special meaning to forms which import built-ins.
Sorry, or
import SIMD from "<whatever>";
if it exports a default.
And in general, I think it's "lexically surprising" to see an identifier in that position referring to something that's not in scope.
I agree with this.
Of the positions so far I like a "std:" prefix the most. It reuses an existing part of the module resolution space (absolute URLs) in a way that can't conflict (std: is not a valid URL scheme today).
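The "absolute URL" point can be checked with the WHATWG URL constructor available in browsers and Node.js. This is only a sketch; the specifier "std:SIMD" is the strawman from this thread, not an agreed name.

```javascript
// Sketch: a "std:"-prefixed specifier parses as an absolute URL with a
// non-special scheme, so it is never resolved relative to the importing file.
const stdSpecifier = new URL("std:SIMD");

const stdScheme = stdSpecifier.protocol; // the scheme, including the colon
const stdName = stdSpecifier.pathname;   // everything after the scheme

console.log(stdScheme, stdName);
```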
@ljharb
To clarify the polyfill scenario: I must be able to write code that can mutate, overwrite, or freeze, a built-in module, just like I can do right now with a built-in global. Mutation/overwriting is for when engines inevitably ship bugs, and shims want to fix them - freezing is for things like SES that want guarantees that nobody can maliciously mutate/overwrite builtin modules later.
You can't really mutate a module or a named export you are importing; in the case of modules, it is likely to be just a shimming process via export and export-from syntax. We have all the pieces in place to support this use-case today.
@caridy Maybe @ljharb just means replace the registry entry, which definitely is a requirement for polyfilling. (This is why it was important that even though modules cannot themselves be mutated from the outside, we still made it possible to mutate the registry.)
Exactly that, yes ^ sorry that wasn't clear.
Another problem with the IdentifierName syntax for module specifiers is that it is hostile to existing tools that already parse import statements. Keeping the string syntax means that tools that don't care about the actual semantics of the import don't need to change.
@ljharb Regarding "freezing", any actual object values exported by a module can certainly be frozen using the usual techniques.
From a core language perspective, I think very little would have to be said about the semantics of supporting built-in module shimming. Basically, an import of a module identified using a built-in module designator that is recognized by an implementation uses the built-in implementation unless the active module loader has explicitly registered an interest in handling that module. Unrecognized built-in modules and those that the loader has registered for over-rides are simply passed on to the loader.
Loader APIs of course have to provide for registering built-in over-rides. But a simple implementation that only supports a single built-in loader doesn't need to even worry about that case.
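The rule described above could be sketched roughly as follows. Every name here (builtinModules, createLoader, override, resolveImport) is hypothetical and invented for illustration; nothing in it comes from a spec or an existing loader API.

```javascript
// Hypothetical: the built-ins this imaginary engine recognizes.
const builtinModules = new Map([["std:SIMD", { name: "engine SIMD" }]]);

// Hypothetical loader that lets shims register overrides for designators.
function createLoader() {
  const overrides = new Map();
  return {
    override(specifier, moduleNamespace) {
      overrides.set(specifier, moduleNamespace);
    },
    hasOverride(specifier) {
      return overrides.has(specifier);
    },
    load(specifier) {
      if (overrides.has(specifier)) return overrides.get(specifier);
      throw new Error("Cannot resolve " + specifier);
    },
  };
}

function resolveImport(specifier, loader) {
  // A recognized built-in wins unless the loader registered an override.
  if (builtinModules.has(specifier) && !loader.hasOverride(specifier)) {
    return builtinModules.get(specifier);
  }
  // Unrecognized built-ins and overridden ones are passed on to the loader.
  return loader.load(specifier);
}
```

With this shape, a polyfill for a module the engine lacks and a fix for a buggy built-in both go through the same override call.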
BTW, I also like: "std:SIMD" as long as we are confident that we can safely use "std:" without tripping over any other URI protocol.
I only threw out "*SIMD" (with an escapable "*") as a strawman in anticipation that there might be concerns about conflicts with things like "std:".
@allenwb i meant, freezing it in the registry so further changes to "what gets imported" are impossible, ie, not just Object.freeze on the export, but Module.freeze on the registry entry, or similar.
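To make the distinction concrete, here is a minimal sketch of a registry whose entries can be locked. ModuleRegistry and its methods are hypothetical names chosen for illustration; no such API has been proposed in this thread beyond the idea of "Module.freeze or similar".

```javascript
// Hypothetical registry: freezing an entry blocks later replacement, which
// is different from calling Object.freeze on the exported object itself.
class ModuleRegistry {
  #entries = new Map();
  #frozen = new Set();

  set(specifier, moduleNamespace) {
    if (this.#frozen.has(specifier)) {
      throw new TypeError("Registry entry for " + specifier + " is frozen");
    }
    this.#entries.set(specifier, moduleNamespace);
  }

  get(specifier) {
    return this.#entries.get(specifier);
  }

  freeze(specifier) {
    this.#frozen.add(specifier);
  }
}
```

An SES-style setup would install or verify its entries first and then freeze them, so later code cannot change what an import resolves to.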
I would only hope that whatever the implementation, that consumption/publication of CJS/Node and ES6 modules can interoperate... perhaps following browserify and webpack's logic in this regard.
Or at least some awareness...
@ljharb
i meant, freezing it in the registry
ok, then it seems like an orthogonal issue in the design of the module loader API and really doesn't impact the idea of specifying a way to designate standard built-in modules.
I would think that for SES purposes, the ability to lock down a builtin module from being replaced in the registry is a blocker - @erights?
Let me try to be clearer. In the absence of a standardized or implementation provided module loader that exposes a module registry there is no way to replace (or lock down) a built-in module. So, from the perspective of the core semantics of import this is a non-issue and shouldn't stand in the way of specifying built-in modules.
Certainly browsers and most other significant implementations will provide such capabilities so the module loader specifications need to address it. But it shouldn't be a blocker that forces us to avoid defining built-in modules.
I also like the module specifier "std:SIMD". IANA doesn't have any "std" scheme registered which is a good sign but of course there could be unregistered usage out there.
What about using :SIMD instead, then we don't have to worry about "std" being a valid scheme.
Just esthetically, the leading colon looks pleasing to me. And as @nbdd0121 observes, we don't need to worry about the empty string becoming a valid scheme name.
Of course, the obvious scheme would be "js:" or "es:".
Do any known file systems assign meaning to a leading ":" in file/path/device names?
Of course, if we are worried about running into that, we could do "::" escaping like I suggested for "*".
Regardless, I like the explicitness of intent we would get with "std:".
Perhaps the scheme, whatever it was, could be optional for use in the case of resolving ambiguity?
Please no - optional things cause ambiguity and would present a refactoring hazard if you suddenly added another import that made it ambiguous. Whatever format is decided, it should be always required.
Do any known file systems assign meaning to a leading ":" in file/path/device names?
Of course, if we are worried about running into that, we could do "::" escaping like I suggested for "*".
The old Mac OS uses ":" as the path name separator. Names are absolute by default, a leading ":" makes them relative, multiple leading ":" walks up the file system tree, eg, ":foo" is in the cwd, "::foo" is in the parent, etc.
The problem with ":test" is that parsed against "https://example.com/" you'd get "https://example.com/:test". This is not a huge problem since we outlawed identifiers that do not start with "/", "./", or "../", but it's still somewhat surprising. If you'd decide to go down that route I'd vote for just using "test" since it's identical from a processing perspective and looks much better.
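The parsing behaviour described here can be reproduced with the WHATWG URL constructor (browsers, Node.js). A small sketch, using the example base URL from the comment:

```javascript
// Both specifiers lack a scheme, so they resolve against the base URL.
const base = "https://example.com/";

const leadingColon = new URL(":test", base).href; // colon ends up in the path
const bareName = new URL("test", base).href;      // plain relative reference

console.log(leadingColon);
console.log(bareName);
```

Both forms take the same relative-resolution path; the leading colon simply becomes part of the resulting path segment.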
Given that much of ECMAScript's syntax is already C-like, what about just adopting the C/C++ preprocessor syntax to identify built-in modules?
Strawman: C++ style scheme:
import <SIMD>;
import {float32x4} from <SIMD>;
import SIMD from <SIMD>;
...For reference, here's the syntax used in C++:
#include <vector> // a standard library header
#include <experimental/filesystem> // an experimental standard library header
#include "mylibrary.h" // a user-defined header file
@msegado I disagree. In C/C++, header name tokens are a special case for the preprocessor and are handled as part of tokenization. I believe introducing this syntax to ECMAScript would make the parser even more complicated.
@msegado: This was brought up somewhere deep in the nest of comments above, but one requirement we have is the ability for users to easily polyfill new built-in modules in the future.
Because of this requirement, we're left with the need to use the same syntactic space as userland modules -- so coming up with some conventional string pattern that fits neatly in with our loader-string specs is probably the best path forward.
@jeffmo My apologies, I should have read the comments in more detail! Yes, that makes perfect sense; it's probably not worth introducing new syntax if it needs to resolve to a string in the loader anyway, and complicating the loader with separate treatment for builtins doesn't seem worthwhile.
Is this for ES only? Should Node.js (or browsers when they decide to stop polluting global) put their own built-ins under std?
@AlicanC Platforms should definitely not put their built-ins under std (we don't want it to become the new global object namespace). Hopefully though, the naming convention that we choose for ES built-ins would be usable for platforms. Perhaps:
import * as FS from "node:fs";
If platforms are to figure out their own names, then more than "std" will be susceptible to collision with IANA listings in the future.
I think a whole concept of module namespaces should be introduced and the namespace splitter should be something that makes the ModuleSpecifier an invalid URI, not ":". Then the spec can use the "std" namespace itself and probably reserve others for future use.
// Importing File
import MyComponent from './MyComponent.js';
import Q from 'https://mycdn.com/q.js';
// Importing Module from Global (Root?) Namespace
import Q from 'q';
// Importing Module in a Sub-Namespace
import SIMD from 'std^SIMD';
// or
import SIMD from 'std::SIMD';
I say ^ or :: because they make the ModuleSpecifier an invalid URI (right?) and eliminate the risk of any collisions. So if a platform wants to have a namespace called "http", it can.
@AlicanC There's probably some benefit to using syntactically valid URLs (something that can be used with new URL(...) for instance).
I agree that there's a theoretical problem with IANA scheme collision, but I'm not sure that it will be a problem in practice. It's certainly not a problem for Node. For the browser, maybe someone involved with HTML standards would like to offer an opinion? @domenic ?
@zenparsing @domenic I would really like to see the HTML spec define its own set of built-in modules and make the new features only available under those. (Just like the "new features only for https" thing.)
Even if we didn't have any collision concerns, I would still think that there should be a clear distinction between importing a path and a name. Is there really a good reason to make every ModuleSpecifier URL-parsable?
Also, would you like to standardize a way for specifying versions so we can actually make breaking changes and have opt-in modern APIs with specs that are not stuck in the early 90s?
import SIMD from 'std::simd';
import SIMD2 from 'std::simd@2';
import DOM6 from 'html::dom@6';
I agree that there's a theoretical problem with IANA scheme collision, but I'm not sure that it will be a problem in practice. It's certainly not a problem for Node. For the browser, maybe someone involved with HTML standards would like to offer an opinion? @domenic ?
There's no problem here. The schemes with actual behavior are well-defined by Fetch, and std is not one of them.
@zenparsing @domenic I would really like to see the HTML spec define its own set of built-in modules and make the new features only available under those. (Just like the "new features only for https" thing.)
There's very little motivation to do this. Globals have served the web platform well so far, and starting to make people go through extra hoops for new features doesn't really give us anything besides an inconsistent platform.
Also, would you like to standardize a way for specifying versions so we can actually make breaking changes and have opt-in modern APIs with specs that are not stuck in the early 90s?
This has been an antipattern on the web. Versioning specifiers like <!DOCTYPE html> or "use strict" cause engines to have to maintain two parallel separate mode implementations, which is a burden much worse than maintaining a compatible API. (That's why in other cases, e.g. <svg version="x">, the version specifier is completely ignored by the browser.)
This has been an antipattern on the web. Versioning specifiers like <!DOCTYPE html> or "use strict" cause engines to have to maintain two parallel separate mode implementations, which is a burden much worse than maintaining a compatible API.
I agree with the general point you make here, but not as applies to "use strict". The non-antipattern that has emerged on the web can only cope with growing and compatible standards. This is why, in the simplicity dimension, standards can generally only get worse over time. ES3 was a mess -- it didn't even have lexical scoping. Functions were not really encapsulated. I could go on. If we had to build the future of JavaScript on sloppy ES3 we would not have gotten very far. "use strict" is an amazing and rare thing: a successful subtractive effort by a standards body that may not break its customer's code.
I also agree with the literal point you make. The mode switch was a burden for engines. But the pain was worth it to rescue JavaScript from the ES3 mess.
While I understand your position in general, I think you'll find a variety of opinions on whether it was worth it.
I am certain that there are a variety of opinions about this!
With 1JS the value of strict mode was vastly diminished, perhaps even negated. One of them was a mistake, YMMV which one -- the net result is a high combinatorial complexity cost for fairly little benefit.
On 20 March 2016 at 00:48, Mark S. Miller notifications@github.com wrote:
I am certain of it!
— You are receiving this because you commented. Reply to this email directly or view it on GitHub https://github.com/tc39/ecma262/issues/395#issuecomment-198810270
Note that 1js stops at module and class boundaries, whose bodies are always necessarily strict. Both modules and classes are such attractive abstraction mechanisms that they may eventually dominate new code.
But yes, as you know, I agree (as I think you now do) that the 1js approach of introducing new features into both sloppy and strict was a mistake. Sloppy mode should have been kept to its original purpose -- an ES3 compatibility mode. We had no good reason to impose on ourselves the complexity burden of adapting the new features to somehow appear in sloppy code. At least we stopped this insanity at module and class boundaries.
Cheers, --MarkM
domenic states that there's no problem using the std scheme, so are we going to use std:name, or are we still trying to use an invalid URL, such as std::name?
It looks like you're reinventing URNs. If that is the case, why not just use them?
@nbdd0121 I believe that foo::bar is a syntactically valid URL (where the path component is ":bar"). So I don't think the double colon helps anything.
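This is easy to check against the WHATWG URL parser in browsers or Node.js. The sketch below uses a hypothetical helper, parsesAsAbsoluteUrl, to compare the separators suggested earlier in the thread.

```javascript
// "foo::bar" has scheme "foo" and a path that starts with the second colon.
const doubleColon = new URL("foo::bar");
const doubleColonScheme = doubleColon.protocol;
const doubleColonPath = doubleColon.pathname;

// Hypothetical helper: true when the specifier parses without a base URL.
function parsesAsAbsoluteUrl(specifier) {
  try {
    new URL(specifier);
    return true;
  } catch {
    return false;
  }
}

console.log(doubleColonScheme, doubleColonPath);
console.log(parsesAsAbsoluteUrl("std::SIMD"), parsesAsAbsoluteUrl("std^SIMD"));
```

So of the two separators suggested above, only "^" actually prevents the specifier from parsing as an absolute URL.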
what about
import {float32x4} from "https://www.ecma-international.org/simd";
@graingert I don't think it is particularly practical for a standard library to be so verbose. This might remind many of the verbosity of older DTDs, which most people just learned to copy-and-paste. I'm sure most people wouldn't want to go back to something like that.
If foo::bar or foo:bar are valid URLs already, a more attractive choice could be to consider other sigils than ":".
@kdex it's not so bad for w3 apis:
import fetch, {Response, Request} from "https://w3.org/fetch";
You can't just squat schemes like that.
Built-in modules come up repeatedly around the various proposals. I am making this issue to centralize the discussion such that champions of the eventual proposal have a good central location for information. I will keep this up-to-date if there is further discussion in this issue.
Existing Discussion
Syntax Options
Naming convention inside module specifier
This option entails establishing a naming convention for built-in modules. It has to be compatible with existing ecosystems, e.g. we cannot clobber an npm or standard node package name.
Strawman: *-sigil
Strawman: URL scheme
Distinct syntax for built-in module imports
This option necessitates additional reflective capabilities in the Loader API to request built-in modules by name as opposed to going through the normal module resolution process.
Strawman: IdentifierName instead of StringLiteral
Semantics Requirements
Must defer to loader to resolve built-in modules (important for polyfillability). Loader may see either a special string or possibly an options record indicating that a built-in module is requested.