tc39 / proposal-compartments

Compartmentalization of host behavior hooks for JS
MIT License
122 stars 11 forks source link

Eager or Lazy import.meta construction #80

Open kriskowal opened 2 years ago

kriskowal commented 2 years ago

At the September 21, 2022 SES Strategy call we discussed three options for timing the construction of import.meta for a virtual module.

  1. Prior (before Module construction): The ModuleSource instance must have a public needsImportMeta property that indicates whether the source contains AST nodes for either import.meta or eval, and the module handler passed to Module(source, handler) must have a pre-built importMeta property that will be used by that module if it ever evaluates an import.meta expression.
  2. Eager (on the stack of Module construction): The Module Source Record behind a ModuleSource instance must have a [[NeedsImportMeta]] internal slot that indicates to the Module(source, handler) constructor that it must call handler.importMetaHook() to construct an import.meta and capture it on the Module instance’s [[ImportMeta]] internal slot before the constructor returns.
  3. Lazy (upon first evaluation) current spec text: The module handler must have an importMetaHook() that will be called upon the first evaluation of an import.meta expression, capturing [[ImportMeta]] on the Module instance internal slot for every subsequent evaluation of import.meta.

Commentary:

Lazy may or may not pose either a reentrance hazard. The virtual module is vulnerable to faulty arrangement by whomever constructs the Module regardless, but the question is whether there is undue risk if we allow the host to execute arbitrary code in the context of the virtual module.

In particular, of the participants of the SES call, @nicolo-ribaudo and @erights agree that we take as a guiding principle in this design that hosts and virtual hosts should be given equivalent power, so if a host may execute on the stack of import.meta and also may have side-effects as a consequence of evaluating import.meta, it stands to reason that a virtual host should be allowed the same. However, if 262 constrains hosts to have no observable side-effects upon import.meta, virtual hosts mustn’t be able to execute arbitrary code at that time.

Whether the import meta object is constructed at all is not equivalent between these two proposals. There are cases where eager is necessarily over-eager, since it must speculate on whether eval will result in evaluation of import.meta. We doubt that the difference in behavior will be germane to the performance motivations for lazy construction on Node.js. There will be very few cases where a module uses either import.meta or eval, much fewer for both.

The choice between eager and prior construction is a matter of developer ergonomics.

ljharb commented 2 years ago

I'm still very confused why we'd want to expose, pre-evaluation, whether a module references import.meta or not. Why wouldn't we want to deterministically assume every module needs import.meta?

erights commented 2 years ago

Because for most modules it is trivially statically apparent that they do not, and there is benefit to being able to avoid creating it for the vast majority of modules that do not need it.

ljharb commented 2 years ago

Can't it be represented by, instead of an object, a thunk? ie, a function that's called the first time it's needed, creates and returns the object, and cached forever after that?

kriskowal commented 2 years ago

Can't it be represented by, instead of an object, a thunk? ie, a function that's called the first time it's needed, creates and returns the object, and cached forever after that?

How does this differ from Lazy (option 3)?

ljharb commented 2 years ago

ah, i suppose that's the same thing then, hmm :-/

the information i'd prefer to avoid exposing is "whether a module uses import.meta" - otherwise that becomes a part of the module's API, and starting to use it, or no longer using it, becomes a breaking change - where currently, it's not.

erights commented 2 years ago

Lazy has the problem that user code then interleaves at the point of evaluating the import.meta expression, which is at least surprising, whereas eager only interleaves when the module is created, which is a user-code-interleaving point anyway.

As for API fragility, eager makes the dependency only on that static text of the module, whereas lazy makes it depend on what happens dynamically at runtime, which can be data dependent.

ljharb commented 2 years ago

Right - but the static text of the module is, just like the source of a function, not supposed to be reliably observable from without, for the same reasons.

kriskowal commented 2 years ago

I’m sympathetic to both interleaving (Lazy) and breaking change hazards (Prior, Eager). Can I ask @erights and @ljharb to contrive concrete examples?

The interleaving hazard takes the form below, where the source is able to tease importMetaHook into calling on its stack with some bad effect. That might involve throwing from the hook or catching around import.meta in the source, or stack inspection. We may have to choose whether to protect the invariant that import.meta cannot currently throw in any real environment.

const s = new ModuleSource(/* ... */);
const m = new Module(s, { importMetaHook() { /* ... */ } });

The breaking change hazard will always consist of a pair of versions of the same module source, one with and the other without import.meta. I presume a game rule for this kind of hazard is that working code must turn into broken code in the presence of a defect-free virtualized module system. For a “Prior” style module system to be correct, it must provide an importMeta in new Module(source, { importMeta }) if source.needsImportMeta is true, for whatever contractual obligations exist between the source and the host for import.meta (which vary between hosts).

I imagine there’s another class of hazard, where something about this design increases the probability of a defect in a virtualized module system. That’s worth talking about too.

Jamesernator commented 2 years ago

the information i'd prefer to avoid exposing is "whether a module uses import.meta" - otherwise that becomes a part of the module's API, and starting to use it, or no longer using it, becomes a breaking change - where currently, it's not.

Right - but the static text of the module is, just like the source of a function, not supposed to be reliably observable from without, for the same reasons.

How is import.meta really different in this regard than from say an import ... from statement? Previously one could not directly observe imports from a module, however using the same module via Module will ultimately expose those previously hidden things.

Like a module could change it's imports and a loader would need to be prepared to handle that, so why would import.meta be more breaking in comparison?

ljharb commented 2 years ago

That’s a really good point, and a wider potential concern for this proposal.

devsnek commented 2 years ago

From a high level I'd be fine with either a lazy api or passing the object into the constructor. A hook that is just called immediately in the constructor seems a bit pointless/confusing in comparison. I'd also be against trying to be super clever about whether a module contains import.meta, 3rd parties can do their own analysis if they really care about that. I will note that node's vm api uses the lazy hook api already, and while that api is currently marked experimental (no stability guarantee) it would be nice to avoid changes where possible.

nicolo-ribaudo commented 2 years ago

A hook, even if called immediately, allows to properly replicate the behavior of every host. Just passing an object to the constructor doesn't allow, for example, to freeze import.meta or to define its [[Prototype]].