tc39 / proposal-defer-import-eval

A proposal for introducing a way to defer evaluation of a module
https://tc39.es/proposal-defer-import-eval
MIT License
219 stars 12 forks

Deferred keys and weakening early error timing #54

Open guybedford opened 1 week ago

guybedford commented 1 week ago

The current proposal as specified treats namespace keys as known at the time of execution deferral, so that all instantiation errors have already been thrown and all async work has been done. All early errors thus happen at "import defer time", and touching the namespace object, i.e. "namespace evaluation time", only ever gives execution errors.

The PR in https://github.com/tc39/proposal-defer-import-eval/pull/53 is labelled "Hide .then", but it actually does a lot more than this - it makes keys only available after execution.

This late key listing hasn't been explicitly and publicly motivated strongly, and needs to be. As far as I can tell:

  1. There is a desire for keys to be found late because that is the semantics of JS bundling on CommonJS today
  2. There is a hope that late keys might allow defer to also defer named exports validations and instantiation validations

We need to dig into these two points much more clearly, since they are guiding the proposal design right now.

Specifically to delve deeper into (2), this involves a weakening of the design of the spec, where it is potentially entirely a host decision when it gives errors at defer time or at namespace evaluation time, over all of:

Having all of the above errors happen potentially at different times in browsers versus Node.js-like environments seems to me to introduce a little more variation than one would hope for in a standard.

If we are to move in this direction it would help to understand what execution framework it can fit in that isn't just a weakening of existing guarantees.

For this reason I think we need to explicitly call out this change in behaviour of this specification at plenary as a normative change in direction under Stage 2.7.

nicolo-ribaudo commented 1 week ago

Hey, thanks for opening this as a new issue rather than in the PR.

  1. There is a desire for keys to be found late because that is the semantics of JS bundling on CommonJS today

Yes (not just to CommonJS). It's the unfortunate reality that most code used on the web is transpiled and bundled, and that even though 20% of page loads today use native ESM, it's usually still a single ESM file (or, in some cases, a few) emitted by bundlers.

If, at least for the time being (probably until we have module declarations), many production websites are not going to be using the native browser implementation, we need to take tools into account in the feature design: we want the JavaScript that people use to be as close as possible to the JavaScript that the spec defines, and we want people to have to worry about "how JavaScript behaves" and not "how JavaScript behaves in this project with this build process, vs how it behaves in this other project with this other build process".

Concretely, there are two types of tools:

The former would transform the import defer to a "sync import" triggered by the proxy trap, depending on the target platform (for example, require(esm)/require(cjs) in Node.js, or importNow() in XS). For these tools it's simply impossible to know the list of exports of a module ahead of time.
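As a rough illustration of the transform described above, here is a minimal sketch of a Proxy-backed deferred namespace. The names `moduleFactories`, `syncImport` and `deferNamespace` are hypothetical and not taken from any real tool; `syncImport` stands in for whatever synchronous import the target platform offers. Returning `undefined` for symbol keys without loading is a simplification of this sketch, not a statement about the spec.

```javascript
// Hypothetical module table standing in for the file system. Each factory
// plays the role of "load, link and evaluate this module synchronously".
const evaluated = [];
const moduleFactories = {
  "./heavy.js": () => {
    evaluated.push("./heavy.js");
    return { answer: 42, greet: () => "hi" };
  },
};

// Hypothetical stand-in for a platform's synchronous import.
const moduleCache = new Map();
function syncImport(specifier) {
  if (!moduleCache.has(specifier)) {
    const factory = moduleFactories[specifier];
    if (!factory) throw new Error(`Cannot find module ${specifier}`);
    moduleCache.set(specifier, factory());
  }
  return moduleCache.get(specifier);
}

// What a tool might emit for `import defer * as ns from "./heavy.js"`.
// Nothing about the module, including its export names, is known until
// one of the traps below runs.
function deferNamespace(specifier) {
  const load = () => syncImport(specifier);
  return new Proxy(Object.create(null), {
    get(_target, key) {
      if (typeof key === "symbol") return undefined; // simplification
      return load()[key];
    },
    has(_target, key) {
      return key in load();
    },
    ownKeys() {
      return Reflect.ownKeys(load());
    },
    getOwnPropertyDescriptor(_target, key) {
      return Reflect.getOwnPropertyDescriptor(load(), key);
    },
  });
}

const ns = deferNamespace("./heavy.js");
console.log(evaluated.length); // 0: creating the namespace loads nothing
console.log(ns.answer);        // first access triggers the sync import
console.log(Object.keys(ns));  // keys are only discoverable after loading
```

In this sketch a missing module or a bad export name can only surface at the first property access, which is the "late keys" behaviour under discussion.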

Note that these tools usually do not support top-level await, but that has been the case ever since top-level await was released. Importantly, it just throws an explicit error (ideally at build time, with the expectation that you are passing all your files through the tool, but it might happen at runtime if you are only passing some of them): when there is a difference in what can be implemented in different platforms, it's much better when the difference is "it works in one, it errors in the other" rather than "it does X in one and Y in the other".

This was mentioned as half of the motivation for triggering execution on any [[Get]] access in the 2024-04 meeting (search for "tool-friendly"), with the other half being https://github.com/tc39/proposal-defer-import-eval/issues/19. Unfortunately I hadn't noticed at the time how inconsistent just that change was:

For "tools that bundle all the files together", keep reading as it fits in the next category.

  2. There is a hope that late keys might allow defer to also defer named exports validations and instantiation validations [...] where it is potentially entirely a host decision when it gives errors at defer time or at namespace evaluation time

There might have been a misunderstanding here. By exposing less information in the spec we make it easier to optimize/defer more than what the spec requires while keeping this unobservable, so that a host decision (which optimizations to apply) can be made without affecting what JS code can see.

More specifically, platforms that know that their code will not change between executions and that can perform synchronous loading would be able to completely skip loading of deferred graphs that transitively have no syntax errors and no top-level await (i.e. the majority). This is one bit of information to keep track of per module. It is not impossible for these environments to keep around the list of exports, it's just much more metadata (going from one bit per module to a list of strings per module).
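To make the "one bit per module" idea concrete, here is a minimal sketch of how a precompiling platform might derive that bit ahead of time. The `graph` shape and the `canSkipLoading` name are assumptions made for illustration only; they do not come from the proposal or from any real platform.

```javascript
// Hypothetical build-time metadata: for each module, its direct dependencies
// and whether the module itself has a syntax error or uses top-level await.
const graph = new Map([
  ["a.js", { deps: ["b.js", "c.js"], hasSyntaxError: false, hasTLA: false }],
  ["b.js", { deps: ["a.js"], hasSyntaxError: false, hasTLA: false }], // cycle with a.js
  ["c.js", { deps: [], hasSyntaxError: false, hasTLA: false }],
  ["d.js", { deps: ["e.js"], hasSyntaxError: false, hasTLA: false }],
  ["e.js", { deps: [], hasSyntaxError: false, hasTLA: true }],
]);

// Returns the single bit to store for `root`: true only if every module
// reachable from it is free of syntax errors and top-level await.
function canSkipLoading(root) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const id = stack.pop();
    if (seen.has(id)) continue; // already visited (also handles cycles)
    seen.add(id);
    const mod = graph.get(id);
    // Unknown modules are treated conservatively: loading cannot be skipped.
    if (!mod || mod.hasSyntaxError || mod.hasTLA) return false;
    stack.push(...mod.deps);
  }
  return true;
}

console.log(canSkipLoading("a.js")); // true
console.log(canSkipLoading("d.js")); // false, because e.js uses top-level await
```

Keeping the export list available at defer time would instead mean storing an array of strings per module next to this boolean.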

Examples of these environments include JS cloud platforms (where the code does not change between two "deploys"), browsers when fetching code from a local cache, desktop and mobile JS apps, self-contained executables such as Node.js SEA, and maybe even local Node.js development with tools like Yarn PnP that don't let users modify installed packages.

This was presented as a goal in the 2023-07 presentation for Stage 2.

"tools that bundle all the files together" are very close to this category: they cannot actually skip parsing when they put everything in a single file, but:

The misunderstanding probably comes from one of our recent Module Harmony meetings, where I mentioned that Node.js cannot by default apply these "extra optimizations" because people are free to change files between executions, and so Node.js would have to check the whole graph for changes anyway. I suggested that maybe Node.js could have something like a package.json flag or import attribute "trust me this is sync and does not have syntax errors" that, instead of returning a Source Text Module Record, would return a Module type that moves all the loading/linking/evaluation steps of the underlying Source Text Module Record to its evaluation phase (as if it did require(esm) while evaluating). This is not something that needs to happen as part of this proposal, and maybe it doesn't need to happen at all: it's fine if not all platforms can optimize the same way, given that they all have different constraints.


Anyway, I 100% agree that this change needs to be brought up in plenary.

guybedford commented 1 week ago

If the goal is to support precompiled environments deferring evaluation work, it's worth noting many precompiled environments benefit from the opposite - compile-time evaluation work (eg Fastly's JS SDK).

That said, we need to be very clear then that Node.js is not a precompiled environment and absolutely cannot use defer to defer loading or early errors in any form.

If we implement or specify this badly, and Node.js implements something like that in future, then that leads to the concern above that all early errors become non-deterministic.

The benefit of deferring key listing thus in reality only affects JS bundles today, and that makes sense. But the point stands: we must be very, very clear about these constraints.

nicolo-ribaudo commented 1 week ago

I think https://github.com/tc39/proposal-defer-import-eval/pull/53 still clearly does not report linking errors "at evaluation", but we could make it explicit that they are not deferred by adding a note either in .Link() or in EnsureDeferredNamespaceEvaluation().

In the end we cannot prevent platforms from defining their own types of module records (for example, regardless of this proposal they could define a hypothetical Source Text Metadata Module Record that exports a boolean telling whether there are syntax errors, and a function that when called will either throw those syntax errors or evaluate the module). However, we can make it clear that it's not what the import defer semantics prescribe.
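For illustration only, the hypothetical record described above could look roughly like this in userland. `makeMetadataRecord` is an invented name, and `new Function` parses the text as a function body rather than as a module, so this is a loose analogy and not how a host would actually implement a module record.

```javascript
// Sketch of a "metadata" record: parsing happens up front, but any syntax
// error is stored instead of thrown, and only surfaces when evaluate() runs.
function makeMetadataRecord(sourceText) {
  let syntaxError = null;
  let run = null;
  try {
    // Assumption: function-body parsing stands in for module parsing here.
    run = new Function(sourceText);
  } catch (err) {
    if (!(err instanceof SyntaxError)) throw err;
    syntaxError = err;
  }
  return {
    hasSyntaxErrors: syntaxError !== null,
    evaluate() {
      if (syntaxError !== null) throw syntaxError;
      return run();
    },
  };
}

const good = makeMetadataRecord("return 1 + 1;");
const bad = makeMetadataRecord("const x = ;");

console.log(good.hasSyntaxErrors, good.evaluate());
console.log(bad.hasSyntaxErrors); // the error itself is only thrown by bad.evaluate()
```

A record like this moves an early error to evaluation time, which is the kind of platform-defined behaviour the paragraph above says the import defer semantics do not prescribe.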

it's worth noting many precompiled environments benefit from the opposite - compile-time evaluation work (eg Fastly's JS SDK).

Last time we talked about this I remember that the conclusion was that for platforms like Fastly's SDK the best approach is to:

This keeps the same ordering that import defer currently has in other platforms, while still evaluating everything so that you can cache the partially evaluated state.

This is not affected by any of the proposed changes, right?

guybedford commented 6 days ago

The concern here is about the deferred keys specifically, because in all scenarios for this spec, the key list should be known and early errors should have been thrown.

By making the keys deferred, we're effectively hinting that named exports early errors may be possible to skip. And that's what's being implemented in the polyfill.

But for the standard implementations we very much need to ensure this is not the case.