Open aduh95 opened 2 years ago
Is this table for the current https://github.com/nodejs/node/pull/40250 PR, or a potential future where the assertions are optional?
I don’t think we should be doing much, if anything, between resolve
and load
. As a loader author, I would expect the return value of resolve
to be passed into load
without much happening in between.
I’m going to take a stab at a thinking-out-load proposal; feel free to point out its flaws or suggest a better alternative. Take as an example a YAML loader, to make possible code like this:
import data from './data.yaml' assert { type: 'yaml' }
For this case, the YAML loader’s load
hook would be where we assert that the data is actually YAML before converting it to JSON and returning { source: convertedData, format: 'json' }
, and then the json
translator would take it from there. Node would not do any validation of the assertion; the validation would have to happen in defaultLoad
, which in the case of a YAML loader would not have been called. In order for such a user loader to not be unnecessarily duplicative of Node’s internals, we would also need to make progress on https://github.com/nodejs/loaders/issues/26#issuecomment-918413522, making available some utility functions for some of the logic that Node does within defaultResolve
and defaultLoad
, so that this user loader could defer to existing Node logic for loading a file from disk but without needing to also do any assertion validation that defaultLoad
would do.
So anyway, in this model nothing happens before resolve
or between resolve
and load
, or after load
. The validation is done in defaultLoad
, if a user loader calls it or if no user loaders are present.
I think loader authors need the assertions info in both hooks, because resolve
might want to resolve to a different URL based on the assertion, and in load
a loader author might want to do a custom validation, which they would need the assertion for (such as this YAML example). If there was some way for authors to pass arbitrary data from resolve
into load
, that would alleviate the need for us to orchestrate this.
The module couldn’t be added to the cache until after load
returns with the module’s source, and since the validation would happen during load
we would never have cached modules that would have failed the validation check. This would mean, I think, that we should never have race conditions.
Also @aduh95 you’re always welcome to join the loaders meetings, you don’t need an invitation. Subscribe to this repo and you should see an issue created a few days before each meeting with the meeting details.
The language spec actively prohibits a different module to be loaded based on the assertion - in other words, resolve
shouldn't even get the assertion information. The assertion only determines if the module will load, or will fail.
Is this table for the current nodejs/node#40250 PR, or a potential future where the assertions are optional?
It could be either, considering that no one is supporting implementation nodejs/node#40250, I was thinking about re-using it.
The module couldn’t be added to the cache until after
load
returns with the module’s source, and since the validation would happen duringload
we would never have cached modules that would have failed the validation check. This would mean, I think, that we should never have race conditions.
Maybe we are talking about a different cache: if a user imports twice the same specifier, and resolve()
returns the same url
for both, I don't think we want to call load()
twice, I'd say we want use the same module job – even though load()
hasn't returned the final source yet.
The language spec actively prohibits a different module to be loaded based on the assertion - in other words,
resolve
shouldn't even get the assertion information. The assertion only determines if the module will load, or will fail.
To be precise, the spec proposal says moduleRequest.[[Assertions]] must not influence the interpretation of the module or the module specifier; instead, it may be used to determine whether the algorithm completes normally or with an abrupt completion.
The question is if we can let user provided loaders violate the spec if they dare. I've added a line to the table for scenario on which resolve()
doesn't get the import assertion.
What about adding a new loader hook, which would only return whether the assertion passes or not?
@GeoffreyBooth
About:
import data from './data.yaml' assert { type: 'yaml' }
From my point of view, this example is invalid. I think that assertions should be about what the specifier is translated to, not what it's being translated from.
If the code is run with a YAML loader that translates YAML to JSON, then the assertion in the code should be { type: 'json' }
, and the loader doesn't need the assertion to behave correctly.
What about adding a new loader hook, which would only return whether the assertion passes or not?
That's definitely worth considering. Also maybe something that would correspond to the HostGetSupportedImportAssertions
abstract operation in the spec (I could imagine load()
hook returning this list).
From my point of view, this example is invalid. I think that assertions should be about what the specifier is translated to, not what it's being translated from.
I think the idea is that if a TC39 proposal for YAML modules comes up, someone could use a loader hook to emulate support for it in Node.js.
What about adding a new loader hook, which would only return whether the assertion passes or not?
We can’t really add more loader hooks because of chaining. We just went through a monthslong redesign to collapse four hooks to two to make them chainable. However, making utility functions that split out these smaller pieces that are within the larger hooks would be doable, and should provide many of the same benefits. So if you’re writing a loader that only wants to override the assertion validation, for example to make it always pass, you could create a load
hook that pulls in the utility functions for ‘load this file from disk’ and ‘determine the format of this file’ and then write your own code (or lack thereof) for the validation.
I think the idea is that if a TC39 proposal for YAML modules comes up, someone could use a loader hook to emulate support for it in Node.js.
Yeah. For example, according to https://web.dev/css-module-scripts/, Chrome already supports this:
import sheet from './styles.css' assert { type: 'css' }
I don’t know what a loader would convert this to; ESM JavaScript that is just export default 'the css contents as a string'
? So I thought YAML would be a better example. But the YAML example could also be converted by our loader into export default {"...
where the export is the JSON dropped into JavaScript, and so then the loader returns { source: jsString, format: 'module' }
. In that case, if the assertion needs to match the return value from the loader then the original import would have needed to lack an assertion, because the loader is outputting JavaScript rather than JSON. This feels wrong to me, as the user shouldn’t need to change the original source code based on which custom loader they’re using.
Importing YAML directly into a node.js program seems like it could be pretty useful. How far away are we from such a reality?
Importing YAML directly into a node.js program seems like it could be pretty useful. How far away are we from such a reality?
It’s unlikely to be ever supported natively, unless spec committees add support for it like they have for JSON; but you can write a loader to add custom support for it today.
you can write a loader to add custom support for it today
What is the best resource to get me started if I wanted to write something like that? I found this article online but I'm wondering if there is a more proper document for me to refer to when trying to write a loader?
I'm mostly curious about this for my own edification. I doubt I'd be able to write anything official, but it could be helpful internally or for a side project.
Thanks for your quick feedback and your time!
UPDATE: official documentation is probably where I should start...
UPDATE: official documentation is probably where I should start…
Yes, the official docs should be all you need. The “transpiler loader” example is the pattern you’d want to follow; you’re essentially transpiling YAML into JSON.
Here's a draft of what solution I can think of regarding relations between loaders and import assertions.
Relation of
ESMLoader
class with import assertions:resolve()
resolve()
resolve()
andload()
load()
load()
importAssertions.type
value is supportedimportAssertions.type
to check for race conditionsfinalFormat
is compatible withimportAssertions.type
importAssertions
toresolve()
importAssertions.type
value is supported, and use it to check for race conditionsfinalFormat
is compatible withimportAssertions.type
importAssertions
toresolve()
finalFormat
is compatible withimportAssertions.type
importAssertions
toresolve()
importAssertions
toload()
importAssertions.type
value is supported and compatible withfinalFormat
importAssertions
toresolve()
importAssertions
toload()
and letload()
do the checksimportAssertions.type
to check for race conditionsimportAssertions
toload()
and letload()
do the checksimportAssertions
toload()
and letload()
do the checks[^1]: if the resolver returns an
url
that's already in the cache and the returnedimportAssertion.type
has a value different from the one in the cache, throw aTypeError
.Unflag import assertions
PR, drawbacks of this approach have been discussed.resolve
hook. Not sure this is actually doable and/or desirable.importAssertions
ofresolve()
toload()
context
. This adds complexity, not sure it's worth it.resolve()
.My (new) favourite is
2.
: it seems to me like the perfect balance between simplicity and letting users do powerful things (e.g.: it would be quite easy to implement a loader that supports assertionless imports). I have a PR almost ready implementing this.Anyway, we can discuss that on next week meeting since Goeffrey invited me, but happy to hear thoughts before that if people want to share them.