Open timfish opened 1 month ago
Thanks for the issue, if you want to submit a patch, feel free to do so, if not, I'm sure a member of the team will be happy to assist
@nodejs/loaders correct me if I'm wrong, but wouldn't this behavior also cause errors in the experimental HTTP import system?
This seems like expected behavior to me. It can't create a URL with that parent. It can't determine that the module doesn't exist if it can't first figure out what URL to try to retrieve.
Ack, didn't mean to add that label, I was looking to see if we had a "bug" label and I must've misclicked it
Node is already correctly determining that the file does or does not exist. As described above in the original issue, the error we're encountering actually occurs when Node tries to construct the ERR_MODULE_NOT_FOUND
error to throw.
I agree that the situation that import-in-the-middle
gets us into (parentURL === 'node:util'
) is a weird one, but Node is so close to handling it in a sensible way. For instance, if fileURLToPath(base)
errored out while constructing the ERR_MODULE_NOT_FOUND
error, it could instead use base
verbatim.
This seems like expected behavior to me. It can't create a URL with that parent. It can't determine that the module doesn't exist if it can't first figure out what URL to try to retrieve.
Even so, IMO a more descriptive error may be helpful, rather than the one thrown by the conversion utility
node:util is not a valid <...>
?
It can't determine that the module doesn't exist if it can't first figure out what URL to try to retrieve.
It has already determined that the module doesn't exist, it's just trying to construct the error message when it throws.
I can't see how this is expected behaviour. the code assumes base
is a file URL and it is not.
If parentURL
can only be a file URI and never a node:
URI (or any other URI!) this should be enforced earlier in resolve
.
Just be aware that enforcing this in resolve will break import-in-the-middle
and everything that relies on it!
If
parentURL
can only be a file URI and never anode:
URI (or any other URI!) this should be enforced earlier inresolve
.
And if we did, what error would be thrown? Probably ERR_INVALID_URL_SCHEME
, I’d think. In which case, why does it matter whether it’s thrown here or earlier?
Throwing earlier would ensure consistent behavior. Right now, everything happily works with a node:*
parentURL
provided that the file being resolved exists. I want Node to either always use the normal resolving behavior with parentURL = 'node:*'
, or always error out quickly and obviously.
Is node:
even an invalid scheme for parentURL
?
In which case, why does it matter whether it’s thrown here or earlier?
The current error doesn't even tell you which parameter was the wrong scheme. It's totally useless in terms of helping you fix the issue. Because there's no way to debug loader hooks, its difficult to track down.
There's a huge difference in terms of DX between an unintentional exception and one designed to guide users of an API.
The entire node codebase is full of places where guiding users in this way has been prioritised over just letting random exceptions happen anywhere.
Right now, everything happily works with a
node:*
parentURL
provided that the file being resolved exists.
What do you mean? Resolving a specifier where the URL exists, regardless of parentURL
, works even with a node:
-scheme parentURL
?
The current error doesn’t even tell you which parameter was the wrong scheme. It’s totally useless in terms of helping you fix the issue.
Sure, this could be improved by having the error message specifically refer to parentURL
. Many functions have validation steps, like validateString
calls; we could validate that parentURL
is of file:
scheme right off the bat at the top of the function so that it errors early (and informatively, and consistently). I think an error specific to the URL scheme is more correct than an ERR_MODULE_NOT_FOUND
, though. The latter sounds like something was just missing from disk, whereas really we couldn’t even look for something on disk because we couldn’t parse the URL.
Because there’s no way to debug loader hooks, it makes it even more obscure.
It is definitely challenging to debug code that’s not on the main thread, but it’s not impossible. fs.writeFileSync(1, 'something')
is equivalent to console.log('something')
.
What do you mean? Resolving a specifier where the URL exists, regardless of parentURL, works even with a node:-scheme parentURL?
This is exactly what I mean, yes. You can try it for yourself with the example in the original issue; replace file:///dost-not-exist.mjs
with an absolute file URL pointing to a file that does exist. For instance, file:///Users/nathan/git/register-hook-playground/index.mjs
is a file that exists on my machine, and running the reproduction example does not error with that path as the specifier
. Even a slight modification to make the path refer to a file that does not exist, for instance file:///Users/nathan/git/register-hook-playground/indexx.mjs
, produces an error.
The latter sounds like something was just missing from disk, whereas really we couldn’t even look for something on disk because we couldn’t parse the URL.
I think there's some confusion here. Again, the error that is currently thrown is not thrown because Node couldn't "look for something on disk". Node does look for the file, determines it does not exist, and is in the process of constructing an ERR_MODULE_NOT_FOUND
error when it tries to parse the base URL for inclusion in the error, and that is what fails. The link to the Node source in the issue shows where this is happening:
Throwing earlier would ensure consistent behavior
Throwing in resolve
when parentURL
is a Node URI would fundamentally break import-in-the-middle
which has 4.5m weekly downloads and is relied on heavily in the ecosystem.
To fix this issue, it simply needs to not attempt converting base
to a file path if it's not a file path when constructing the error message. It's a one line fix and I can submit a PR.
Throwing in
resolve
whenparentURL
is a Node URI would fundamentally breakimport-in-the-middle
which has 4.5m weekly downloads and is relied on heavily in the ecosystem.
This isn’t really relevant. If we have a bug, and/or if they have a bug, then it needs to be fixed. The question is what the intended behavior of parentURL
is. If our documentation says that it’s only used when the specifier is relative, and ignored otherwise, then the current behavior is fine and your one-line fix is fine. But if there’s somewhere where we document that parentURL
needs to fulfill certain criteria, like it needs to be a file:
URL and not some other type of URL or something completely different like an object, then we have a bug in that our actual behavior doesn’t match our intended behavior. If import-in-the-middle
relies on this bug, well, then fixing the bug would be handled carefully, but it would still need to be fixed. But I think the former is more likely and a simple fix would be fine.
The documentation on https://nodejs.org/api/module.html#resolvespecifier-context-nextresolve doesn't say too much about parentURL
:
The module importing this one, or undefined if this is the Node.js entry point
I could interpret this to mean that node:
module specifiers should be allowed.
@timfish my hope is that you/we could pursue a fix to import-in-the-middle
in parallel with this. My guess is that a whole lot of people will hit this very soon with your new version of the Sentry SDK, and I can't imagine that waiting for the Node fix to be released and waiting for all Sentry users to jump to the latest version of Node will work well.
Throwing in resolve when parentURL is a Node URI would fundamentally break import-in-the-middle which has 4.5m weekly downloads
This isn’t really relevant. If we have a bug, and/or if they have a bug, then it needs to be fixed. The question is what the intended behavior of parentURL is.
It's highly relevant.
@wesleytodd recently mentioned on Twitter that over the years Node has gone out of their way not to break the ecosystem where possible.
The documentation is ambiguous at best. Node has accepted these parentURL
s for years through numerous LTS versions and changing this would be considered a breaking change by many.
After this amount of time the "intended behaviour" is irrelevant and the usage in the ecosystem trumps that. It's far less painful to update the docs to reflect the current usage rather than break longstanding packages because the intended behaviour wasn't documented fully in the first place.
@wesleytodd recently mentioned on Twitter that over the years Node has gone out of their way not to break the ecosystem where possible.
I am not completely sure which topic is referenced here (probably a sign I need to stop checking twitter so much), but I think generally this references how CITGM is in place to help catch ecosystem breaking changes. That said, I am not sure the same goals apply between changing http
in a way that breaks express
and this which is still in an experimental status.
I don't know enough to make a call, and don't have the time availability to come up to speed on this at the moment. Sorry I cannot be more help.
I opened a PR to fix this in import-in-the-middle
: https://github.com/DataDog/import-in-the-middle/pull/76. While I'm still glad that Node is willing to consider a behavior change here, this will hopefully help folks in the meantime.
Version
v22.1.0
Platform
All
Subsystem
module
What steps will reproduce the bug?
This was triggered by a combination of
import-in-the-middle
andtsx
, but below is a minimal reproduction without either of those.import-in-the-middle
setsparentURL
to node build-ins (node:*
).tsx
tries to resolve files that do not existThe
hook.mjs
index.mjs
Which gives the following error:
This is caused by this code where
base
isnode:util
: https://github.com/nodejs/node/blob/87b87a8f6017545e2c9d6048d7b081f73c8a1072/lib/internal/modules/esm/resolve.js#L264-L265How often does it reproduce? Is there a required condition?
Every time
What is the expected behavior? Why is that the expected behavior?
This should throw a valid
ERR_MODULE_NOT_FOUND
errorWhat do you see instead?
It's throwing while creating the message
Additional information
This was made incredibly difficult to debug because you can't debug loader hooks!
Credit to @nwalters512 for the back and forth debugging with me!