nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
106.66k stars 29.09k forks source link

Add npm: protocol for imports from node_modules #44492

Closed clshortfuse closed 1 year ago

clshortfuse commented 2 years ago

What is the problem this feature will solve?

Allowing packages to be imported with a valid URL scheme will allow NodeJS to stay within spec of imports.

Import statements does not have explicit indication of actual source of package and must be parsed and environment assumed. This applies to bundlers and runtime as well.

Already with the node: protocol code can coexist on browser and NodeJS environments by analyzing the protocol:

let packageUrl = new URL('node:stream/consumers')
if (packageUrl.protocol === 'node') { ...

The only way to perform this on browsers is to literally parse the URL and see if it's a relative. With the introduction of http and https protocol, this just gets harder.

What is the feature you are proposing to solve the problem?

Add npm: protocol for imports from node_modules

This allows easier managing of multiple environments. Bundlers can strip and rename npm: imports as they already do, but it will no longer a mandatory part of deployments. Async imports can easily be redirected (eg: npm: => https://).

It would also open the door to somebody contributing a protocol handler like deno:pkg on NodeJS (and vice versa: npm:pkg on Deno). It also makes npm: just another protocol which may (or may not) be easier to separate from NodeJS itself. That means another package manager can arise in the future that can coexist (or even replace) npm.

What alternatives have you considered?

I've been considering this for a few months, and I haven't really found much alternative. I did go down a rabbit hole of custom handlers. npm: can just be one of the protocols that node registers by default. Some experimentation shows that if we do the route of custom handlers, npm: and node: can be registered before user code (like how browsers "register" http://). That could open the door to 3rd party custom protocols within the node environment. That means instead of npm being the one to manage workspaces, imports can be clearer still as import webserver from 'company:webserver'.

I don't know how browsers treat custom-handlers, but I honestly didn't experiment much beyond await import('web+coffee://index.js')

aduh95 commented 1 year ago

I think this proposal just need someone to open a PR to implement it, I don't see why it would be rejected.

targos commented 1 year ago

I think we might be able to get consensus on supporting a new protocol, but I expect pushback on npm: because npm isn't directly involved in it. It's just one of the package managers that generates the node_modules structure.

aduh95 commented 1 year ago

I think npm refers to the npm registry, not npm the package manager.

targos commented 1 year ago

If that's what it refers to, then I would be against it. Node.js doesn't know whether the specifier refers to something that's in the npm registry, some other registry, a virtual file system, or anything else.

clshortfuse commented 1 year ago

I originally meant npm to be node_modules and, yeah, use all the module resolution that works with it.

But I hadn't considered it being looser, allowing npm to dictate where it pulls the file from (eg: actually performs a fetch against a registry).

In reality, I would just like import to align better with URL (as per spec). Webpack has to infer node_modules and I have to coax it to work with my platform agnostic packages. I do this with node internals doing await import(new URL('node:fs')). Webpack won't attempt to bundle. But that doesn't work for node_modules because new URL('fastify') is invalid.

How environments (deno/node/Chrome) treat the prefix (protocol) is technically not important. What matter is there is one. Node can do one thing, and deno another. Of course, from a standards position, I'd hope they'd do the same, but again, not necessary.

Just spit balling, if npm technically refers to the package manager, then nm: can be node_modules. And along those lines, if npm: were to refer to the registry (with or without local caching), then somebody could technically create github: for that respective registry. And I can also, maybe, use company: for my internal packages. Really puts into consideration how much we'd really rely on npm workspaces. I lean more on allowing imports to be more clearly specified in the JS via URL, rather than let external processes handle it behind the scenes "automagically".

Edit: Though... having my server deployments not tied to the file system does sound interesting (with npm:). But I guess https:// does exist as well.

bnoordhuis commented 1 year ago

Downloading code at runtime is problematic for production deployments. It's certain some users will want a flag to disable it and then the question is: what should that npm:express import do?

aduh95 commented 1 year ago

As https://github.com/nodejs/node/issues/44492#issuecomment-1311697720 pointed out, npm: is not a great name if it actually means "package located in a local node_modules folder", and if the goal is to align with Deno by introducing something that uses the same syntax as Deno but does a completely different thing, arguably we are not improving compat.

silverwind commented 1 year ago

Downloading code at runtime is problematic for production deployments. It's certain some users will want a flag to disable it and then the question is: what should that npm:express import do?

In a production environment with deno, you would either bundle (reduce app to a single file) or vendor (download dependencies into local folder) to eliminate runtime downloads. I agree that a flag would be needed in node to prevent runtime downloads.

I would recommend not using npm: to route to node_modules because as I see it, the modern way to handle dependencies is with a immutable global cache where no npm or node_modules is involved.

bnoordhuis commented 1 year ago

I believe the consensus here is "no." I'll go ahead and close this out.

clshortfuse commented 1 year ago

@bnoordhuis I don't think that's the consensus, but there is some confusion between what I originally thought up and what deno ended up creating.

I think we might be able to get consensus on supporting a new protocol, but I expect pushback on npm: because npm isn't directly involved in it. It's just one of the package managers that generates the node_modules structure.

It's probably best to close and submit a new issue without the ambiguity. What I want would be nm: which just targets /node_modules/*, like node: targets internal packages. It actually has no relation to the npm manager or the online registry at npmjs.com

panva commented 1 year ago

Nor Deno, or Bun want to maintain a node_modules structure, so the interoperability goal of nm:, same as npm:, and how it could be achieved, is eluding me.

silverwind commented 1 year ago

I do see a benefit of node recognizing and stripping a npm: prefix from imports just so that deno scripts can run unmodified in node.

bnoordhuis commented 1 year ago

I think it should be possible to implement that through a custom loader with --experimental-loader?

My gut feeling is that deno <-> node cross-compat is currently fairly niche. Rather than making rash decisions we may end up regretting later, it'd be better to punt on what npm: should mean for a while.

silverwind commented 1 year ago

True. There is also bun which currently supports npm modules without the prefix, so between the three engines, deno is the outlier with the prefix.

clshortfuse commented 1 year ago

My gut feeling is that deno <-> node cross-compat is currently fairly niche. Rather than making rash decisions we may end up regretting later, it'd be better to punt on what npm: should mean for a while.

To be honest I have no interest in deno. I feel the conversation kinda derailed (understandably) because deno has a working protocol prefix for npm.

My use case is building platform-agnostic code, that can be run on the client or server environments. I'm working on Server-Side-Rendering now and it's getting a lot more interesting. I think the exact use-case before using Webpack trying to bundle import 'module', that I wrapped behind a top level await with new URL:

+ try {
-   optionalModule = await import('module');
+   optionalModule = await import(new URL('module'));
+ } catch {
+   console.warn('module not available');
+ }

For example, I could optionally import Blob based on environment (Browser, Node 12, Node 14, Node 16, etc):

let BlobClass = (typeof Blob === 'undefined' ? undefined : Blob);
// later
if (BlobClass === undefined) {
  try {
    const module = await import(new URL('node:buffer'));
    if ('Blob' in module === false) throw new Error('NOT_SUPPORTED');
    BlobClass = module.Blob;
  } catch {
    BlobClass = null;
  }
}

The problem is, new URL('module') will always through an error because there's no prefix, leading me to believe there should be a prefix, even if 'nm:module'. It actually turns out that I was mistaken about how import() works. It just coerces to String.

It's Node's specific to break things up as: Relative, Bare, and Absolute specifiers (doc link). It's "coincidence" that Relative and Absolute can be parsed as URLs.

Bare specifier resolutions are handled by the Node.js module resolution algorithm. All other specifier resolutions are always only resolved with the standard relative URL resolution semantics.

I wanted the inverse: Bare for node_modules cannot be expressed as a URL in any way whatsoever. I see now I could have probably worked around that by not using URL and some other of obfuscation so Webpack didn't try to pack it. Still, would be nice to stay consistent.

jimmywarting commented 1 year ago

would also wish for a optional npm: prefix to be supported. (even if it didn't do anything special) it would make it much more easier to resolve dependencies using URL constructor... and doing switch statement.

new URL('npm:express', 'https://github.com')

just using new URL('express') throws an error...

just using import('express') dose not feel so web-ish... trying it out in browser just screams out errors:

import(URL.createObjectURL(new Blob(['import x from "express"'], {type:'text/javascript'})))
// Chrome/FF: TypeError: Failed to resolve module specifier "express". Relative references must start with either "/", "./", or "../".
// Safari: TypeError: Module name, 'express' does not resolve to a valid URL.

I would wish for eg: require.resolve and import.meta.resolve to be able to work with npm:

// this currently works ok (with a flag: --experimental-import-meta-resolve), 
// but i honestly would kind of expect it to throw a TypError instead
// just how browser throws an error when using `import.meta.resolve`
// TypeError: Failed to execute 'resolve' on 'import.meta': 
//    Failed to resolve module specifier ewf: 
//    Relative references must start with either "/", "./", or "../"
import.meta.resolve('esbuild') 
// file:///Users/<project>/node_modules/esbuild/lib/main.js
import.meta.resolve('npm:esbuild')
// file:///Users/<project>/node_modules/esbuild/lib/main.js
require.resolve('esbuild')
// file:///Users/<project>/node_modules/esbuild/lib/main.js
require.resolve('esm:esbuild')
// file:///Users/<project>/node_modules/esbuild/lib/main.js

I think it would just be better if npm:esbuild was used instead... i think it could simplify things for some ppl. and it would also make import.meta.resolve more cross env friendlier. where as import.meta.resolve(name) would only work in NodeJS.

aduh95 commented 1 year ago

just using import('express') dose not feel so web-ish... trying it out in browser just screams out errors

That’s only if you don’t have an import map that allows the browser to resolve express to a valid URL. Using npm:express would not improve the situation one bit for web compat.