npm / rfcs

Public change requests/proposals & ideation
Other
730 stars 240 forks source link

[RRFC] Validate ESM before npm publish #665

Open brillout opened 1 year ago

brillout commented 1 year ago

Motivation ("The Why")

Many npm packages contain invalid ESM, which makes Node.js choke.

The situation is quite bad, as described in https://github.com/nodejs/node/issues/46074.

Example

For example, the npm package @aws-amplify/ui-react contains invalid ESM, see https://github.com/aws-amplify/amplify-ui/issues/3155.

How

Current Behaviour

Npm packages can be published while containing invalid ESM.

Desired Behaviour

Validate npm packages before they are published: npm should reject any npm publish that publishes invalid ESM.

References

For reference, https://publint.dev is a project that validates whether an npm package is published correctly.

ljharb commented 1 year ago

npm just wraps up a tarball and publishes it - it doesn't validate CJS packages, why would it validate ESM ones? What about packages that support multiple node versions - how would npm validate "not the current node version"?

brillout commented 1 year ago

@ljharb The bottom line is that a lot of npm packages are being published with invalid ESM that make Node.js choke. The question here is: what can we do about it? As discussed in https://github.com/nodejs/node/issues/46074, we don't want to make Node.js permissive if we don't strictly need to. So that leaves us with the only option of making npm strict about the code it hosts. Or maybe you're seeing another solution I'm not seeing?

The situation is untenable and we really need a solution. (I'm the author of vite-plugin-ssr and the, by far, number 1 issue users are reporting is Node.js chocking on npm packages having invalid ESM.)

What about packages that support multiple node versions - how would npm validate "not the current node version"?

The validation doesn't have to cover everything. Only validating basics things like the following would make a big impact already:

a widespread problem is npm packages having invalid ESM imports, e.g. import './foo' instead of import './foo.js', which makes Node.js choke. Another example is packages that ship ESM but don't have type: "module" in their package.json. Node.js should be able to execute npm packages that ship invalid code.

ljharb commented 1 year ago

npm can't and shouldn't be strict about what it hosts, and i very much agree node can't and shouldn't be permissive - so the only solution is education. Very few packages even need to be ESM (or should be!) and that's part of education too.

brillout commented 1 year ago

I understand npm should be agnostic. But I'm thinking the situation is so bad that it warrants an exception.

As much as I'd like education to be the solution, I'm afraid it doesn't cut it. I already have documentation about invalid ESM on https://vite-plugin-ssr.com but it doesn't make much of a difference. I'm afraid nothing will happen if we close this ticket. On the other hand, if npm validates ESM, then that would completely resolve the situation. That'd be a godsend for a lot of users. Also a godsend for library maintainers (e.g. I've already reached the Apollo GraphQL team 3 times because they repeatedly shipped invalid ESM code).

To quote a user hitting a Node.js issue because an npm package has invalid ESM:

Now there is some cryptic node bug that showed up out of nowhere JS libraries are a special kind of punishment

The situation is painful.

I'm thinking that the (only?) way forward here is to make npm validate ESM.

ljharb commented 1 year ago

Based on the discussion in that node thread, it seems like the reason it's painful it's vite's decision to "draw a line in the sand". It doesn't seem problematic with other bundlers - am i missing something?

brillout commented 1 year ago

Vite transpiles only what is strictly necessary (that's why it's so much faster). This means that with Vite, Node.js directly executes code (whereas Next.js transpiles the code before Node.js executes it). That's why the situation is worse in the Vite ecosystem.

It also affects folks equally badly who directly use Node.js (without a bundler like Next.js/webpack).

ljharb commented 1 year ago

right, but the pain you're describing seems like it is the very necessity that needs more transpilation.

bnb commented 1 year ago

-1 on having npm check this. This should be tooling built by the ecosystem, not something Node.js nor npm maintain. Also, having npm do this would not actually solve the problem - there are other package managers šŸ˜…

brillout commented 1 year ago

One of Vite's foundational innovation is "lazy-transpiling": only transpile what is strictly necessary. Transpiling all npm packages only because some contain invalid ESM is a massive performance degradation. Vite values performance a lot and this isn't going to change.

I think everyone can agree that we should foster valid ESM, and that it would be a good thing if npm hosts ESM that is valid. (It doens't have to be 100% valid: a few basic checks can get us a long way.)

It's also nice for folks who don't use bundlers (e.g. Node.js servers without frontend) to be able to directly use npm packages without the need for a transpiler.

Making npm validate ESM is a very effective way of achieving this.

I understand the general principle of keeping npm agnostic but I strongly believe we should do an exception as I think the pros outweigth the cons, by far.

The most important here: it's a net win for everyone, starting from the users who suffer from this. Also including library authors who will be relieved to know that they shipped valid ESM. (I can imagine that the Apollo GraphQL would love this.)

ljharb commented 1 year ago

That's fine - but a package using the "module" field, and not the "exports" field, is one that it is necessary to transpile. Similarly, if a package is published with syntax that isn't supported by the target environments, the only choices are to transpile it, or, not use that package - "use it as-is" isn't a viable choice.

ruyadorno commented 1 year ago

@brillout a good place to start might be having package authors opting into some kind of validation (like publint that you mentioned), e.g:

$ npm pkg set "scripts.prepublishOnly=npx publint"

This might be a simple patch to submit to a few projects/teams (like in your example with the Apollo GraphQL team) to try and see if you can get some meaningful improvement to the ecosystem prior to having it builtin to package managers.

brillout commented 1 year ago

The way I see it is this: it's either pain for (Vite / Node.js server) users, or npm validates ESM.

I'd love an alternative as much as you do, but I don't see any.

The unfortunate truth is that users are suffering, and I wonder what we can we do about it.

brillout commented 1 year ago

@ruyadorno Sounds nice, although it's virtually impossible to educate the entire ecosystem to do this. That's why I believe in npm validating ESM, and so by default.

benmccann commented 1 year ago

I don't get the sentiment that it's outside the scope of the Node Package Manger to check that packages it publishes are compatible with Node. Potentially you could say it should be the registry responsible for this rather than the CLI, but the idea that we're okay with a majority of packages being broken and it's the consumers job' to fix those packages by running them through bundlers instead of running them on Node is a very anti-Node position for the Node Package Manager to be taking.

I think a lot of the confusion here may be that the issue title refers to ESM, but this really isn't just an ESM issue. You can write invalid CJS packages as well. And we see that happen constantly. E.g. "type": "module" and CJS code with .js extension - the CJS code won't run on Node. Node packaging has gotten more complicated since the introduction of exports, .cjs/.mjs, etc. and we see packages published to npm that won't run on Node on a non-stop basis.

anonrig commented 1 year ago

I strongly agree that the goal of keeping npm agnostic has given the ability for developers to ship invalid packages to the registry and poisoning the transition from CJS to ESM.

Iā€™m sad to see that an OSS maintainer had to open an issue to the most used js runtime, and later to npm (owned by github, owned by ms) to solve a validation problem caused by the initial goal of ā€œagnosticā€ path npm is taken. This is clearly a validation problem.

As a Node core collaborator, and performance enthusiast, from where Iā€™m sitting (after reading all of the responses), I strongly believe that we should do better. Since Node does not have the capacity to validate npm packages before publishing, I believe it should definitely be done in the package manager side.

ljharb commented 1 year ago

@benmccann it's not "the node package manager" - it's closer to "node-packaged modules". Many packages don't work with node - they're browser-only, or they're just wrapping files in a tarball.

People publish packages with all kinds of mistakes in them - they forget to run the build process, or they misconfigure exports, or they publish CJS in a way that node thinks is ESM or vice versa, etc. There's tons of failure modes, and the solution to all of them is - and has been - education.

Creating friction in npm for package authors (by way of necessarily opinionated validation) as a crutch to avoid them having to learn how to properly configure a package doesn't seem like a helpful approach to me.

I completely agree that this isn't just an ESM issue, and framing it as such isn't helpful - but nobody's needed such validation for the last decade. Why now?

dominikg commented 1 year ago

It's pretty clear to me that validation was needed before. The added complexity with esm/cjs just made it more obvious.

Look at the amount of misconfigured packages on the registry and all the bug reports, frustration and friction they are causing. I've spent days on this. It's taking a toll on open source projects large and small because of interoperability problems that should be a thing of the past. And it puts the npm registry into a bad light too. If a user encounters an issue with one package, they likely attribute it to that. Multiple apples that turn out to be spoilt, maybe the shop owner is to blame?

Both the registry and package managers can statically analyze packages and warn/reject them if they contain erroneous configuration. This isn't opinionated validation, it is providing help and education for package authors and fosters a safe ecosystem.

This is front and center on npmjs.com (emphasis mine)

Relied upon by more than 11 million developers worldwide, npm is committed to making JavaScript development elegant, productive, and safe. The free npm Registry has become the center of JavaScript code sharing, and with more than one million packages, the largest software registry in the world. Our other tools and services take the Registry, and the work you do around it, to the next level.

Many other large software registries have validation processes to ensure a quality level for their content. What alternative ways do you suggest to help educate package authors and reduce the amount of clutter?

benmccann commented 1 year ago

@ljharb thanks for taking the time to respond. Educating users is a fine approach as long as we're being proactive about it and ensuring they discover the necessary info at scale. Right now, it's not happening as it's depending on those users to be very proactive in educating themselves. There's a number of ways we could better educate users. E.g. we could put a UI feature in the registry which indicates whether a package is compatible with Node.js. Or we could print a single line in the CLI when publishing a package saying that the package will not be consumable by Node.js, but only via bundlers. I think there are lightweight approaches that we can take without being overly opinionated or creating friction for package authors.

I work on SvelteKit and we get tons of bugs saying a given package is broken with SvelteKit when it's really not related to SvelteKit at all, but just Node.js. I've spent an inordinate amount of time educating users that their packages are not following the Node.js spec, written an FAQ on our site, fixed tons of widely-used packages in the ecosystem, but I'll just never reach the long tail of users. We keep seeing over and over again tons of package authors just completely unaware that their packages are not compatible with Node. It'd be nice to let them know that earlier in the process before waiting for their package to become adopted enough that users start filing bugs against them. We could build some feature into SvelteKit which scans all of a user's dependencies and lets them know which ones are broken, but it doesn't feel like something that every web framework should build, but rather something that should happen much earlier in the process such as before a package is even published.

ljharb commented 1 year ago

What validation would you propose, precisely? It'd have to be things that are always wrong, in every use case.

For example, it could check that the "main" (explicit or implicit), or any file referenced in "exports", exists on the filesystem. npm would have to know how to do extension and index resolution for "main" but not for "exports".

However, it couldn't execute any files safely, so it'd have to bundle a parser in order to validate the syntax of anything, which is likely a pretty large dependency. Additionally, it'd have to understand exactly which set of syntax is valid for an arbitrary node version, and for a package who declares engines.node, it'd have to be able to test the entire range of valid node versions, not just the current node version.

Additionally, since Module and Script are ambiguous parse goals, it's not actually possible to know 100% of the time whether a file is intended to be CJS or ESM (which is why node itself doesn't use that approach).

benmccann commented 1 year ago

I would not suggest examining the contents of any files other than package.json. We do not have to catch every possible type of error to provide value, but just point out things that that clearly won't run on Node.js from the package.json alone. I also think that we can share information without it being a warning or error. E.g. we can say "This package won't run in Node.js. If that is unexpected go here to learn more". As an example validation, we can say that if you have "type": "module" then something appearing as a value for require under exports must have the extension .cjs if you want it to run on Node.js.

stalkerg commented 1 year ago

Yes, I suppose providing warning will be much better than just an error, also it will be possible to indicate about wrong ESM module here https://www.npmjs.com/ (will be helpful for the users)

bnb commented 1 year ago

The way I see it is this: it's either pain for (Vite / Node.js server) users, or npm validates ESM.

@brillout npm validating ESM will not solve the problem you want it to.

I don't get the sentiment that it's outside the scope of the Node Package Manger

npm is not an acronym :)

but the idea that we're okay with a majority of packages being broken and it's the consumers job' to fix those packages by running them through bundlers instead of running them on Node is a very anti-Node position for the Node Package Manager to be taking.

@benmccann The majority of packages are not broken. I deeply dislike other package managers for a variety of reasons, but their users still deserve the same foundational support that npm users would get under what you're asking for. Further, I shouldn't have to use a package manager for code that I'm not publishing to validate it - I should be able to do that internalyl without a package manager. This is a platform issue, not a package manager issue.

Put another way, solving this in npm does not actually solve it. Further, it creates a very wide fracture in the ecosystem. What happens when Yarn starts implementing a definition of "valid" that's slightly different than npm? And the same for pnpm? This has happened before (see: resolution algorithms) and it'll happen with this. Inserting this into Node.js - if it's even a good idea, which I'm not really convinced of - would be the way to actually solve your problem.

I strongly agree that the goal of keeping npm agnostic has given the ability for developers to ship invalid packages to the registry and poisoning the transition from CJS to ESM.

Iā€™m sad to see that an OSS maintainer had to open an issue to the most used js runtime, and later to npm (owned by github, owned by ms) to solve a validation problem caused by the initial goal of ā€œagnosticā€ path npm is taken. This is clearly a validation problem.

As a Node core collaborator, and performance enthusiast, from where Iā€™m sitting (after reading all of the responses), I strongly believe that we should do better. Since Node does not have the capacity to validate npm packages before publishing, I believe it should definitely be done in the package manager side.

@anonrig As a Node.js contributor, I know that Node.js has more resources to solve this than npm does. If it's a problem that you as a collaborator want to see solved, you should solve it in Node.js.

From the Node.js perspective, that is the option that makes sense - it does not tie the platform to a single runtime and helps all users, rather than the subset who publish with npm (since only modules published from npm would be validated - if ). I'd honestly argue it's actively harmful and preferential to users of npm for Node.js collaborators to suggest that this be solved in npm.

And it puts the npm registry into a bad light too. If a user encounters an issue with one package, they likely attribute it to that. Multiple apples that turn out to be spoilt, maybe the shop owner is to blame?

@dominikg This is an assumption that every user will make a fallacy of composition in the most distilled form. Outside of theoretical situations, can this be backed up?

Many other large software registries have validation processes to ensure a quality level for their content.

Got examples of this in package registries for dynamically typed languages? I briefly looked through both PyPi's resources and Ruby Gem's resources and couldn't find anything like this but perhaps I missed it.

We do not have to catch every possible type of error to provide value, but just point out things that that clearly won't run on Node.js from the package.json alone.

@benmccann this just sounds like you want linting? why do this in ESLint and/or Prettier rather than npm?

benmccann commented 1 year ago

What happens when Yarn starts implementing a definition of "valid" that's slightly different than npm? And the same for pnpm?

We file bugs against them that their validations are not following the Node.js spec, which is pretty well defined. Or, we avoid the problem entirely by building this into the npm registry instead of the npm CLI.

Inserting this into Node.js - if it's even a good idea, which I'm not really convinced of - would be the way to actually solve your problem

That'd help slightly, but not a ton. Right now if you try to run a package with a bad package.json then Node will crash. We could add code to Node that would maybe make the error messages slightly nicer, but it doesn't change a whole lot. The point is that we want package authors to find this out earlier in the process.

@benmccann this just sounds like you want linting? why do this in ESLint and/or Prettier rather than npm?

Users would have to somehow first learn this eslint package exists, setup eslint, and setup this new validation package. It would help some, but it's probably going to be a minority of packages that adopt such a configuration whereas npm has a far wider reach. Every package in the npm registry needs to have a valid package.json, so why shouldn't it live there?

GeoffreyBooth commented 1 year ago

@brillout Letā€™s take a concrete case. You cited https://github.com/aws-amplify/amplify-ui/issues/3155. On my machine I ran npm install @aws-amplify/ui-react and then node --input-type=module --eval 'import "@aws-amplify/ui-react"'. I got the following output:

$ node --input-type=module --eval 'import "@aws-amplify/ui-react"'
(node:81555) Warning: To load an ES module, set "type": "module" in the package.json or use the .mjs extension.
(Use `node --trace-warnings ...` to show where the warning was created)
/private/tmp/test/node_modules/@aws-amplify/ui-react/dist/esm/index.js:1
import*as e from"./components/index.js";export{e as components};export{useAmplify}from"./hooks/useAmplify.js";export{useTheme}from"./hooks/useTheme.js";export{useBreakpointValue}from"./hooks/useBreakpointValue.js";import*as i from"./primitives/index.js";export{i as primitives};export{createTheme,defaultDarkModeOverride,defaultTheme,translations}from"@aws-amplify/ui";export{default as AccountSettings}from"./components/AccountSettings/AccountSettings.js";export{Authenticator}from"./components/Authenticator/Authenticator.js";export{withAuthenticator}from"./components/Authenticator/withAuthenticator.js";export{InAppMessagingProvider,useAuthenticator,useInAppMessaging}from"@aws-amplify/ui-react-core";export{MapView}from"./components/Geo/MapView/index.js";export{Geocoder,LocationSearch}from"./components/Geo/LocationSearch/index.js";export{FileUploader}from"./components/Storage/FileUploader/FileUploader.js";export{default as InAppMessageDisplay}from"./components/InAppMessaging/InAppMessageDisplay/InAppMessageDisplay.js";export{default as withInAppMessaging}from"./components/InAppMessaging/withInAppMessaging/withInAppMessaging.js";export{AmplifyProvider,ThemeProvider}from"./components/ThemeProvider/index.js";export{Alert}from"./primitives/Alert/Alert.js";export{Autocomplete}from"./primitives/Autocomplete/Autocomplete.js";export{Badge}from"./primitives/Badge/Badge.js";export{Button}from"./primitives/Button/Button.js";export{ButtonGroup}from"./primitives/ButtonGroup/ButtonGroup.js";export{Card}from"./primitives/Card/Card.js";export{CheckboxField}from"./primitives/CheckboxField/CheckboxField.js";export{Collection}from"./primitives/Collection/Collection.js";export{Divider}from"./primitives/Divider/Divider.js";export{Expander}from"./primitives/Expander/Expander.js";export{ExpanderItem}from"./primitives/Expander/ExpanderItem.js";export{FieldGroupIcon}from"./primitives/FieldGroupIcon/FieldGroupIcon.js";export{FieldGroupIconButton}from"./primitives/FieldGroupIcon/FieldGroupIconButton.js";export{Flex}from"./primitives/Flex/Flex.js";export{Grid}from"./primitives/Grid/Grid.js";export{Heading}from"./primitives/Heading/Heading.js";export{HighlightMatch}from"./primitives/HighlightMatch/HighlightMatch.js";export{Icon}from"./primitives/Icon/Icon.js";export{Image}from"./primitives/Image/Image.js";export{Link}from"./primitives/Link/Link.js";export{Loader}from"./primitives/Loader/Loader.js";export{Menu}from"./primitives/Menu/Menu.js";export{MenuButton}from"./primitives/Menu/MenuButton.js";export{MenuItem}from"./primitives/Menu/MenuItem.js";export{Pagination}from"./primitives/Pagination/Pagination.js";export{PasswordField}from"./primitives/PasswordField/PasswordField.js";export{PhoneNumberField}from"./primitives/PhoneNumberField/PhoneNumberField.js";export{Placeholder}from"./primitives/Placeholder/Placeholder.js";export{Radio}from"./primitives/Radio/Radio.js";export{RadioGroupField}from"./primitives/RadioGroupField/RadioGroupField.js";export{Rating}from"./primitives/Rating/Rating.js";export{ScrollView}from"./primitives/ScrollView/ScrollView.js";export{SearchField}from"./primitives/SearchField/SearchField.js";export{SelectField}from"./primitives/SelectField/SelectField.js";export{SliderField}from"./primitives/SliderField/SliderField.js";export{StepperField}from"./primitives/StepperField/StepperField.js";export{SwitchField}from"./primitives/SwitchField/SwitchField.js";export{TabItem,Tabs}from"./primitives/Tabs/Tabs.js";export{Text}from"./primitives/Text/Text.js";export{TextAreaField}from"./primitives/TextAreaField/TextAreaField.js";export{TextField}from"./primitives/TextField/TextField.js";export{ToggleButton}from"./primitives/ToggleButton/ToggleButton.js";export{ToggleButtonGroup}from"./primitives/ToggleButtonGroup/ToggleButtonGroup.js";export{View}from"./primitives/View/View.js";export{VisuallyHidden}from"./primitives/VisuallyHidden/VisuallyHidden.js";export{Table}from"./primitives/Table/Table.js";export{TableBody}from"./primitives/Table/TableBody.js";export{TableCell}from"./primitives/Table/TableCell.js";export{TableFoot}from"./primitives/Table/TableFoot.js";export{TableHead}from"./primitives/Table/TableHead.js";export{TableRow}from"./primitives/Table/TableRow.js";export{usePagination}from"./primitives/Pagination/usePagination.js";export{ComponentClassNames,ComponentClassObject}from"./primitives/shared/constants.js";export{ComponentPropsToStylePropsMap,ComponentPropsToStylePropsMapKeys}from"./primitives/types/style.js";
^^^^^^

SyntaxError: Cannot use import statement outside a module
    at internalCompileFunction (node:internal/vm:74:18)
    at wrapSafe (node:internal/modules/cjs/loader:1141:20)
    at Module._compile (node:internal/modules/cjs/loader:1182:27)
    at Module._extensions..js (node:internal/modules/cjs/loader:1272:10)
    at Module.load (node:internal/modules/cjs/loader:1081:32)
    at Module._load (node:internal/modules/cjs/loader:922:12)
    at ModuleWrap.<anonymous> (node:internal/modules/esm/translators:167:29)
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)
    at async ESMLoader.eval (node:internal/modules/esm/loader:380:24)
    at async loadESM (node:internal/process/esm_loader:102:5)

Node.js v19.3.0

A simplistic validation could be that on npm publish, on the server side after npm has received the package but before the new package or version is published, the npm server could run node --input-type=module --eval 'import "<just-uploaded-package-here>"' in a sandbox and fail the publish if the command errored. Such a command should succeed for all ESM and CommonJS modules, since import can handle both (maybe with import * as test from or something to catch all variations of default/no default/named exports etc., but the details could be worked out). Since plenty of people upload packages that arenā€™t intended to be importable into Node, some option to disable this check would be provided like npm publish --skip-node-import-validation or whatever. And the placement of the validation on the server side means that it could work for all package managers. The npm team might want to limit it to certain package manager clients at first, or perhaps only package managers and versions that support the flag to override the check, but all these details could be worked out.

Aside from the specifics of the implementation, would this solve the issue? In general, if a package can be successfully referenced via an import statement, is it likely to be a package thatā€™s valid to run in Node and would therefore work well for vite-plugin-ssr?

benmccann commented 1 year ago

That sounds like it'd be quite helpful. An implementation note - you'd have to check that the package name can be imported. If there's an export map that doesn't export . you wouldn't be able to do that. I also wonder if you'd want to import every entry point within the export map to check if they all work.

There's potentially a simpler solution though. Check out https://publint.dev/@aws-amplify/ui-react@4.3.2

GeoffreyBooth commented 1 year ago

An implementation note - youā€™d have to check that ā€¦

Like I said, there would be lots of implementation details to work out; but I donā€™t doubt they could be resolved.

publint does something very similar to my suggestion:

How it works When linting an npm package, the site would download the package tarball locally from the npm registry, then a web worker will run the publint CLI against the files. For larger packages, it may take a while to download and lint.

Unlike npm install, the site would only download the package itself without itā€™s dependencies, as thatā€™s sufficient for the linting process.

So itā€™s not just doing some quick scan of package.json itself, and it could take awhile (among other potential issues with running this locally). Itā€™s also just not as thorough of a check as actually importing the package into a real Node process. This sounds like it wouldnā€™t have caught the import 'lodash/debounce' (when it should have been lodash/debounce.js) error cited earlier, whereas running it in Node would.

bnb commented 1 year ago

We file bugs against them that their validations are not following the Node.js spec, which is pretty well defined.

This would duplicate the work across the three existing package managers, and yet again increase the set of requirements to building a successful package manager in the JavaScript ecosystem which would reduce the likelihood of competition.

Or, we avoid the problem entirely by building this into the npm registry instead of the npm CLI.

This also forces internal registries to build these features. There are very good reasons to use internal registries, but I also frankly don't trust JFrog and Sonatype to correctly build this functionality in a way that effectively serves their users - and again, it would increase the barrier to entry for better options to exist.

Right now if you try to run a package with a bad package.json then Node will crash.

not sure there's a lot more npm can do if there's a bad package.json?

Users would have to somehow first learn this eslint package exists, setup eslint, and setup this new validation package.

If this were to ship - which IMO it very much should not in npm - it should be optional and opt-in, not required.

ljharb commented 1 year ago

(while i agree - i think that if it's not required and default-on, then the argument for shipping it in npm dissolves)

benmccann commented 1 year ago

This would duplicate the work across the three existing package managers, and yet again increase the set of requirements to building a successful package manager in the JavaScript ecosystem which would reduce the likelihood of competition.

It would be sensible to create a module that can be shared across package managers. However, missing validation doesn't make a package manager unusable - just less helpful. I don't understand the point about competition because at the same time you're saying that npm should not compete and cannot add anything that might make it better than another package manager

This also forces internal registries to build these features. There are very good reasons to use internal registries, but I also frankly don't trust JFrog and Sonatype to correctly build this functionality in a way that effectively serves their users - and again, it would increase the barrier to entry for better options to exist.

I don't think it would. There's lots of ways to implement this without it being required. E.g. if there's a UI icon saying a package will run on Node.js then there's nothing forcing other registries to adopt the same UI. The registry API docs don't appear to include the publishing APIs, so I can't suggest an exact solution, but it certainly seems possible to me to implement this in a manner that would be backwards compatible with existing clients / package managers.

nlf commented 1 year ago

i agree with all of the points that this validation does not belong in the package manager. this should be an external tool that can be run as part of your npm scripts for the folks who have this need. as was mentioned, the most validation we could reasonably do is ensuring that files referenced by the main and exports fields exist on disk.

validating the content of the actual code is very far beyond the scope of what a package manager should be doing. we don't forcibly run your tests and require they pass to allow you to publish, and i see this request being quite similar. the needs of a package are the responsibility of the maintainer(s) of that package to declare and enforce. today you can publish an invalid commonjs package as well. it's up to you to actually consume your own package and make sure it works for your use case. if it doesn't, then it's also up to you to fix it.

fwiw this would be fixed most appropriately by maintainers who add ESM exports to also add tests that use these ESM exports. something that is, once again, outside of the responsibilities of npm/yarn/pnpm.

benmccann commented 1 year ago

Sadly it is almost impossible to write tests for package.json, which is why this is such a problem. Your project will be interpreted as either CJS or ESM depending on whether you have "type": "module". Folks don't realize the need to use .cjs extensions for CJS code when "type": "module is present and .mjs extensions for ESM code when it's not and there's nothing in the standard development path to inform them of it. You could create an exotic test setup where you have both CJS and ESM test modules pulling in the distribution folder of you main module to discover these errors, but in practice almost no one sets up their projects that way and typically just have a single test folder.

Promoting publint may help with this. I'm very pessimistic about testing helping though

ljharb commented 1 year ago

@benmccann you can do it relatively easily if you use exports and a package self-reference in your tests, as long as you support only node versions that support exports. I certainly agree that "type": "module" causes tons of confusion, but I remain convinced that the solution is to teach people not to use it.

stalkerg commented 1 year ago

@ljharb

I certainly agree that "type": "module" causes tons of confusion, but I remain convinced that the solution is to teach people not to use it.

This is right, BUT it should be an effective way to do it, just tell not enough. Basically, each tool for publishing the package should make a warning for education propose. Also, we have tons of abandoned packages, and users can't figure out if it is possible to use nodejs or not except to install it and try to use it.

brillout commented 1 year ago

While I share the sentiment of keeping the npm registry as simple as possible, I don't see an alternative to adding validation to the npm registry. Let me elaborate.

There are four approaches:

  1. We make Node.js permissive (i.e. making Node.js be able to execute npm packages that have invalid ESM).

    This is being categorically rejected, see https://github.com/nodejs/node/issues/46074.

  2. We encourage library authors to publish valid ESM (e.g. with scripts.prepublishOnly or by encouraging them to write ESM tests).

    As much as I'd love this work, I don't think it's feasible to educate enough users to make a real difference here. To quote benmccann (emphasis mine):

    I work on SvelteKit and we get tons of bugs saying a given package is broken with SvelteKit when it's really not related to SvelteKit at all, but just Node.js. I've spent an inordinate amount of time educating users that their packages are not following the Node.js spec, written an FAQ on our site, fixed tons of widely-used packages in the ecosystem, but I'll just never reach the long tail of users.

    With the massive amount of invliad npm packages being published every day, we just don't have the manpower to fix the situation with education alone.

  3. We add validation to the npm CLI.

    I agree this is problematic as it would create fracture between npm/yarn/pnpm.

  4. We add validation to the npm registry (e.g. the registry rejects with a 400, while having an opt-out option).

    • I believe this should be opt-out: e.g. npm publish --skip-validation / npm publish --force would make the npm registry accept everything like it does today.
    • While adding complete validation would be nice (e.g. by testing packages against Node.js), I think we can already get very far with fairly simple checks. For example, detecting import { someImport } from './some-file' (it should be ./some-file.js) only for the files referenced by package.json#exports would already be an enormous added value. We can start simple and see how far we can go.

That's why I've come to the conclusion that the only viable option here is 4: we add validation to the npm registry.

And, actually, the more I think about it, the more I like the idea. I find it very compelling to have a registry that is clean. That'd be a much welcome and tremendous added value of the npm registry.

ljharb commented 1 year ago

npm currently gets sent a tarball and some JSON data. I'm not sure if they even unpack it at all, except perhaps for the relatively new "explore"/"core" feature on package webpages. I suspect it's not feasible, in a practically performant/efficient way, to implement this on the registry side.

benmccann commented 1 year ago

I don't think option 3 has to cause any fracturing. It could be implemented as a shared module

stalkerg commented 1 year ago

@ljharb it's an excellent time to start; validating income tarballs is very important and not so heavy. It will be possible to reject viruses and maybe abuse material. The npm registry can't avoid DCMA and GDPR, so not to get serious troubles should be a way to validate received packages.

ljharb commented 1 year ago

Of course it can, itā€™s a DMCA safe harbor only if it doesn't actively scan the contents of what users upload - otherwise it would have potential red-flag knowledge of violations. Iā€™m not sure how GDPR plays in. Either way, a lawyer would have to weigh in there.

dominikg commented 1 year ago

the last days left me dumbfounded by the amount of pushback, derailing and hardly even acknowledging the issue presented by members of tc39, npm inc and github.

@bnb out of my post you chose to quote 2 statements and apply twists, conveniently not getting to my question regarding alternative options.

1) I disagree with you that what i said is based on a fallacy of composition but arguing about that isn't going to add any value here so please let's agree to disagree here. 2) This has nothing to do with the dynamic typed nature of js, so i don't understand why that is being brought into the discussion.

Sharing the validation logic can be done, which solves the concerns about being agnostic. I am thankful you care about that, but at the same time you should acknowledge that node.js bundling npm is indeed preferential treatment and has lead to your position that the npm registry and client are by far the most used in the js ecosystem and as a result applying validation at that level is the most effective.

Completely agree that it should be the registry itself performing the ultimate steps, but clientside validation isn't a new concept. Look at html form validation. Which coincidentally can be shared between server and client more easily than ever before thanks to properly authored esm packages that can be executed vanilla by browsers, node.js and various js based edge runtimes.

You should be head over heels giddy about esm turning into a lingua franca for a new wave of full-stack frameworks and libraries. Spreading the use of esm far wider than before. And as a central pillar and vital part of the ecosystem collaborate to improve the quality of packages on your registry.

I really hope the current feedback presented here isn't representative of npm inc/github/microsoft as a whole and that something productive can come from this.

ljharb commented 1 year ago

to be clear, "can be executed vanilla" is not an objective, or at all a universally desirable, value.

bnb commented 1 year ago

@brillout FWIW this repository is for the npm CLI, specifically - not the npm registry (the repo being for the CLI is noted in the README). I believe registry feedback is supposed to go in npm/feedback.

Wanting to note this since you specifically stated that your goal is registry changes, and regardless of whether or not I agree with them I want to make sure you're able to get your feedback into the right venue to be heard by the correct people.

bnb commented 1 year ago

I don't think option 3 has to cause any fracturing. It could be implemented as a shared module

@benmccann I'm pretty sure that if you tried to get all three package managers to agree on a single implementation of things, you'll never see this feature ship. Yes, this is technically possible. However, practically/socially... much less so.

bnb commented 1 year ago

I disagree with you that what i said is based on a fallacy of composition but arguing about that isn't going to add any value here so please let's agree to disagree here.

@dominikg If you're not willing to discuss or back up a point you made, it probably shouldn't be made. However, I'll not engage on the topic further as long as it's not used any further as a justification for this proposal, per your request šŸ‘šŸ»

This has nothing to do with the dynamic typed nature of js, so i don't understand why that is being brought into the discussion.

Because I've spent a decent amount of my career looking at / working with package registry products and very often "easy" solutions are not actually easy for registries of dynamically typed languages. I've seen what you're talking about for static language registries multiple times but have yet to see it in a production dynamic language registry. If I'm missing examples - you stated there are many - in dynamic language registries I'd love to learn about them.

Sharing the validation logic can be done, which solves the concerns about being agnostic. I am thankful you care about that, but at the same time you should acknowledge that node.js bundling npm is indeed preferential treatment and has lead to your position that the npm registry and client are by far the most used in the js ecosystem and as a result applying validation at that level is the most effective.

Yarn and pnpm are currently available via corepack in Node.js.

Last I saw, Yarn had non-trivial double digit representation (IIRC it was around 40%) in package manager usage according to npm registry statistics that were made public (this was some time ago, but I don't think it'd be under double digits at this point).

I agree with registry (IIRC it's 99%+), but disagree that moving logic like this into the registry is a good idea. Specifically, as was pointed out earlier, this is something npm explicitly doesn't do. Changing that fundamentally changes the scope of what npm is/does and that has a lot of ecosystem-wide repercussions that IMO are not positive.

haoqunjiang commented 1 year ago

Considering npm has added first-class support for types field in the past, both for the registry and the CLI:

I believe it's reasonable for us to make a similar argument that npm could have better support for Node.js ESM. This could be within the scope of npm.

redbar0n commented 1 year ago

@brillout FWIW this repository is for the npm CLI, specifically - not the npm registry (the repo being for the CLI is noted in the README). I believe registry feedback is supposed to go in npm/feedback.

@bnb Done, I reposted it here: https://github.com/npm/feedback/discussions/862 Validate packages before publishing them.

RobIsHere commented 1 year ago

In the near future someone will develop another registry, an alternative to npm like yarn or pnpm is for npm cli. At npm cli they didn't get going until someone made up an alternative product, There the damage is already done.

All users will switch to this new registry because they don't bother if it's server or cli and the packages from new-validating-npm.com just work for them as easy as 1-2-3. The packages there will be properly validated and tagged "Works in ..." and "Exports provided ESM/....".

The man you work for at npm, Bill Gates, doesn't like one thing at all: alternative products and competitors. His Interest from day 1 is "developers, developers, developers"! Because developers create the applications that make his Systems valuable.

When Bill Gates will ask the quesition why he has to spend another billion to also buy the competitor new-validating-npm.com someone will point him to this thread. And he will see the stance: write something into some feedback channel and don't bother me further because we are cli and server is someone else in our company. What will he think?

ghost commented 1 year ago

Maybe this needs an external scanner, like OSSFuzz for fuzzing. If packages can be validated easily, every new publish could be scanned and an issue automatically opened in the repo.

ghost commented 1 year ago

There's a scanner. https://github.com/bojavou/esm-inspector

You can run with node bin/inspect.mjs. Be aware it's not sandboxed yet.

It found usepanel which uses import without signaling ESM.

$ node bin/inspect.mjs --token=18100000
test-mlw1-diary-scary-tauts-bless: No available package versions
dede-ui: Package threw error
test-mlw1-maims-inurn-khuds-saiga: No available package versions
test-mlw2-maims-inurn-khuds-saiga: deleted
test-mlw3-atopy-hamza-clunk-bagel: No available package versions
test-mlw4-maims-inurn-khuds-saiga: deleted
usepanel: INVALID: Uses ESM syntax in a CommonJS module
awesome-toast-component: Importable from an ES module
material-theme-lib: deleted
chat-vue-app: Importable from an ES module

ng-magnizoom goes the other way and imports an ESM dep with require().

$ node bin/inspect.mjs --token=17000000
meteor-sdk: Package entry not JavaScript
emoji-provider: Importable from an ES module
@galatajs/events: Importable from an ES module
conseiljs-ledgersigner: Importable from an ES module
@galatajs/inject: Importable from an ES module
ng-magnizoom: INVALID: Imports ESM with require()
css.macro: Package unusable after install
@aveonline/ui-vue: Importable from an ES module
@raidguild/use-chiev: Package unusable after install
as-my-story-book-components: Package unusable after install
@galatajs/app: Importable from an ES module
@getselfstudy/react-ui: Package uninstallable
papa-yasi-erlingyiqilingwu-erlingyiqilingliu: Importable from an ES module
papa-yasi-erlingyiqilingqi-erlingyiqilingba: Importable from an ES module
@raidguild/quiver: Package unusable after install
papa-yasi-erlingyiqilingjiu-erlingyiqiyiling: Importable from an ES module

Anyone have a spare server?