nodejs / node

Special treatment for package.json resolution and exports? #33460

Closed ctavan closed 2 years ago

ctavan commented 4 years ago

📗 API Reference Docs Problem

Problem description

With the introduction of pkg.exports, a module only exports the paths explicitly listed in pkg.exports; any other path can no longer be required. Let's have a look at an example:

Node 12.16.3:

> require.resolve('uuid/dist/v1.js');
'/example-project/node_modules/uuid/dist/v1.js'

Node 14.2.0:

> require.resolve('uuid/dist/v1.js');
Uncaught:
Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: Package subpath './dist/v1.js' is not defined by "exports" in /example-project/node_modules/uuid/package.json

So far, so good. The docs describe this behavior (although not super prominently):

Now only the defined subpath in "exports" can be imported by a consumer: … While other subpaths will error: …

While this meets the expectations set out by the docs I stumbled upon package.json no longer being exported:

> require.resolve('uuid/package.json');
Uncaught:
Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: Package subpath './package.json' is not defined by "exports" in /example-project/node_modules/uuid/package.json

For whatever reason I wasn't assuming the documented rules to apply to package.json itself since I considered it package metadata, not package entrypoints whose visibility a package author would be able to control.
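The behavior described above can be reproduced without installing anything. The following is a minimal sketch, assuming a throwaway project in the OS temp directory and a hypothetical dependency named demo-pkg (neither appears in the original report); it only uses `module.createRequire` and `require.resolve`.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Build a throwaway project with one dependency that declares "exports".
// "demo-pkg" is a hypothetical package name used only for this sketch.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));
const pkgDir = path.join(projectDir, 'node_modules', 'demo-pkg');
fs.mkdirSync(pkgDir, { recursive: true });
fs.writeFileSync(path.join(pkgDir, 'index.js'), 'module.exports = 42;\n');
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({ name: 'demo-pkg', version: '1.0.0', exports: { '.': './index.js' } })
);

// A require function that resolves as if called from a file inside the project.
const projectRequire = createRequire(path.join(projectDir, 'index.js'));

// Returns the error code thrown by resolution, or null when resolution succeeds.
function resolveErrorCode(specifier) {
  try {
    projectRequire.resolve(specifier);
    return null;
  } catch (err) {
    return err.code;
  }
}

const mainCode = resolveErrorCode('demo-pkg');
const manifestCode = resolveErrorCode('demo-pkg/package.json');
console.log({ mainCode, manifestCode });
```

On a Node.js version that enforces "exports", the main entry point should resolve while the package.json subpath is rejected, because it is not listed in the exports map.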

This new behavior creates a couple of issues with tools/bundlers that rely on meta information from package.json.

Examples where this issue already surfaced:

Now the question is how to move forward with this?

  1. One option would be to keep the current behavior and improve the documentation to explicitly warn that package.json can no longer be resolved unless added to exports. EDIT: Already done in https://github.com/nodejs/node/commit/1ffd182264dcf02e010aae3dc88406c2db9efcfb / Node.js v14.3.0
  2. Another option would be to consider adding an exception for package.json and always export it.

I had some discussion on slack with @ljharb and @wesleytodd but we didn't come to an ultimate conclusion yet 🤷‍♂️ .


ljharb commented 4 years ago

cc @nodejs/modules-active-members

wesleytodd commented 4 years ago

Another option would be to consider adding an exception for package.json and always export it.

To me this seems like a great solution. Having the package metadata is an awesome ergonomic feature of the current setup, and having module authors explicitly have to opt in would be a huge burden across the community. To me it seems like we would need a concrete reason not to have this exception. Can anyone think of a reason to make this case?

bmeck commented 4 years ago

@wesleytodd I think it just comes down to what is public/private still. People putting public data into a package.json for things like tools to consume isn't really an issue. People putting configuration data like secrets is more the concern. I imagine it would still be able to be censored if people re-wrote the import still.

However, I'm unclear on the privacy model here since the benefit seems largely to be around bundlers which wouldn't run with the same constraints since they are ahead of time tools and generally thought to access things in a more permissive manner than 2 independent modules with mutual distrust. It seems the problem is in part that these tools are using APIs that make them act at the same level of trust as any other module, and other packages, when they upgrade, are removing permissions to view the package.json data (even if by accident). I think the concrete discussion here is if people should have to opt out of package.json to avoid an accident-prone workflow, which is the inverse of all other resources in the package.

A different option since there is a specific use case that seems to need this is to have a flag of some kind for these ahead of time tools. Either to require.resolve which looks to be the cause of issues above, to node via CLI/ENV, or something else. I do think providing an exception would make things just work, but somewhat go against the privacy intent of "exports".

jkrems commented 4 years ago

My main concern with exposing it via exports is that there’s two options:

  1. Make it non-configurable. ./package.json always has to map to ./package.json.
  2. Tools that use require or import to load metadata may get custom files because a package decided to remap ./package.json.

The first option would force every package to treat its metadata file as a public API. Some user's eslint config reuses what I put into my package.json#eslint section? Well, it’s exported so how can I blame them.

The second option means that tools may not actually get the metadata when they think they’re loading the metadata. It could be argued that it only affects “weird” packages but given a sufficiently large number of users that know about this “trick”, I can totally see people use it.

I think bundlers shouldn’t use require (“load code for this environment”) to load metadata. So I’d rather have a new API to load the package.json that belongs to a specifier/referrer combination, exposing logic we already have in the loader. That API could be used by bundlers etc to cleanly get the metadata without having to hijack require.

ljharb commented 4 years ago

I think the primary issue is that there's no way besides require.resolve to resolve the package.json for a package. Even path.join(require.resolve(pkg), 'package.json') won't work because some packages might have their "main" resolve to a subdirectory.

guybedford commented 4 years ago

Here's the recommended way to do this with ES modules:

import { readFileSync } from 'fs';
(async () => {
  const pkgPath = await import.meta.resolve('pkg/')
  console.log(pkgPath);
  console.log(readFileSync(new URL('package.json', pkgPath)).toString());
})();

The above also simplifies with TLA of course.

Currently the above only executes with --experimental-import-meta-resolve.

@nodejs/modules-active-members I think we should discuss unflagging this feature.

ctavan commented 4 years ago

I also want to clarify that the problems I have seen in the wild were always just about require.resolve('pkg/package.json') in order to then load that file from the filesystem. I didn't see anybody trying to directly require('pkg/package.json') to really load the json data as a module.

ljharb commented 4 years ago

@guybedford import.meta.resolve('pkg/') would fail if the package didn't have a main/dot, wouldn't it? or, if the ./ was mapped to a different directory, like ./src?

guybedford commented 4 years ago

@ljharb no, the trailing / is specially specified to allow resolving package boundaries.

ljharb commented 4 years ago

@guybedford require.resolve('es-get-iterator/') in latest node throws Package exports for '$PWD/node_modules/es-get-iterator' do not define a './' subpath. I would expect import.meta.resolve to behave identically, so the same capabilities exist in both CJS and ESM.

iow, import.meta.resolve only solves for ESM, not CJS, so it's not a solution to this problem.

guybedford commented 4 years ago

@ljharb yes, because trailing slashes in CommonJS still apply extension searching, which the ESM resolver does not, which is a fundamental difference between the module systems.

wesleytodd commented 4 years ago

People putting configuration data like secrets is more the concern (@bmeck)

This is not a concern. Using exports to hide secrets is not ever a reasonable solution anyway.

Make it non-configurable. ./package.json always has to map to ./package.json. (@jkrems)

This is what I was thinking as well. Seems perfectly reasonable to enforce this constraint.

The first option would force every package to treat its metadata file as a public API. (@jkrems)

It already is. This is not a change in the ecosystem as it is today. Every file is part of the public api, and needs to be treated as such. If projects choose not to strictly follow semver, that is a different issue.

Here's the recommended way to do this with ES modules: (@guybedford)

We can also use hacks around require for this. The point is that the most ergonomic way is also popularly in use, so should be added as an exception to the exports spec, even if there are other ways around it.

I didn't see anybody trying to directly require('pkg/package.json') to really load the json data as a module. (@ctavan)

I have seen this in many places. Although I am not going to spend the time now collecting references since I don't think this should be the primary focus of the discussion, if it becomes a key point I am happy to dig for them.

bmeck commented 4 years ago

I think allowing censorship is necessarily good, and I wouldn't feel comfortable with ./package.json always mapping to ./package.json, since that would not seem to allow it. In particular, people do set environment variables in their ./package.json when deploying things in various environments that support it. Environment variables might be able to be removed at runtime via process.env but if the deployment does not have a writable filesystem they could not censor their package.json. I am not really here to judge if this workflow is a good idea, just to note that it does present a concern for myself.

wesleytodd commented 4 years ago

I think allowing censorship is necessarily good, and I wouldn't feel comfortable with ./package.json always mapping to ./package.json, since that would not seem to allow it.

Making it more complicated for everyone seems to strongly outweigh this. I know the point of exports is to help package authors have more explicit contracts with their users, but if this just breaks everyone's tooling it is a net negative to the community, especially for the package authors who now have to deal with this added complexity.

I am not really here to judge if this workflow is a good idea, just to note that it does present a concern for myself.

Do you have examples of this type of workflow? The app developer use case is not what I was considering at first, so maybe there is some common practice I have not seen like this. If so we would not want to break it. That said, I feel like the current state before exports had no restrictions on this, so I am not sure how we would be making anything worse.

ljharb commented 4 years ago

I'm confused about why this is a concern; if you put secrets in a place on the filesystem that the node user can access, your secrets are already exposed. exports is not a security feature, as we discussed many times during its development.

guybedford commented 4 years ago

I think the problem is more about making the package.json file part of the public API of a package.

The goal of exports is to fully encapsulate the public API of a package in a way that allows sound analysis of execution, optimization, breaks etc etc.

Exposing the package.json goes against this by making the properties of the package.json part of the public API.

There are many ways to access the package.json otherwise - you are not stopped from doing it, it just takes a little more code. Updating require.resolve patterns to a fs.readFile pattern is all it is.

Also note that this mostly applies far more to frameworks than libraries. Frameworks can at least take the effort to understand the problem here and fix the root cause I'd hope.

wesleytodd commented 4 years ago

Exposing the package.json goes against this by making the properties of the package.json part of the public API.

I think the goal would be to explicitly document this fact (and codify it as part of the implementation). Just call it part of the public api, always and forever, and be done with it. And to be clear, adding exports broke the existing behavior which was that all files in a package are part of their public api. So going back on one clearly good exception seems to be a more reasonable middle ground than breaking every tool which relies on this today.

Also note that this mostly applies far more to frameworks than libraries.

Not sure I understand the distinction here. I have libraries which load package.jsons to inspect them via require.

ljharb commented 4 years ago

That’s the problem - you can’t update to a readFile pattern if you can’t get the path to the file robustly, in CJS. That’s not possible right now for a package with exports, that doesn’t include package.json, and whose dot/main either is set to false or points to a subdir.

rektide commented 4 years ago

This focus around package.json seems incidental to me. As a user, I would very much like to be able to require()/import items from the file system, which is the most apparent & comprehensive truth to me.

That this is no longer possible if there is a pkg.exports seems like a very critical degradation of what I as a consumer of modules would hope & desire. If package.json exports do export something, fine, I'll take that, but I should continue to be able to require/import files that a package distributes. Including package.json.

I beg node to please adjust course & not hide the file system the moment an author declares a package.json exports.

ljharb commented 4 years ago

That’s the entire purpose of “exports”, and it’s a highly desired one - that’s not something that we’re discussing here.

rektide commented 4 years ago

That’s the entire purpose of “exports”, and it’s a highly desired one - that’s not something that we’re discussing here.

well where do we discuss it jordan, because it's a bad choice & confusing for everyone? there should be room to fallback into actual real resources if not defined in this new abstract package.json system node invented for itself.

i don't see why we shouldn't have both. it would solve this issue. it would allow people who have for a decade now require()'d resources to continue to do so when their package authors miss this or that resource. i think the package consumers deserve more than they are getting with this "highly desired" system.

guybedford commented 4 years ago

@rektide the full resources are still available at require('/absolute/path/to/package.json'). exports only provides filtering when entering the package through the public interface, via require('pkg/subpath').

If the problem is how to resolve the package path without having a suitable subpath, this is what the trailing slash was designed to allow in the example provided at https://github.com/nodejs/node/issues/33460#issuecomment-630452758.

If you don't like change, don't adopt exports.

guybedford commented 4 years ago

@ctavan you make a good point in https://github.com/nodejs/node/issues/33460#issuecomment-630454987. Perhaps one option could be to treat package.json as an exception in require.resolve ONLY (and not for require), where on a PACKAGE_PATH_NOT_EXPORTED error an internal fallback resolution approach applies.

I would not want such a path implemented for import.meta.resolve though.

ljharb commented 4 years ago

Anything require.resolve resolves must also be obtainable via require, otherwise the entire thing doesn't make sense.

guybedford commented 4 years ago

Having it just for require.resolve would definitely be an inconsistency in the name of backwards compat practicality, yes.

ljharb commented 4 years ago

To me the viable options are:

  1. do nothing, eventually packages all have to add ./package.json to "exports" in order for tools to work
  2. ¯\_(ツ)_/¯ import.meta.resolve with a trailing slash, only works in ESM, which forces option 1 anyways
  3. make "./package.json": "./package.json" an implicit part of exports, forcing you to do something like "./package.json": false to opt out
  4. provide a new API, that works in both CJS and ESM, that gives you the path to a directory without respecting a package.json (so that you can do something like readFile(path.join(packageDir('package'), 'package.json')) (note: this would not have any semver/API implications on a package, and would mirror import.meta.resolve('package/')'s behavior)

I understand why the third option (implicit) is not desirable. I do not understand, however, why we find either options 1 or 2 acceptable, and I think it's worth exploring option 4.

In the meantime, it would be ideal to update the documentation for "exports" to address this likely-common hazard.
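To make option 4 concrete, here is a minimal userland sketch of what such a helper could look like. The name `packageDir` comes from the comment above, but this implementation is an assumption and not a Node.js API: it probes the lookup directories returned by `require.resolve.paths()` and never consults "exports". The temp project and the package name demo-pkg are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Hypothetical helper (not a Node.js API): locate the directory of an installed
// package by probing the node_modules lookup paths, ignoring "exports" entirely.
function packageDir(name, fromFile) {
  // resolve.paths() returns null for core modules, hence the fallback to [].
  const lookupPaths = createRequire(fromFile).resolve.paths(name) || [];
  for (const base of lookupPaths) {
    const candidate = path.join(base, name);
    if (fs.existsSync(path.join(candidate, 'package.json'))) {
      return candidate;
    }
  }
  return null;
}

// Demo: a package whose "exports" neither lists package.json nor points at the root.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pkgdir-demo-'));
const pkgDir = path.join(projectDir, 'node_modules', 'demo-pkg');
fs.mkdirSync(path.join(pkgDir, 'src'), { recursive: true });
fs.writeFileSync(path.join(pkgDir, 'src', 'index.js'), 'module.exports = {};\n');
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({ name: 'demo-pkg', version: '1.2.3', exports: { '.': './src/index.js' } })
);

const found = packageDir('demo-pkg', path.join(projectDir, 'index.js'));
const manifest = JSON.parse(fs.readFileSync(path.join(found, 'package.json'), 'utf8'));
console.log(found, manifest.version);
```

This sketch mirrors only the classic node_modules lookup; it does not follow symlinks to their real paths and would not work with installers that avoid node_modules, which is part of why a built-in API was being asked for.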

ctavan commented 4 years ago

Let me add some thoughts from the point of view of a heavily depended-upon npm package like uuid.

That package provides 4 named exports (none of them qualifying as a default export) and a typical user will only need one of them in their project. Until ESM became a thing, and with it a robust way of performing tree shaking for browser bundles, we recommended to our users to deep require the respective files like const uuidv4 = require('uuid/v4'); which effectively required the v4.js file from the package root.

With widespread adoption of ESM in browser bundlers (which is where treeshaking matters), we decided to move away from this deep-require-API and instead started encouraging users to import { v4 as uuidv4 } from 'uuid';.

Getting rid of the deep requires forced our users to adjust their code and resulted in lots of bogus "bug" reports from users who apparently ignored our deprecation warnings (which even contained a link with exact instructions on how to upgrade).

So from the perspective of an author of a popular npm package it is indeed a very desirable feature to be able to restrict the public API to a minimum: Simply for the fact that you may get rid of bogus bug reports that exclusively result from unintended require-usage of some users.

Now regarding tooling-config in package.json: I think it was probably a bad idea to put framework/tooling-specific configuration into package.json in the first place. It would probably be much better if packages which need to expose framework-specific configuration would do so in an explicit manner, e.g. like explicitly adding ./reactnative.conf.json, ./svelte.conf.json, ./rollup.conf.json to their exports.

Unfortunately that's not how it went, package.json has always been a "reliable" source in the sense that any tool could rely on its mere existence.

So I'm of two minds here: I think in principle this is a nice opportunity for cleanup of package's public APIs. On the other hand this is likely to cause significant cost across the community.

I agree however that as a compromise @ljharb's option 4 is worth exploring. And in any case I totally think that the documentation should be more explicit (which is why I raised this issue in the first place).

ctavan commented 4 years ago

FWIW copying an interesting suggestion by @SimenB from https://github.com/browserify/resolve/issues/222#issuecomment-630639258:

For the "I need to load package.json" use case, why not do something like this:

const {sync: pkgUp} = require('pkg-up');

const packagePath = pkgUp({ cwd: require.resolve('uuid') });

https://github.com/sindresorhus/pkg-up

Could trivially be packaged up into a pkg-of-module and published to npm if people want.

I guess it could be wrong if people have nested package.jsons, but that seems like an edge case we shouldn't care about 😀

Originally posted by @SimenB in https://github.com/browserify/resolve/issues/222#issuecomment-630639258

eemeli commented 4 years ago

Nested package.json files should not be discarded, as they may become more prevalent once people realise that they can include a { "type": "module" } package.json within a subdirectory to allow using .js extensions for ES modules within an otherwise CommonJS package.

ljharb commented 4 years ago

When trying to get the package’s root package.json, nested ones are irrelevant; but that they may exist highly complicates any workaround to discover the package dir reliably.
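One way to cope with nested package.json files is to walk upward from a resolved entry point and accept only a manifest whose "name" matches the requested package. The following is a minimal sketch under that assumption; `findPackageRoot` and the package name demo-pkg are hypothetical, and the approach still depends on the package having a resolvable main entry, which is exactly the limitation raised earlier in the thread.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Hypothetical helper: walk up from a resolved file until a package.json with
// the expected "name" is found, skipping nested manifests such as {"type": "module"}.
function findPackageRoot(entryFile, expectedName) {
  let dir = path.dirname(entryFile);
  while (true) {
    const manifestPath = path.join(dir, 'package.json');
    if (fs.existsSync(manifestPath)) {
      const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
      if (manifest.name === expectedName) return dir;
    }
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Demo project; realpathSync keeps paths comparable with require.resolve output.
const tmpBase = fs.mkdtempSync(path.join(os.tmpdir(), 'nested-demo-'));
const projectDir = fs.realpathSync(tmpBase);
const pkgDir = path.join(projectDir, 'node_modules', 'demo-pkg');
fs.mkdirSync(path.join(pkgDir, 'dist'), { recursive: true });
fs.writeFileSync(path.join(pkgDir, 'dist', 'index.js'), 'export default 1;\n');
// Nested manifest that only sets the module type for the dist/ directory.
fs.writeFileSync(path.join(pkgDir, 'dist', 'package.json'), JSON.stringify({ type: 'module' }));
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({ name: 'demo-pkg', version: '1.0.0', exports: { '.': './dist/index.js' } })
);

const projectRequire = createRequire(path.join(projectDir, 'index.js'));
const entry = projectRequire.resolve('demo-pkg'); // resolves only, does not load the file
const root = findPackageRoot(entry, 'demo-pkg');
console.log(root);
```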

wesleytodd commented 4 years ago

  3. make "./package.json": "./package.json" an implicit part of exports, forcing you to do something like "./package.json": false to opt out

But no opt out is my preference. If a package author doesn’t want some part of their metadata public, move it to another file. I don’t see why this is so contentious, and I have not seen a single real world case where hiding the package.json is reasonable.

GeoffreyBooth commented 4 years ago

I sympathize with the use case, but I also don't want to take away package authors' ability to define their public API as not including package.json if they wish; and the mental overhead of anyone using "exports" to have to read about/know about this exception is bad UX. And since JSON files aren't import-able, the current pattern doesn't translate well into ESM.

That said, I think there should be a way to get this information easily enough, so if that means a new API, then so be it. import { getPackageMetadata } from 'module' where getPackageMetadata('pkg') returns the root package.json for pkg? Sure, and it could work identically in CommonJS as in ESM. How is this any different or better than just making an exception to "exports"? In my mind, this is equivalent to fs.readFile, in that "exports" already explicitly allows any-file access when people aren't using require or import, so this is the same thing. Or if we can use resolve somehow to achieve the same goal, that would be fine too. We could even create a new API, like import { resolvePackageRoot } from 'module', that would only look at the bare specifier (resolvePackageRoot('pkg')) and throw on any strings that contained slashes; therefore it could also function identically between CommonJS and ESM.

ljharb commented 4 years ago

@GeoffreyBooth either of those two APIs would indeed solve the issue without risking implicitly expanding a package's API.

wesleytodd commented 4 years ago

I would not oppose a new api in the long run. That said, there is a clear downside to a new api which is not present in the proposal to forcibly include the package.json in the package api:

It requires maintainers to migrate their tools.

This is a HUGE burden on the entire community. Things will be missed, issues will be filed, people will spend time migrating. All of this can be avoided by just keeping the existing behavior for package.json files and let them always be required.

Until there is a clear real world example of why a package author would ever have a reason to restrict access to the required package metadata, I am not sure why we would consider adding an api and all the associated work.

jkrems commented 4 years ago

Maybe two functions:

// Get the relevant metadata for the given resolution.
// Could either reject or resolve to `null` for relative specifiers.
getPackageMetadata(specifier, referrer) -> Promise<PkgJSON>

// Look up package metadata that would be used to interpret the given URL,
// e.g. to look up how to interpret yoloscript files (.js).
getPackageScopeMetadata(resolved) -> Promise<PkgJSON>

These could be provided in userland but I think it would be worthwhile to expose in node itself. If it's in node, we'd likely return a copy of the metadata.

jkrems commented 4 years ago

It requires maintainers to migrate their tools.

Tools that check package.json for packages that use exports are exceedingly likely to need to migrate one way or another. Node added a feature to the resolution, tools that don't change will diverge from what package authors may expect now. So... it almost feels like an upside when there's a clear error that tells users that the tool hasn't been updated to properly support one of their dependencies.

wesleytodd commented 4 years ago

Tools that check package.json for packages that use exports are exceedingly likely to need to migrate one way or another.

There are many reasons to load a package.json, many of which should not have to change. I agree this one case, if you are loading the package.json to decide how to resolve source files, is guaranteed to require tool changes, but no need to break all tools just for the one use case, right?

GeoffreyBooth commented 4 years ago

Re the tools needing to update, two things:

eemeli commented 4 years ago

I propose that it would be entirely valid to expect current package.json readers to fail more gracefully when being denied that opportunity by require() or require.resolve(). If they have configuration that they might be interested in reading from a package, being denied that opportunity should equate to that configuration not being defined in said package.json.

If, on the other hand, the package does include some configuration in package.json or elsewhere that might be considered a part of its public API, those paths should be expected to be included in the "exports".
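The graceful fallback proposed here could look roughly like the following. This is a minimal sketch, assuming a tool that reads a hypothetical `demoTool` field from a dependency's package.json; the helper name `readManifestIfExported` and the package names open-pkg and closed-pkg are likewise made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Hypothetical helper: load pkg/package.json through the public interface and
// treat "not exported" as "this package publishes no configuration".
function readManifestIfExported(requireFn, packageName) {
  try {
    return requireFn(`${packageName}/package.json`);
  } catch (err) {
    if (err.code === 'ERR_PACKAGE_PATH_NOT_EXPORTED') return null;
    throw err; // any other failure is still a real error
  }
}

// Demo setup: one package exports its manifest, the other does not.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'fallback-demo-'));
function writePackage(name, exportsMap) {
  const dir = path.join(projectDir, 'node_modules', name);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), 'module.exports = {};\n');
  fs.writeFileSync(
    path.join(dir, 'package.json'),
    JSON.stringify({ name, version: '1.0.0', exports: exportsMap, demoTool: { enabled: true } })
  );
}
writePackage('open-pkg', { '.': './index.js', './package.json': './package.json' });
writePackage('closed-pkg', { '.': './index.js' });

const projectRequire = createRequire(path.join(projectDir, 'index.js'));
const openManifest = readManifestIfExported(projectRequire, 'open-pkg');
const closedManifest = readManifestIfExported(projectRequire, 'closed-pkg');
console.log(openManifest && openManifest.demoTool, closedManifest);
```

Under this design a package opts in to exposing its configuration by listing "./package.json" in "exports", and the tool silently falls back to its defaults for every package that does not.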

wesleytodd commented 4 years ago

There aren't that many tools that would need to update.

This is an optimistic take IMO. Even if we assume that it is relatively small, it still would take all the users to upgrade to versions which support the new behavior. This will be a long migration no matter which way you slice it.

Creating an exception for package.json would enshrine what some on this thread consider a bad pattern

The package.json has had a special position, and will continue to for as long as I care to look into the future. We have "enshrined" the package.json as the required place for packages to put the required metadata in many other ways; this will not make it any more enshrined than it already is.

Secondly, there is a big difference between "loading package metadata" and "loading random configuration data". The first is required for tooling, and works great today. The second is an unintended side effect, which has great ergonomic advantages, so package authors have decided to leverage this. In one fell swoop, this addition will take away both of these features. I agree that arbitrary additions to package.json can have issues, but purism of avoiding an exception is pointless when this feature gives so much value to the ecosystem. And loading package metadata is just a requirement; this should be something we codify/enshrine, as it is a very important tooling feature.

MylesBorins commented 4 years ago

Haven't had a chance to read everything but wanted to note that package.json not being included in exports is already documented

https://github.com/nodejs/node/blob/master/doc/api/esm.md#package-entry-points

One thing we could do as a start would be make a specialized error message.

MylesBorins commented 4 years ago

oops, butter fingers. sorry for the close

ctavan commented 4 years ago

For reference the documentation was updated in this commit: https://github.com/nodejs/node/commit/1ffd182264dcf02e010aae3dc88406c2db9efcfb

From my perspective as the reporter of this issue the relevant new part (not yet released on https://nodejs.org/dist/latest-v14.x/docs/api/esm.html#esm_package_entry_points, @MylesBorins I'm sorry I only checked the released docs, not master) covers the documentation part of this issue 👍 :

Warning: Introducing the "exports" field prevents consumers of a package from using any entry points that are not defined, including the package.json (e.g. require('your-package/package.json')). This will likely be a breaking change.

A special error message would have helped me personally since, as explained in the original issue description, I thought I was well aware of the new exports behavior but somehow considered package.json meta information and assumed the exports restrictions wouldn't apply…

farwayer commented 4 years ago

Haha, I'm happy that I am not the author of any bundlers or tools that need information from package.json, and that I don't need to think about how to resolve the path to it with the new exports behavior. As I said in one of the issues above: module authors will not include package.json in exports.

ljharb commented 4 years ago

@farwayer i have 250+ modules, and although only a few have "exports" so far, every single one of them will include package.json in its exports.

wesleytodd commented 4 years ago

every single one of them will include package.json in its exports.

To me this is exactly the problem. If all responsible module authors have to add it, then what is the point of not making it baked in?

ljharb commented 4 years ago

Note that I won't be doing it because I have to necessarily, but because I don't see the point of not exposing it.

farwayer commented 4 years ago

@ljharb You are a good module author, you read the nodejs docs and think about tool developers :wink: Many others will not think about exporting it. package.json was the only stable point. When exports becomes widespread we can't depend on it any more.

I like your idea about a method for resolving the package path. But I have another idea. What do you think about adding a new universal API for providing all package meta information? path, type, exports to start with, maybe name, version, description and license if available? It can be super useful for tools because it would let them avoid reading the package.json file directly if no extra info is needed. And it can be extended in the future.

mcollina commented 4 years ago

I think the best path forward is to always publish package.json as it includes important information about the current module. In that way, we won’t need to add new APIs to core, and most things would work as is.

(I just stumbled upon this, and it seems a pretty valid concern, I think we should change it).

Slayer95 commented 4 years ago

Any chance of working with package managers to update the behavior of package.json generators?

For example, when the interactive npm init command is run, the automatically generated package.json file would contain not only a main entry, but also an exports entry exposing the package metadata.

{
  "name": "test",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "exports": {
    ".": "./index.js",
    "./package.json": "./package.json"
  },
  "scripts": {
    "test": "npm test"
  },
  "author": "",
  "license": "ISC"
}

Then there would be no magic whatsoever from Node.js side, yet the ecosystem would get their preferred defaults. Getting tooling to solve the problems of tooling sounds fair, right?