Closed GeoffreyBooth closed 1 month ago
Can you just comment the proposal here? This repo only exists to document what already exists; it doesn't make sense to make a PR until that's the case.
I agree and think we should write this proposal up as prose first. I think we set the initial scope to be "documentation only" as a way to take off a good split task that was not huge scope. Technically this repo is not even for that documentation, this one is: https://github.com/openjs-foundation/package-json-research
Okay, how about something like this. We could use the name devEngines
and it would be defined with this schema:
interface DevEngines {
runtime?: DevEngineDependency | DevEngineDependency[];
packageManager?: DevEngineDependency | DevEngineDependency[];
}
interface DevEngineDependency {
name: string;
version?: string;
os?: string[];
cpu?: string[];
onFail?: 'ignore' | 'warn' | 'error' | 'download';
download?: {
url: string;
hash: string;
}
}
Both runtime
and packageManager
sub-fields could either be an object or an array of objects, if the user wants to define multiple acceptable runtimes or package managers. The first acceptable option would be used, and onFail
would be triggered for the last defined option if none validate. Validation would check the version ranges, OS and CPU constraints.
The runtime.name
field would be a string from the list of WinterCG Runtime Keys. The packageManager.name
field . . . I don’t know, we could define the list as part of the spec, or perhaps it would need to correspond to an npm registry package name. Suggestions welcome.
The version
field would match the definition of the value of engines.node
, so something like ">= 16.0.0 < 22"
or ">= 20"
. If unspecified, any version matches.
The os
and cpu
fields would match their correspondents in https://docs.npmjs.com/cli/v10/configuring-npm/package-json#os and https://docs.npmjs.com/cli/v10/configuring-npm/package-json#cpu. If unspecified, any OS or CPU matches.
The download
field would apply only for onFail: 'download'
. I assume no one would support for runtime
. It could also be supported on a case-by-case basis, like perhaps Yarn and pnpm could support downloading a satisfactory version while npm would error.
I think this covers all the ideas from https://github.com/nodejs/node/issues/51888#issuecomment-1967102442 and https://github.com/nodejs/node/issues/51888#issuecomment-1967141092. cc @wraithgar
onFail
is a user directive, not a maintainer directive. It would be unwise to let the maintainers dictate that behavior. A good historic example of why this is not advisable is engines
itself. There are many packages out there that specify an engines that looks something like node: "^8.0.0"
. The maintainer likely had every intention of keeping that up to date but now that package doesn't work in node 20. If they also had been able to specify "fail" then the package would stop working, even if it works fine. Because packages are idempotent this can't be fixed. The end user is ultimately the only one who can make the decision on whether or not this constraint is a blocker.
onFail
is a user directive, not a maintainer directive.
But in this case, the user is the maintainer. This whole configuration block only applies when it’s the developer of the package/project.
How does download
even work with a semver range? If these are all npm registry packages then there is already a well defined discovery mechanism for downloading a given named package at a given version.
I'd love to hear from yarn or pnpm if and how they'd plan on handling the failure state where the user has indicated they want to download the correct version. How do you pick the version? Do you spawn it yourself? Why not fail and let the user remediate?
But in this case, the user is the maintainer
Good point. As you can see the lines between "installing as module" and "interacting as app" are so blurry that even I missed that.
How does
download
even work with a semver range?
If a URL is provided, we’re just downloading that exact URL. The version constraint wouldn’t apply.
If we want to allow onFail: 'download'
and a missing download
field, the only behavior I can think of would be to query the npm registry for the requested package manager and download the newest version that fits within the range.
Yeah to be clear all of this is for the maintainer. We should include that this proposal would not effect any install behavior other than for the top level project.
I moved my draft into the top post to make it easier to find.
This is a great start. I think it's important that it not just be limited to the runtime and the package manager - process.versions
has a lot of things in it that would be really nice to also constrain in various cases, especially openssl which has lots of CVEs.
Tagging relevant people to review the draft spec: @Ethan-Arrowood @arcanis @zkochan @aduh95
I am not very good at reading specs like this. What would it look like in implementation to limit the local dev environment to a given os or cpu for example? From what I see it doesn't look like that field is sufficiently decoupled from "named" which doesn't make sense in that context.
packageManager
is already used in the wild as a string, we'd need to define what happens when the value is a string to allow a smooth transition – another option is to use a different key, but packageManager
is quite good a describing what it is.
but packageManager is quite good a describing what it is .... is already used in the wild as a string
This proposal covers way more than package manager and would not be under that field name (although it is not mentioned above). I think that should have always been the goal, so that is why keeping the field "experimental" in node core was important. I agree we would want to smooth the path. But I think that is up to the folks who implemented the field in their tools. Maybe here we would just discuss goals for ways to make that transition?
The first step should be figuring out (such that they can be clearly documented and conveyed) the problems we're solving, and the goals and non goals - and then to sketch out a solution that seems viable to all stakeholders.
At that point, those interested in the migration path from the experimental packageManager
field to the new thing can explore the best way to achieve that - but until the solution design is settled I'm not sure how any progress could be made on migration.
Cross-posting from Slack:
I’ve attempted to read all of the various GitHub and Twitter threads that have led us here. I think there is a lot going on in general, and I don’t have a clear, concise reply just yet. For now, all I will say is that I encourage all invested parties to join the monthly call on March 5th. We rarely have much to discuss regarding the group’s active work streams, so we can reasonably spend the entire hour discussing the corepack issue.
For those looking for the upcoming meeting invite, please copy the invite from the OpenJS Calendar
What would it look like in implementation to limit the local dev environment to a given os or cpu for example?
I added those fields because you had mentioned them on the other thread. I'm not sure what the use case would be; run npm on Windows and Bun on macOS? I can just remove them if you can't think of a use case either. We can always add fields later.
I think that os/cpu etc are in the category of "general engines worth explicitly giving a range for", and altho "runtime" and "package manager" are reasonable to call special, the schema needs a mostly free-for-all dumping ground too (like engines
, import.meta
, etc are all now) for things like this.
Yeah, remember we're defining a spec for all development runtime constraints, not just package manager. Package manager is a subset of constraints. Node version, os, cpu, and libc are other constraints.
Why would you allow specifying multiple different package managers? I assume there are still people in this group that are strongly against lockfiles. But at least don't motivate devs to use different package managers on a project. It will make everyone's life a lot harder.
Why would you allow specifying multiple different package managers?
My guess is this is done to remain flexible, not make a recommendation that folks should. That said, I could think of a few cases where maybe you want to use pnpm
for local dev but npm
for publish or something of that nature?
I assume there are still people in this group that are strongly against lockfiles.
I don't think that would be a driving part of the decision is it? I assume you are meaning that if a package declared multiple dev package managers they would then have to resolve the lockfile differences? I think at this layer of abstraction (a spec for the package.json) we would not want to exclude package managers from agreeing on a shared lockfile format (in fact we might want to encourage that 😎) so I think it is reasonable to keep.
How about changing “hash” into two fields: “algorithm” and “digest”?
How about changing “hash” into two fields: “algorithm” and “digest”?
That’s fine, what would be an example? This?
"download": {
"url": "https://registry.npmjs.org/@yarnpkg/cli-dist/-/cli-dist-3.2.3.tgz",
"algorithm": "sha224",
"digest": "16a0797d1710d1fb7ec40ab5c3801b68370a612a9b66ba117ad9924b"
}
algorithm
and digest
should also be optional (but encouraged for the implementing tools to add/update them).
What's the "hash" currently? In npm lockfiles it's the output of ssri.stringify
which includes both algorithm and digest
For folks who need it (me) here is what ssri
references: https://www.npmjs.com/package/ssri
EDIT: I should have added, this is an implementation of the subresource integrity spec used in browsers as well: https://w3c.github.io/webappsec-subresource-integrity/
So with that in mind, I would say that seems like it could be a good spec to just directly here.
My example came from https://github.com/nodejs/corepack#when-authoring-packages. Per https://github.com/zkat/ssri/blob/latest/README.md#--ssriparsesri-opts—integrity, you can run ssri.parse(sri)
where sri
is either a combined string like sha224.953c8233f7a92884eee2de69a1b92d1f2ec1655e66d08071ba9a02fa
or an object like { algorithm: 'sha224', digest: '953c8233f7a92884eee2de69a1b92d1f2ec1655e66d08071ba9a02fa' }
Yeah I would propose we directly commit to supporting exactly what this ssri
package implements. So if there is a call out we can reference great, if not we can just make the original proposal reflect that it supports both formats.
I think this is probably close enough that PRs could be started to implement this? I think one each for npm and Corepack, and the implementation process should shake out any remaining questions. And if other package managers want to read the field directly too, similar to how npm will, they can always add support.
For npm, there’s https://github.com/npm/cli/pull/7253 which I think could be updated to support the above schema?
@styfle, do you think this is ready for Corepack to implement? You’d have to decide what it should do when both this field and packageManager
are present (like either have the new one take precedence, or maybe error only when they have download definitions that mismatch). Is this something you want to take on?
I think we are confusing fields a little too much still. Things that are relevant to packageManager are usually only relevant in that context. The other things generally don't affect that one.
How about something that builds off of engines
, wraps the os
etc back up into it, and gives packageManager
a proper namespace instead of trying to put them in the root object indexed by names (like we do now with engines.npm)?
interface DevEngines {
os?: string[];
cpu?: string[];
libc?: string[];
node? string;
packageManager?: DevEngineDependency | DevEngineDependency[];
}
interface DevEngineDependency {
name: string;
version?: string;
onFail?: 'ignore' | 'warn' | 'error' | 'download';
download?: {
url: string;
algorithm?: string;
digest?: string;
}
}
The root os
, libc
, etc and engines
would be (eventually) only evaluated at install time.
@wraithgar Re the new proposal, I don’t think it’s an improvement over what we have for a few reasons:
node
is not an improvement over runtimes
. Developers should be able to define what runtimes should be used for developing their projects, so that a Deno project isn’t run by Node etc.runtimes
field, it follows that os
and cpu
(and potentially libc
, but I didn’t include that yet) would go under runtimes
. Since certain runtimes don’t run on all platforms, like Bun doesn’t currently run on Windows and I assume Cloudflare Workers probably don’t either, etc., it’s very conceivable that a developer might define something like “Run in workerd
on Mac/Linux, in node
on Windows.”os
and cpu
under packageManager
; they’re there for consistency. But if you want to remove them and we make runtimes
defined separately from packageManager
, that would be okay.download
under runtimes
. A version manager like nvm
or n
or Volta or the like could choose to add support for this field and use it as a standardized place to define what runtime version to use, rather than each tool having their own proprietary configuration.It doesn't seem like a good idea to allow for arbitrary downloads from "not an npm registry". It's fine if an error message includes a URL but automatic downloads are going to raise a lot of security eyebrows (as they already do on corepack and npx).
I agree with all that @GeoffreyBooth posted.
It doesn't seem like a good idea to allow for arbitrary downloads from "not an npm registry".
I don't think we should dictate this and focus on the form and not the content/implementation. If a tool wants to support other things, fine. FWIW, I could see a really strong argument for companies downloading from an assortment of internal systems.
I could see a really strong argument for companies downloading from an assortment of internal systems.
I also don’t anticipate npm supporting onFail: 'download'
at all, so it’s not relevant for the npm implementation, which should be the first one.
@wesleytodd would the internal systems being "artifactory or similar" really be a barrier?
would the internal systems being "artifactory or similar" really be a barrier?
No, but I also don't see value in that restriction at this level. I think saying the contract is a tarball at a url then leave it to the tool to decide what it supports doing with that tarball. Which is really all that the registry api is providing anyway. I mainly just think this should be left up to tools implementing against this and not the proposal itself.
Actually, maybe even the contract is not "a tarball at a location" and is more "a string with meaning the tool understands". Which is even lower bar for what we "define as the spec".
EDIT: Probably an early send on that. My thinking for that was if a runtime manager wanted to support npm registry specs then they should be able to do that while keeping the same api contract in package.json
. Or am I missing something on this idea?
I totally agree that this spec shouldn't mandate anything but a string, for the reasons indicated - i'm suggesting that perhaps in v1 it shouldn't even have an implication that it's a URL - even though an implementation can do whatever they want with "a string".
The package metadata interop group meets March 5th I believe. This should be discussed there, and after that the npm team can work on an implementation. Likely using the existing devEngines
PR as a reference point.
I still don't understand how "os" is any less of a "runtime" directive than "package manager" but it's not an issue I'm willing to push back THAT hard on. I can remain confused if I'm the only one confused.
Also, as someone who has had to write the code to parse engines, os, etc I would humbly ask that we go ahead and bake in that these fields that can be arrays are always arrays. It's a lot easier to parse.
Also, as someone who has had to write the code to parse engines, os, etc I would humbly ask that we go ahead and bake in that these fields that can be arrays are always arrays. It’s a lot easier to parse.
That’s already the case for os
and cpu
. I think for the top-level runtimes
and packageManager
fields we really should support either an object or an array of objects. I think the vast majority of projects will want to define a single development runtime and a single development package manager, so we should add a quick Array.isArray
check to allow them to do so.
@zkochan Does pnpm have any thoughts on the proposed new package.json
field described in the top comment? Does this seem like something that, if npm added support for, pnpm would do the same? Are there any concerns about the spec or missing functionality that you would want added?
I prefer the packageManager
field. I don't think this new field is needed. My only problem with packageManager
is that it only allows exact versions.
In any case, I know I don't have influence in these decisions. If npm will use this field, pnpm will probably support packageManager
and the new field too.
FWIW I see no reason why the two fields could not cohabitate (if not indefinitely, at least for a while). As far as Corepack is concerned, it will likely defer to the package managers the parsing of devEngines
and turn off its "strict mode" by default (the feature that emits an error if someone tries to run e.g. pnpm i
on a project that has a packageManager
field referencing a different package manager). Just my personal opinion of course.
In any case, I know I don’t have influence in these decisions. If npm will use this field, pnpm will probably support
packageManager
and the new field too.
We’re inviting you to have influence 😄 This is the place.
npm also wants version ranges. While we could expand packageManager
to allow defining ranges, the npm team wants some other features such as specifying whether an out-of-range validation should error or warn. This is a bit much to squeeze into a string value, hence the proposal for a new field that takes an object. The idea is that this field could contain everything that all consumers could want; npm wouldn’t implement the download parts, and others could choose to ignore the warning
aspect if they always want to error, etc. The new field would be more flexible to handle additional use cases that will inevitably come up.
Yeah you have a lot of input and value in this conversation! That said, I would like to probe on why you prefer the former? Do you dislike the idea of this being used for node version specification as well? If so I am interested in your feedback on that.
Is there a shorthand we could align on? It seems like we're on the right path with the primitives but I doubt anyone would write these by hand (especially download
or onFail
) as they'd be prone to error/misconfiguration. I threw together a quick example of how I imagine this but it would be good to get a sanity check:
{
"devEngines": {
"cpu": "ia32",
"os": "linux",
"libc": "glibc",
"runtime": "node@>10",
"packageManager": "pnpm@latest"
}
}
The above assumes that runtime
& packageManager
values could reference any valid spec which can be parsed by something like npm
's npm-package-arg
& names would be associated with the configured registry by default (just like dependencies). This would mean that node
& bun
would be supported out the gate (because there are distributions for those in the widely used npm public registry today) but you could still define any url or alias for runtime
or packageManager
as a source/download.
Also, do I understand it correctly that all of these fields can/should be Array
s(?). I think that means that runtime
& packageManager
could be a String
, Array
or Object
(based on the schema in the top comment). I guess this would be helpful for scenarios like the one I mentioned here & folks may want to use more then one "package manager" in their projects or support more then one "runtime" for testing 🤷🏼♂️.
{
"devEngines": {
"cpu": ["ia32"],
"os": ["linux"],
"libc": ["glibc"],
"runtime": ["node@20", "bun@1"],
"packageManager": ["pnpm@latest", "npm@latest"]
}
}
The most likely of all scenarios though is that devs will just define a single runtime
& packageManager
(no different then how they do with engines.node
or engines.npm
today) so that should be as simple & straightforward as possible...
{
"devEngines": {
"runtime": "node@20",
"packageManager": "npm@10"
}
}
I'm +1 for arrays by default. Not a huge issue IMO to make even for the simple case:
{
"devEngines": {
"runtime": ["node@20"],
"packageManager": ["npm@10"]
}
}
This is a draft spec for https://github.com/nodejs/node/issues/51888#issuecomment-1967102442, to define the runtime and package manager for developing a project or package (not consuming/installing it as a dependency in another project). We could use the name
devEngines
and it would be a new top-level field defined with this schema:The
os
,cpu
,libc
,runtime
, andpackageManager
sub-fields could either be an object or an array of objects, if the user wants to define multiple acceptable OSes, CPUs, C compilers, runtimes or package managers. The first acceptable option would be used, andonFail
would be triggered for the last defined option if none validate. If unspecified,onFail
defaults toerror
for the non-array notation; or it defaults toerror
for the last element in the array andignore
for all prior elements, for the array notation. Validation would check the name and version ranges.The
name
field would be a string, corresponding to different sources depending on the parent field:os
: NPM docs forengines.os
.cpu
: NPM docs forengines.cpu
.libc
:glibc
ormusl
; see NPM docs forconfig.libc
.runtime
: WinterCG Runtime Keys.packageManager
. . . I don’t know, we could define the list as part of the spec, or perhaps it would need to correspond to an npm registry package name. Suggestions welcome.The
version
field syntax would match that defined forengines.node
, so something like">= 16.0.0 < 22"
or">= 20"
. If unspecified, any version matches.The
onFail
field defines what should happen if validation fails:ignore
: nothing.warn
: print something and continue.error
: print something and exit.download
: remediate the validation failure by downloading the requested tool/version.In the event of
onFail: 'download'
, it would be the responsibility of the tool to determine what and how to download, perhaps by looking in the tool’s associated lockfile for a specific version and integrity hash. It could also be supported on a case-by-case basis, like perhaps Yarn and pnpm could support downloading a satisfactory version while npm would error.Typical example:
“Uses every possible field” example:
Some potential future expansions of this spec that have been discussed are:
runtime
andpackageManager
might take shorthand string values defining the desired name or name with version/version range.