npm / npm

This repository is moving to: https://github.com/npm/cli
http://npm.community
17.54k stars 3.02k forks source link

RFC: New npm link commands #19421

Closed iarna closed 6 years ago

iarna commented 6 years ago

Refreshing the npm link command

This is a breaking change due to the all new npm link, the splitting of npm unlink from npm rm and the removal of npm install --link.

PROBLEM

The introduction of lock files and a synchronization workflow in npm@5 broke certain assumptions that the npm link and npm install --link commands took advantage of. They expected that they could replace a real module with a symlink to one of the same version and npm install would leave their symlink alone.

Starting with npm@5, any symlink is expected to be recorded in the package.json or package-lock.json as a file: type dependency.

The result is that any run of npm install after creating a symlink with npm link or npm install --link would result in the symlink being removed and a copy of the module installed in the usual way.

PROPOSAL

USE CASES FOR DEVELOPMENT ONLY LINKING

When fixing a bug in a dependency or transitive dependency, it is often desirable to be able to test it directly in the project that consumes it. With npm link you can create a symlink to the under-development dependency while continuing to otherwise work with your module normally. By persisting this to your package-lock.json it allows this decision to be branch specific

When working with a project with a number of tightly coupled dependencies it is often desirable to work on all of them simultaneously. npm link provides a facility for symlinking them and keeping them symlinked and the package-lock.json persistence means that new member of the project can get this configuration with little ceremony.

PACKAGE-LOCK

package-lock.json files have a new lockfile property on dependency objects: link

The value of link is a path relative to the package-lock.

CHANGES TO COMMANDS

npm link

With no subcommand, it is an alias for npm link create. With an invalid subcommand the error message should note the change and behavior and point the user at npm help link.

npm link create

Create all development links referenced in the package-lock.json. Failures due to missing link sources or missing package data (or invalid package data) are WARNINGs that result in no link being created. Version mismatches in linked modules are also warnings but they still create a link.

With no development links in package-lock.json warn with usage, but exit without error code.

npm link create module-name[@version-spec]

Search the package-lock.json for modules matching module-name and optionally version-spec, create links for those. If no modules match, exit with an error.

npm link clear

Remove all development links from your node_modules. This does not modify your package-lock, it just undoes what npm link create does.

npm link clear module-name[@version-spec]

Remove development links for modules matching the module-name and if provided the version-spec.

npm link add /path/to/module-name

npm link add /path/to/module-name version-spec

Create a new development link in your project for the module contained in /path/to/module-name. If a version specifier is included (eg ^1.0.0) then only modules that match it are replaced with a link. If no version specifier is included then ALL modules with the same name are replaced.

Links are saved in new link field on the dependency record they replace in your package-lock.json. The dependencies in the package-lock.json do not take into account any development links. Linked dependencies have their own separate dependency tree installed inside their own node_modules and their own package-lock.json.

npm link remove module-name

npm link remove /path/to/module-name

(Aliases: npm link rm, npm unlink)

Remove an existing development link from your project. This does not remove module-name—it removes the link field from the dependency record and replaces any symlinks to module-name with a normally installed copy of it.

npm install

npm install --link is no longer be a valid option.

npm install module-name@latest where module-name is currently a development link, will update your package.json and package-lock.json but will not do anything on disk. It warns that the link was left in place.

npm outdated

npm update

Outdated and update run against your package-lock.json and thus use the unlinked versions when determining what needs doing. Because npm update is built on top of npm install it gets the "don't clobber links" behavior automatically.

npm ls

Listing your install tree reports development linked dependencies as fulfilling their requirements. In the case where the version of the development linked dependency doesn't match the version in your package.json they are flagged visually.

npm pack and npm publish

npm pack and npm publish fail if any bundled dependencies are currently installed as a development link.

npm unlink

Is an alias for npm link clear. (Previously this was an alias for npm rm.)

CAVEATS

This specification of the new npm link command provides no facility for replacing a specific transitive instance of a dependency. Concretely:

If you depend on module-a and module-b and both of them depend on the same version of module-shared, there is NO facility for linking module-a's module-shared without ALSO linking module-b's module-shared. It SHOULD be possible to build a stand alone tool to support this, but it's out of scope for the initial npm link implementation.

ghost commented 6 years ago

What happens if I do npm link add /path/to/module-name but I have package locks disabled? Will npm install still delete those symlinks?

Overall I don't like how many more commands there are, but I can live with only using npm link add instead of making symlinks myself (which is what I do now, since I'm using a patched version of npm that doesn't delete symlinks and has auto-pruning of unused dependencies disabled).

I would prefer if I could configure npm to not delete any files. Then I can manage them myself using existing npm commands and system tools. I don't like having to memorize new ways of doing rm or ln -s. It feels frustrating to know what commands I could issue to change the contents of node_modules, but then have to reverse engineer the combination of npm commands (which seem to change somewhat over time) I have to use to achieve the outcome I have in mind so that npm doesn't blow away the changes I made to the file system.

alopezsanchez commented 6 years ago

Hello.

This is such a good approach! I have some questions/doubts:

Appart from this. Is there a roadmap for this development? Can we contribute?

iarna commented 6 years ago

@substack

npm@5 has a model that:

Syncs your package.json to a package-lock (possibly just in memory) and then syncs that to your node-modules.

That is, your package.json is the authoritative source for what should be installed and your package-lock is the exactly reflection of what we'll put in your node-modules. This let's us do things like verify hashes end-to-end, fixup very very broken trees, etc.

The problem we encountered is that symlinks became first class with npm@5 in the form of file: specifiers. This meant that you could save them to your package.json. As such, it needed to be able to know when a link was obsolete so it could remove them. You don't encounter issues with deleteing this because you don't use file: specifiers.

The solution presented here makes note of the symlinks in the package-lock so we know to keep them around. If you don't have package-locks enabled then the commands as outlined will refuse to run (because they'd be useless).

I'd like to better understand why you disable package locks? If you don't check them in they lock your node-modules down exactly as much as leaving your node_modules folder in place between runs of npm and no more, so I'm curious as to why they're a not an acceptable part of the solution?

iarna commented 6 years ago

@alopezsanchez

  • I don't understand why the npm link create alias could be useful. Is not it more intuitive just npm link/npm link module-name[@version-spec]? Maybe I'm understanding it wrong...

The main reason is to let us tell users of earlier versions of npm link about the behavior change. I don't want a command to appear to succeed but do something entirely different.

  • npm link remove module-name removes the corresponding field from the package-lock, right? Maybe you can be more specific with that.

Yes, that's right. I'll do that!

Appart from this. Is there a roadmap for this development? Can we contribute?

Once we've addressed concerns and worked through this RFC, I expect to be beginning development. The starting point for that, and it's one that folks could start on now, is to write spec compliance tests. If you're interested in that drop by #npm on https://package.community/ and I'd be happy to chat with you more about it.

felixfbecker commented 6 years ago

npm@5 doesn't really function properly w/o a package-lock.

lockfiles are awesome for applications, but I know a lot of open source libraries that .gitignore package-lock.json or add a .npmrc with package-lock=false, and there are good arguments that can be made in favour of doing so.

When the consumer npm installs the project, they will receive the latest dependencies, not the ones in the package-lock. So you want to avoid a situation where your build tells you everything is fine but really that dependency your PR just added breaks downstream consumers because they receive a completely different node_modules structure than your build system got when testing.

(EDIT: Image removed, please refrain from using image macros/memes in the issue tracker) Figure 1: Travis with the state of downstream consumers in the background

Yes, reproducible builds is a nice goal, but FWIW package.json always specifies a minimum configuration needed that is guaranteed to work if you need it. Yes, tools like Greenkeeper/Renovate can update the package-lock for you, but they can just as well notify you of a broken build and pin versions automatically in package.json (which actually fixes the problem for consumers), and usually the rare chance of a broken build is not worth the maintenance effort of constantly merging automatic PRs to update package-locks. It's all a trade-off of how you wanna spend your time as a maintainer of many OS projects.

Imo everything in npm should work whether a lockfile is present on disk or not (even if it has to be regenerated in-memory).

iarna commented 6 years ago

@felixfbecker Not having a package-lock but keeping your node_modules folder is exactly the same as just having the package-lock. Ditching the package-lock gives you no benefits. If you'd have rm -rf node_modules to clear deps previously then rm -rf node_modules package-lock.json will give you exactly the same effect.

So given that, can you explain what the benefits are of not having a package-lock?

(I'm open to suggestions on how to make arbitrary links work w/o a package-lock, but changing how file: works again isn't on the table, that ship sailed a year ago.)

iarna commented 6 years ago

(Please note: I get why you might not want to check a package-lock in. What I'm trying to understand is why you wouldn't generate it at all, which is a very different kettle of fish.)

felixfbecker commented 6 years ago

@iarna not sure what you mean by "keeping your node_modules folder", in CI I'd only cache ~/.npm between builds, never node_modules, since that won't work well if packages do postinstall things that differ between node versions or just native addon things.

As to gitignoring vs npmrc-disabling package-lock, not sure. npmrc feels more close to what happens in CI I guess. So you don't have to nuke both package-lock and node_modules when you want to replicate what a consumer would see. But good point that npm install will work the same with or without, I guess there is no real advantage to npmrc then.

ghost commented 6 years ago

That is, your package.json is the authoritative source for what should be installed and your package-lock is the exactly reflection of what we'll put in your node-modules. This let's us do things like verify hashes end-to-end, fixup very very broken trees, etc.

I don't want package.json to be an authoritative source of what should be in node_modules. Often, I need to install something quickly for debugging or experimenting and I don't want to accidentally forget to remove those dependencies from package.json later.

I don't use package-lock because I don't want to deal with the problems that checking a machine-generated and machine-modified file into version control generates. I also don't want other people to necessarily get the exact same versions of dependencies, I want npm to respect semver instead. I also don't like having yet another file around.

Most of the new features in npm 5 feel like they are made to support workflows that I don't personally use or like, which would be fine, except that many of these changes broke existing workflows which were simpler and it was easier to directly achieve outcomes without also making changes to other files. I liked how in previous versions you would have to explicitly tell npm install to --save a change into package.json. npm also wouldn't automatically prune dependencies, some of which you might have been using but had good reasons not to track in package.json. Many of these changes have taken commands that did single things like installing a package and have added additional, sometimes unwanted extra features, like how npm install ALSO will prune your dependency graph, whether you asked it to or not.

iarna commented 6 years ago

@substack "checking a machine-generated and machine-modified file into version control generates" who said anything about checking it in? Do you have any concerns about the file if you don't check it in?

ljharb commented 6 years ago

package-lock is explicitly disabled in every repo that i maintain that isn’t a top-level app, with prejudice. Any npm link replacement needs to work without a package-lock, imo. Only apps should have lockfiles - this isn’t something that really should be debated here, however; it should be sufficient that nonzero users prefer no package-lock to ever exist.

Having the file exist but not be checked in means that i have a different local experience than a dev who freshly clones, which is unacceptable. (Where “same experience” includes the desirable quality that an npm install at different moments in time does different things based on the state of the registry)

Perhaps a better place for this info is in package.json itself?

iarna commented 6 years ago

Relying on CI runs to detect issues in semver compatible updates to transitive dependencies is super ineffective.

Remember that CI only runs when you make changes to your code and in no way replicates what consumers see (as they might be installing this version six months from now). Or they might be doing it now but updating an existing copy and so get older but semver-compatible matches. Basically, I am of the opinion that holding on this is not a good technical choice.

Just to be clear, I view disabling the package-lock for any kind of project to be self destructive behavior. I acknowledge that not everyone agrees with me. =p

iarna commented 6 years ago

@ljharb

Perhaps a better place for this info is in package.json itself?

We discussed that… it gets weird because it means you may have transitive dependencies mentioned in your package.json. Does that seem ok to you?

ljharb commented 6 years ago

I would be completely content with that, yes - but also if top-level deps’ link info only lived in package.json, and transitive deps’ link info only lived in package-lock (thus refusing to link transitive deps without having a package-lock)

evocateur commented 6 years ago

I’m interested in the monorepo (lerna) implications of this. It sounds like a step forward toward the “golden path” of local sibling symlinks that transparently evaluate to semver ranges when published. (Not all the way, but with far less mangling of package.json files than I am currently envisioning with supporting npm5 file: specifiers in a way that doesn’t break downstream consumers).

I will noodle on this some more and try to sketch my thoughts into pseudo-RFCspeak tomorrow.

sompylasar commented 6 years ago

With this effort, can npm learn to publish to a local filesystem and install from there?

I.e. prepare (like whatever hooks happen prior to npm publish), package and publish ("upload" the build result to a temp dir like npm publish does), download, unpack, install ("download" to node_modules like npm install does).

I used to work with TypeScript packages which require compilation and non-built file cleanup prior to being usable as a dependency to another TypeScript package, so a symlink to the package source directory simply does not work the same way it would work if installed from a registry (a simple example: when another package builds, TypeScript finds all type definition files, and if they happen to be in the linked packages, this may cause conflicts).

I also made this tool to compare the locally built version with what was published to NPM, for that I primitively mimic the publish process and I'd rather use a built-in "publish to directory" command of npm.

I think these two features that are already implemented in core (as functions, but not as CLI) can solve the listed use cases with local development, and also would cover those that don't work with the symlink-to-source-directory approach.

Next thing to think about is hot-reloading that locally published package when the source files change (the linked directories will update naturally).

I'm open to discussion, and let me know if I'm missing something. Thank you.

n3dst4 commented 6 years ago

Is there a risk that someone will accidentally commit the link info in package-lock.json, when it’s really only relevant in their local computer?

What if you install an actual dependency while you have links active, and you want to commit the new dependency but not the temporary linkage?

dickensian commented 6 years ago

I concede that the existing functionality is neutered, however I recommend observing the interface segregation principle and instead of modifying the behaviour of link (some older projects still use older versions of npm), create a new set of commands with the new desired behaviour and begin the process to deprecate ‘link’ as it was.

felixfbecker commented 6 years ago

@iarna

Relying on CI runs to detect issues in semver compatible updates to transitive dependencies is super ineffective.

Remember that CI only runs when you make changes to your code and in no way replicates what consumers see (as they might be installing this version six months from now). Or they might be doing it now but updating an existing copy and so get older but semver-compatible matches. Basically, I am of the opinion that holding on this is not a good technical choice.

There are easy solutions for this. You can set up an automatic nightly build of master in Travis with a few clicks (Circle, Buildkite etc also all support this). Greenkeeper also runs a build on every new dependency update and will create a GitHub issue like "An in-range update of foo is breaking the build" with a one-click option to pin the dependency. As said, it's all a trade-off.

trygve-lie commented 6 years ago

Regarding the package-lock file; we've started ignoring them in modules too (not top level apps). Main reason for doing so is merge conflicts to be honest.

But; to stay on topic:

As I understand it from your RFC, npm ls will list all dependencies and which one are linked. Thats nice, but I would also like to be able to list out only linked modules. npm link ls or something.

It would also be nice to have some kind of notice if a module have a linked dependency. Preferly on npm install and perhaps on the npm run ... commands.

chriseppstein commented 6 years ago

This proposal wasn't really what I was expecting to see as a fix and it doesn't feel right to me. Here's my initial concerns:

  1. Maybe for a monorepo it would make sense to commit a link resolution to package-lock.json. but for the other workflows I use npm link for (which or far more common), this is never desired. It's going to show a file modification when I do git status. It's going to result in mistaken commits with paths that don't exist on our colleagues' machines. Ultimately, it feels weird to mix the concerns for transient development state with a file that is intended to represent the reproducible state for other devs when they do their install. possible solution: a package-lock.local.json file that is deep merged with package-lock.json so that resolution information that is purely local is kept separate and can be added to .gitignore.
  2. What happens when the version changes in a linked package? When is it noticed? What is the expected behavior if the new version causes it to no longer match the dependency spec of the consuming package? will npm update the symlink to an install? I suspect the answers to these are somewhat implementation dependent -- but I think it would be good to be intentional about this in the spec.
  3. I understand that this is carved out as a caveat, but I want to push back against that decision. The failure to have consistency for transitive dependencies is a mistake, imo. If the dev for an app A wants to do development and integration testing for a packages P and Q, both of which are direct dependencies of A, the ideal workflow is going to A and link P and Q to it. But if P has a dependency on Q, as I understand this RFC, they will also need to go to P and link Q to it. Worse, if P has a transitive dependency on Q and these are just casual contributions to projects P and Q, it maybe completely unnoticed that a second link command is needed. I can imagine a developer losing a lot of time to strange issues resulting from this until they ultimately realize the issue. At a minimum, the link commands ran on application A should surface these inconsistencies with a warning. But I wonder whether it's better for the resolution of A to be able to swap out dependencies in P on Q without P needing it's own link command. If I'm submitting several patches to multiple open source projects, ensuring that those packages land in a specific order is a level of complexity I'd rather not have to take on in most cases -- instead I want to be able to test that my changes to P work with the old version of Q as well as work with the new version of Q in an integration scenario (my app). This simulates what will happen when I install the new releases of P and Q and npm dedups Q and I get the newly released versions of both used by both in my app.
  4. Why can't npm support a simple ln -s: I think npm should do no harm when a basic symlink is encountered in any node_modules directory. That directory should be off limits to any mutation from the consuming package of the symlink. I experienced npm 5 removing files from a linked module's node_modules directory when it deduplicated shared dependencies. Then when I went to that module to run tests, it was broken. It's way better for that operation to fail if it must. There's existing tooling and build systems, etc that may use symlinks in their workflows and those should continue to work if they can or break explicitly. I don't see anything in this RFC that discusses safeguards for projects that have a symlink and no appropriately configured package-lock.json.
  5. linked dependencies have to be removed for the current package to be published? This would be incredibly inconvenient -- especially for multi-repo related packages. The motivation for this isn't provided in the RFC. I suspect this is because we want to make sure the package doesn't publish against unpublished code? I think there should, at the very least, be an override option that would allow the publish to proceed and any error message should tell the dev about that override option.
  6. Similar to point 3 above, but slightly different, what happens when I npm install a new dependency and that has a transitive dependency on a package that I currently have linked. I suspect that this will result in that new module using a non-linked version of the package. This would be pretty confusing and should be surfaced to the developer. My understanding is that npm link add is a point-in-time query against a current resolution and not an intent that is recorded and kept around as development continues -- but that intent is super important information and discarding it feels like a mistake.

Overall I feel like this design misses some very key workflows for node package developers (but maybe I'm just weird?), the use of package-lock.json as the single point of truth for what should be linked for a developer's workspace is, in my opinion, a mistake. I think that information should either go into package.json or some other location. That resolution information is, in many respects, a cache. I need to be able to nuke it and not lose valuable information (for instance, my experience is that the resolution gets locked to a repository. As a dev that has to do development against an internal repo for internal testing and then switch to the public repo for official releases -- nuking package-lock.json is the only way I've found for doing that).

I think package-lock.json gives a way to have different resolutions for different packages and for those resolutions to be conditionally activated depending on the developer's working directory when they are running tests, etc. This was a feature that we didn't have in npm <= 4 and it's actually very powerful, but I don't get the sense that it is being fully leveraged here. We've waited a long time to get a new version of npm link that is really nice -- I personally would rather wait a little longer and get a version that is really friendly, intuitive, and taking full advantage of the lock file.

iarna commented 6 years ago

@trygve-lie Did you know you can auto-resolve merge conflicts now? Give http://npmjs.com/package/npm-merge-driver a try. (It'll be shipping with npm sooon, but you can use it separately in advance.) It can be configured to do its magic for pnpm and yarn too and it eliminates the merge conflict entirely. (Which is different than other approaches that require you to run a command after getting one.)

npm link ls seems like a good idea—I'll add that.

And modules which declare their own links aren't installed when using them as dependencies. In fact, that data isn't even available as it's stored in package-lock.json which isn't published. If there was a shrinkwrap then it would be published, but would not be used.

iarna commented 6 years ago

@chriseppstein I'll come back to your other bits of feedback, but you're misreading the spec for #5: Links ONLY have to be removed IF you bundle the modules. Bundling modules is a very special case feature that is very rarely used. (It exists primarily for npm's use case, so you can bootstrap npm without already having it.)

iarna commented 6 years ago

@evocateur The next step on monorepos is to implement workspaces as specced by the yarn team. Once this is out the door I'll be writing up an npm-specific spec for how they might integrate with npm.

iarna commented 6 years ago

@n3dst4 It's intended that it be committed and shared. It does require some minimal coordination between teams as to where you place modules that you're editing, but not a lot given that they links are relative paths. Almost everyone has a folder that they clone modules into to work on, so links would almost always be ../module-name.

iarna commented 6 years ago

@felixfbecker You can also trivially setup your CI install with npm i --no-package-lock instead of npm i.

iarna commented 6 years ago

@trygve-lie Oh, I think I misunderstood you… by "if a module has a linked dependency" were you referring to the top level project? In which case, the spec already has it reporting that on npm install

ljharb commented 6 years ago

It seems like there might be some confusion; if a symlink is on disk, then what's on disk should override package.json - the source of truth for conflicts between a package.json and a non-npm-managed directory in node_modules is the disk.

I should be able to manually create a symlink and automatically have that considered by npm as a bundled dependency - ie, untouchable (without a --force flag, at least)

iarna commented 6 years ago

@ljharb Specifically what were you replying to? I don't know what you mean by "have that consider by npm as a bundled dependency - ie, untouchable"? I don't see how a bundled dependency could be described as untouchable? For the top level package, bundles are just a publishing rule. For dependencies, bundles source the package to the bundle (instead of the registry).

ljharb commented 6 years ago

Hm, my (admittedly untested) understanding of bundledDependencies is that it's a list of things npm isn't allowed to touch, with any command - not just publishing (altho the primary use case is publishing).

If that part distracts from my comment tho, please ignore it :-) it's more about the idea that files on disk that were not put there by npm, should never be touched by npm in any way (absent a --force flag, ofc).

iarna commented 6 years ago

Thanks for clarifying. I can't see npm ever following that model (tracking provenance of all installed items and ignoring one's not from npm), as updating from older verisons of npm (and from non-npm package managers) are both key usecases. It's also would be a substantial shift in behavior, matching no extant version of npm. (npm@2 would happily replace a direct dependency if it's version didn't match your package.json, even if you put it there separately.) It sounds like what you need is a new package manager. The good news on the front, for what it's worth, is that we're actively collecting the building blocks needed to put together custom installers that implement whatever logic you prefer.

ljharb commented 6 years ago

I'm not sure that's the case; it seems like prior to npm 5, these concerns didn't happen - symlinks (from npm link, generally) were not touched, existing directories whose package.json claims satisfied semver requirements were untouched, and extraneous dependencies were left alone.

giltayar commented 6 years ago

I believe most of the problem people are having with the new npm link is because there are actually two kinds of use cases for npm link.

  1. Mono-repo style, where you want the npm link to be persistent, and where it makes sense to use package-lock.json. (and I love how you added that the link info there! It makes total sense.)
  2. Ephemeral style, where you want to npm link elsewhere, just for trying things out. This can be to a package that resides in a totally different git repo. This is where you must not persist this link in package-lock.json as it is relevant only for that certain developer on that certain computer, and therefore should not be commited to source control. This is also relevant for people not using package-lock.json.

This RFC deals with the first kind (in a really nice way), but ignores the other use case. For myself, most of my need is in the first use-case (I am a heavy mono-repo user), but I definitely sometimes want an ephemeral link to a package I'm developing that is in another git repo I have on my disk, and which I must not persist.

I believe the solution to be a "simple" one (well, nothing is simple, but you know what I mean): I would suggest adding a --ephemeral option to npm link create that, instead of recording the information in package-lock.json, records the information about the link in a package-link.json that can be gitignored, and does not depend on the package-lock mechanism.

This way I can create links that persist in package-lock.json or ephemeral links that are relevant only for my computer. The important thing to note is that the ephemeral links use the new model of creating links (npm link add/create), with all its advantages, and yet for all intents and purposes follows the old pre-npm5 model of links (which people are clamoring for in the comments above).

ljharb commented 6 years ago

I'm not sure about the implementation, but I think the use cases are perfectly and succinctly stated - thank you.

iarna commented 6 years ago

Links without locks (one of the issues raised previously in this thread) are no longer trimmed with existing versions of npm.

I've moved the existing RFC (along with some additions to meet our new requirements for RFCs) over to the new RFC repo: https://github.com/npm/rfcs/pull/3

The final version of this RFC will still need to address the ephemeral links use case (and thank you @giltayar for summarizing it so clearly). (I've added that use case as an open issue that will need to be addressed before it's ready.)

Further discussion should happen over there. Thank you all for your participation thus far!

JDvorak commented 6 years ago

This thread has helpfully inspired me to disable package-lock.

JDvorak commented 6 years ago

For those only following this thread, I've written my more thoughtful thoughts here: https://github.com/npm/rfcs/pull/3#issuecomment-387897478