zkat / npx

execute npm package binaries (moved)
https://github.com/npm/npx

skip reinstall for performance reasons #113

Open andreineculau opened 7 years ago

andreineculau commented 7 years ago

Looking through the code, npx doesn't reuse previously installed packages so npx npm@3 --version && npx npm@3 --version is actually going to call npm install twice, in 2 different folders marked by the PID, summarized by: install/decompress, run, remove (via rimraf).

Has there been any thought around how to optimize this process, so that the first call could keep the files in place, and the second call could get a flag to turn on preference for using already deflated files?

zkat commented 7 years ago

Yep. This is a pretty regularly requested thing, and something I want. I just want to make sure that npx can still provide a good user experience with all the corner cases this can involve.

I have some ideas on the constraints here, but I think until those are covered, it's safer to do temporary installs.

Keep in mind: the expected model for npx is that if you want to use something in your app multiple times, you should install it as a devDependency. If it's a global thing you want to use regularly, you should npm install -g it. That's why this is relatively low priority for me.

masaeedu commented 6 years ago

> If it's a global thing you want to use regularly, you should npm install -g it.

@zkat The difference is that npm install -g pollutes the global PATH. If my non-Node executable/library wants to use some CLI tools available on npm (e.g. a stdin/stdout JSON-massaging tool), npx provides a good way to do that without leaving permanent effects on the user's system (besides presumably some cache files in /tmp).

The reason I can't use it right now is that the installation process running on every invocation of the tool makes things prohibitively slow.

zkat commented 6 years ago

You don't have to pollute the global path: you can do a more permanent version of what npx itself is already doing by using: npm i -g --prefix ~/.local/npm-tools and adding ~/.local/npm-tools/bin to your PATH temporarily.
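A minimal sketch of the temporary PATH adjustment described above, assuming a POSIX shell and the Unix-style layout in which the prefix's binaries live under `bin`. The helper name `with_npm_tools` and the `NPM_TOOLS_PREFIX` variable are made up for illustration; the install command is the one quoted in the comment and is left as a comment so the block has no side effects.

```shell
# Hypothetical location of the private prefix (same default as in the comment).
NPM_TOOLS_PREFIX="${NPM_TOOLS_PREFIX:-$HOME/.local/npm-tools}"

# with_npm_tools <command> [args...]
# Runs one command with the prefix's bin directory first on PATH.
# The subshell keeps the PATH change from leaking into the caller.
with_npm_tools() {
  (
    PATH="$NPM_TOOLS_PREFIX/bin:$PATH"
    export PATH
    "$@"
  )
}

# One-time setup (not executed here):
#   npm i -g --prefix "$NPM_TOOLS_PREFIX" <your-tool>
# Afterwards:
#   with_npm_tools <your-bin> --version
```

Because the adjustment lives only inside the subshell, nothing installed under the prefix becomes visible in the user's normal shell session.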

I still do not think "just" having npx keep package installations around will provide the target user experience I'm looking for when it comes to temporary installs. Does this alternative work for you, as a balance between having more permanent binaries to run, and not polluting your global install?

Note that npx still keeps a copy of the cached packages it downloads in ~/.npm (by default), which is the global npm cache, just by virtue of using npm under the hood. Having npx keep things around more permanently would obviously necessitate that you keep those packages installed more permanently so it's not gonna help you if you're trying to avoid general pollution.

masaeedu commented 6 years ago

@zkat This is similar to a workaround I was trying to set up, which was to create a folder in /tmp/<app-specific-prefix> and before every invocation of an npm-provided tool, run npm install in there, then pass that folder as --cache.

This would work, but it's a little awkward. The tool I have right now is stateless; one invocation is no different from the next. If npx could hide the state management of idempotently downloading the requisite packages before executing them, that would simplify my life a lot, since I don't need to distinguish between when packages are downloaded and when they are not. Right now the only difference is in performance, but the difference is big enough that I can't neglect it.

I would imagine similar problems would arise with the npm i -g --prefix approach you're suggesting: I either need to have some special setup step for my tool where I seed this cache, or call it on every invocation with the expectation that redundant invocations have no cost.

Regarding the cached packages getting downloaded to ~/.npm, the pollution I'm concerned with is mostly user-visible pollution (name conflicts, etc). I'd actually prefer if npx worked in /tmp so nothing is pinned and everything gets blown away, but I guess that's a discussion for a separate issue.

masaeedu commented 6 years ago

I'm not sure I entirely understand you though: if you are preserving downloaded packages in ~/.npm after I do npx -p typescript tsc, what work are the progress bars representing the next time I run it?

zkat commented 6 years ago

@masaeedu the work you're seeing is npm installing those packages. The thing that gets cached is the tarballs.

masaeedu commented 6 years ago

I'm not familiar with the terminology, but I guess "installing" comes down to extracting the tarball, recursively downloading + "install"ing dependencies, and running post-install scripts. Is there perhaps a simple way to cache this work as well? E.g. could this maybe be done in a /tmp/<npm prefix>/<hash of tarball> folder, which npx then tries to consult before doing the work?
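The idea in the comment above could be sketched roughly as follows. This is only an illustration of a cache directory keyed by the tarball's contents, not how npx or npm work. The names `INSTALL_CACHE_ROOT`, `cache_dir_for` and `cached_install` are hypothetical, the installer is supplied by the caller, and `cksum` stands in for a proper cryptographic hash. It does nothing about checking for newer versions or about post-install scripts.

```shell
# Hypothetical cache root; the comment suggests somewhere under /tmp.
INSTALL_CACHE_ROOT="${INSTALL_CACHE_ROOT:-${TMPDIR:-/tmp}/install-cache}"

# cache_dir_for <tarball-path>
# Prints the cache directory for a tarball, keyed by its checksum and size.
cache_dir_for() {
  [ -f "$1" ] || return 1
  # cksum on stdin prints "<crc> <byte-count>"; join the two with a dash.
  key="$(cksum < "$1" | tr ' ' '-')"
  printf '%s/%s\n' "$INSTALL_CACHE_ROOT" "$key"
}

# cached_install <tarball-path> <installer> [installer-args...]
# Calls: <installer> [installer-args...] <tarball-path> <target-dir>
# only when no cache directory exists yet for this tarball.
cached_install() {
  tarball="$1"
  shift
  dir="$(cache_dir_for "$tarball")" || return 1
  if [ -d "$dir" ]; then
    return 0   # cache hit: skip the install work
  fi
  mkdir -p "$dir.partial" || return 1
  if "$@" "$tarball" "$dir.partial"; then
    # Publish the directory only after the installer succeeded.
    mv "$dir.partial" "$dir"
  else
    rm -rf "$dir.partial"
    return 1
  fi
}
```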

zkat commented 6 years ago

@masaeedu caching post-install artifacts safely in a way that doesn't add an enormous amount of debugging complication for the user is non-trivial. I'm avoiding that, because I want npx to be more of a reliable, easy-to-grok tool.

The sort of caching you're talking about is the style that I'm considering, but it needs to be paired with a bunch of other checks to make sure that's actually the latest version of all the packages you would've installed (otherwise, it would violate what I consider to be an important invariant of npx, where doing a tmp install always gets you "latest and greatest"). It helps debugging.

I don't think I understand what your tool is actually doing if the --prefix method is not enough. That method only requires that you invoke npm yourself (which seems to be the case with a wrapper tool), and for your tool to make a small, temporary adjustment to PATH while it runs. The files installed are invisible to the user that way, as you wanted.

If you want to avoid duplicate calls, all you have to do is PATH="$PATH:$HOME/.local/npm-tools/bin" which <your-bin>, and you only do npm i --prefix ~/.local/npm-tools -g <your-tool> if that which call fails or gives you the wrong path. You don't need to run npm unnecessarily, and this is a very similar check to what npx itself would be doing.
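The check-then-install flow described above might look something like this in a POSIX shell. It is a sketch under a few assumptions: `ensure_tool` and `NPM_TOOLS_PREFIX` are hypothetical names, the prefix uses the Unix-style `bin` layout, and the portable builtin `command -v` is used in place of `which`. The prefix is put first on PATH for the lookup, so a "wrong path" here means the binary did not resolve inside the prefix.

```shell
# Hypothetical location of the private prefix.
NPM_TOOLS_PREFIX="${NPM_TOOLS_PREFIX:-$HOME/.local/npm-tools}"

# ensure_tool <bin-name> <package-spec>
# Installs <package-spec> into the prefix only if <bin-name> does not
# already resolve to a file inside the prefix's bin directory.
ensure_tool() {
  bin_name="$1"
  package_spec="$2"
  # The command substitution is a subshell, so this PATH change is temporary.
  resolved="$(PATH="$NPM_TOOLS_PREFIX/bin:$PATH"; command -v "$bin_name" 2>/dev/null)" || resolved=""
  if [ "$resolved" = "$NPM_TOOLS_PREFIX/bin/$bin_name" ]; then
    return 0   # already installed in the prefix: skip npm entirely
  fi
  npm i -g --prefix "$NPM_TOOLS_PREFIX" "$package_spec"
}

# Example (not executed here):
#   ensure_tool <your-bin> <your-tool> && PATH="$NPM_TOOLS_PREFIX/bin:$PATH" <your-bin>
```

Repeated calls cost one PATH lookup each, so a wrapper tool can call this before every invocation without paying for an install every time.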

masaeedu commented 6 years ago

> and this is a very similar check to what npx itself would be doing.

Yes, and this is the only reason I'm asking for it to be rolled into npx itself. The prefix approach you suggested would work, it's just a question of implementation complexity and where it gets pushed. I hadn't considered that I could just use which to avoid unnecessary npm installs. I'll test this approach out, thanks.

The tool I'm writing is basically make in YAML: a way for users to write tasks like { exe: "bash", script: "<multiline bash script>" }, { exe: "python" ... }, and pass around data using env vars and fifos. The user is free to invoke arbitrary things in the script, including (conveniently) npx-based tools, which obviates the need for me to provide any functionality for massaging JSON or running webpack. The problem is that this gets slow, because every invocation of npx json starts doing the install work you mentioned. I need to either:

  1. recommend that users add this npm i --prefix business as a task in all their task files, then depend on it in every task, or
  2. recommend that they do the which checking stuff in every task before they try to call an npx script

It'll work, it's just less fun than a magic all-powerful npx that behaves as if I had npm i -g'd all of npm already.

kamranayub commented 6 years ago

👋 First-time npx user here and I just want to say I was very confused by this. I expected npx to just use whatever version I had installed, if not from my first npx invocation, then at least throughout my terminal session. I totally understand the issues, but I did just want you to know it bothered me so much I thought it was a bug and searched here to find this 👍

ghost commented 5 years ago

I also came here out of confusion about the intent of npx - don't get me wrong though, it's great - just learning.

I'm looking for this logic: recognizedCommand('my-module') ? my-module arg1 : npx my-module arg1;

My use case is just me having something like jest in my npm scripts where I don't want to have it installed globally. I thought that npx was sort of a combo of "install if it doesn't exist, and just run the command if it does", where it's actually more like "try to install each time and then run".

I can just access my node_modules/.bin/my-module of course, but this doesn't always seem to work - I was able to do it with one module, but this other module I'm using doesn't appear to want to be executed from the .bin folder :/
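The `recognizedCommand('my-module') ? my-module arg1 : npx my-module arg1` logic asked for above could be written as a small shell function. This is only a sketch: `run_or_npx` is a made-up name, and "recognized" is taken to mean "resolvable by the current shell via `command -v`", which is an assumption about what the commenter intended.

```shell
# run_or_npx <command> [args...]
# Runs <command> directly when the shell can already resolve it,
# and falls back to npx otherwise.
run_or_npx() {
  cmd="$1"
  shift
  if command -v "$cmd" >/dev/null 2>&1; then
    "$cmd" "$@"
  else
    npx "$cmd" "$@"
  fi
}

# Example (not executed here):
#   run_or_npx my-module arg1
```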