dominictarr / npmd

npmd@2 #89

Open dominictarr opened 8 years ago

dominictarr commented 8 years ago

I have been thinking about a new design, which would make significant improvements. npmd uses a resolution tree that is based on npm's shrinkwrap. This was pretty good and enabled resolution and install to be decoupled.

But it ran into problems. Especially with --greedy mode (which is now the default install style in npm), you have to read the package.json of every already-installed module as part of the resolve process, which is slow.

Instead, it would be better to keep a file that represents the tree: .resolve.json or something. You could even have that file contain just a hash, keep all the trees in the content store, and check in that hash. Then you could always go back to exactly the code you were using when you were working on that commit!
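A minimal sketch of the "check in only a hash" idea, assuming a content store that maps a hash to a serialized tree. The in-memory `store` and the names `putTree` and `getTree` are hypothetical stand-ins, not npmd's actual API:

```javascript
const crypto = require('crypto');

// Hypothetical in-memory stand-in for the content store.
const store = new Map();

// Store a resolution tree under the hash of its serialized form.
function putTree(tree) {
  const body = JSON.stringify(tree);
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  store.set(hash, body);
  return hash; // this string is all that would need to be checked in
}

// Recover the exact tree that a given commit pointed at.
function getTree(hash) {
  const body = store.get(hash);
  if (body === undefined) throw new Error('unknown tree: ' + hash);
  return JSON.parse(body);
}
```

Checking out an old commit would then mean reading the hash from the file and calling something like `getTree` to get back the tree that was in use at the time.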

Also, when you wanted to install something, npmd would only open one file, and could instantly see which modules you already have, which could be upgraded, etc. npmd would create a patch on that tree, and then npmd install would apply that patch, making for minimal disk access.

The file should contain the shasums of the modules that have been used, which resolution tree we are currently working with, and which modules each node depends on.
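To make the tree-plus-patch idea concrete, here is a rough sketch. The file shape (one entry per install path, with `version`, `shasum` and `dependencies`) and the `diffTrees` helper are assumptions for illustration only, and the shasums are placeholders:

```javascript
// Hypothetical contents of .resolve.json: one entry per installed path.
const current = {
  'node_modules/a': { version: '1.0.0', shasum: 'sha-a1', dependencies: ['b'] },
  'node_modules/b': { version: '1.0.0', shasum: 'sha-b1', dependencies: [] },
};

// The tree we want after resolving: a is upgraded, b is dropped, c is new.
const wanted = {
  'node_modules/a': { version: '1.1.0', shasum: 'sha-a2', dependencies: ['c'] },
  'node_modules/c': { version: '0.2.0', shasum: 'sha-c1', dependencies: [] },
};

// Compute the patch that turns one tree into the other, without
// touching node_modules at all.
function diffTrees(current, wanted) {
  const patch = [];
  for (const path of Object.keys(wanted)) {
    const have = current[path];
    const want = wanted[path];
    if (!have) {
      patch.push({ op: 'add', path, shasum: want.shasum });
    } else if (have.shasum !== want.shasum) {
      patch.push({ op: 'replace', path, shasum: want.shasum });
    }
  }
  for (const path of Object.keys(current)) {
    if (!(path in wanted)) patch.push({ op: 'remove', path });
  }
  return patch;
}
```

The install step would then only have to walk the patch, so unchanged modules cost no disk access.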

Artoria2e5 commented 4 years ago

We are looking at two problems here:

  1. making a lockfile with shasums for stuff like caching (a name@ver should be enough for npmd-cache though, so uh mostly safety)
  2. reading some package info without reading package.json

lockfile

Speaking from my yarn dep experience, the more-or-less modern way of handling a lockfile is basically:

  1. load requested package ranges (already "greedy")
  2. throw the lockfile resolutions at the ranges
  3. keep the stuff that works, toss out the rest
  4. fill up the rest using whatever algorithm
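The four steps above can be sketched roughly as follows. The range check and the fallback resolver are passed in by the caller, so `satisfies` and `resolveFresh` are hypothetical parameters here, not calls into any real library:

```javascript
// requests: { name: range } from package.json
// lock:     { name: version } from the lockfile
// satisfies(version, range) and resolveFresh(name, range) come from the caller.
function resolveWithLock(requests, lock, satisfies, resolveFresh) {
  const resolved = {};
  for (const [name, range] of Object.entries(requests)) {
    const hasLock = Object.prototype.hasOwnProperty.call(lock, name);
    if (hasLock && satisfies(lock[name], range)) {
      resolved[name] = lock[name]; // step 3: keep what still works
    } else {
      resolved[name] = resolveFresh(name, range); // step 4: fill in the rest
    }
  }
  // Lock entries that nothing requests are simply not copied over.
  return resolved;
}
```

In practice `satisfies` would be a real semver range check. Keeping it as a parameter avoids guessing at which implementation npmd would use.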

(Yarn has a more "strict" way of handling lockfiles by associating the lock resolutions to specific requests.)

version reading

A way to skip reading package.json for the version is seen in pnpm: just encode it in the filesystem path. It installs packages to a global readonly (hopefully) store in directories named name@version and creates symlinks in node_modules. Reading a link is going to be cheaper than parsing JSON. Symlinks also make for a quick re-hydration given a good resolution.
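As a rough illustration of reading the version off a symlink instead of parsing JSON, assuming the simplified store layout described here (a directory literally named name@version, which is not necessarily pnpm's real on-disk format). The helper names are made up:

```javascript
const fs = require('fs');
const path = require('path');

// Split a store directory name such as "through@2.3.4" at its final "@".
function parseStoreName(dirName) {
  const at = dirName.lastIndexOf('@');
  if (at <= 0) throw new Error('not a name@version directory: ' + dirName);
  return { name: dirName.slice(0, at), version: dirName.slice(at + 1) };
}

// One readlink call replaces opening and parsing package.json.
function readInstalledVersion(linkPath) {
  const target = fs.readlinkSync(linkPath);
  return parseStoreName(path.basename(target));
}
```

Scoped packages would need the scope recovered from the parent directory, which this sketch does not attempt.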

(The problem with symlinks is that they may imply a shared set of sub-dependencies under a name@ver. This is ultimately where the conflict between an npm/yarn flat tree and pnpm's non-flat tree comes in. If we want both, we would need two global stores. They can still share data by symlinking the individual files and directories. Oops, there's also the complication that flat ones cannot be fully flat when two deps want incompatible versions of the same dep.)

We can additionally introduce the assumption that the version exactly matches the resolution if a package exists under node_modules at all, as long as the resolution is considered valid. There should be a switch to override it though.

putting it together

A good resolution can be (maybe) guaranteed by hashing some of the master package.json, mostly just the dep part fed through a deterministic serialization (like json-stringify-deterministic). If it matches we apply it, otherwise we do the big slow recursive gather deposit thing.
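A sketch of the dependency-hash check, with a small hand-rolled sorted-key serializer standing in for something like json-stringify-deterministic. Which fields count as "the dep part" is an assumption here, and `depsHash` is a made-up name:

```javascript
const crypto = require('crypto');

// Serialize with sorted keys so that key order does not change the hash.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(stableStringify).join(',') + ']';
  }
  if (value && typeof value === 'object') {
    return '{' + Object.keys(value).sort()
      .map((k) => JSON.stringify(k) + ':' + stableStringify(value[k]))
      .join(',') + '}';
  }
  return JSON.stringify(value);
}

// Hash only the dependency fields of the top-level package.json
// (assumed here to be these three).
function depsHash(pkg) {
  const deps = {
    dependencies: pkg.dependencies || {},
    devDependencies: pkg.devDependencies || {},
    optionalDependencies: pkg.optionalDependencies || {},
  };
  return crypto.createHash('sha256').update(stableStringify(deps)).digest('hex');
}
```

If the stored hash equals `depsHash` of the current package.json, the saved resolution can be applied as-is. Any edit to a dependency range changes the hash and triggers the slow path.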

Or we can do it the yarn way: store the query in the lock file, and just check that all master package.json dep queries we need have a record in the lock.
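The yarn-style check could look something like this, assuming lock entries are keyed by "name@range" strings. `lockCovers` is a hypothetical helper:

```javascript
// pkgDeps: { name: range } from the top-level package.json
// lock:    { "name@range": resolvedVersion }
function lockCovers(pkgDeps, lock) {
  return Object.entries(pkgDeps).every(([name, range]) =>
    Object.prototype.hasOwnProperty.call(lock, name + '@' + range));
}
```

A `true` result means every query in package.json already has a recorded answer, so no resolution work is needed.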


P.S. I have a feeling that the subpackages should be merged into this repo using tools like git-subtree. This will make cross-referencing and updating easier for other contributors. The repository field of package.json supports subdirectories now: https://docs.npmjs.com/files/package.json#repository

P.P.S. Maybe an npm with good caching is… just yarn. Maybe with a central symlink thing added it's just pnpm. The only advantage we can have is the ability to be both flat and non-flat, I guess.