yarnpkg / yarn

The 1.x line is frozen - features and bugfixes now happen on https://github.com/yarnpkg/berry
https://classic.yarnpkg.com
Other
41.42k stars 2.72k forks source link

yarn probably shouldn't cache packages resolved with a file path #2165

Closed donaldpipowitch closed 7 years ago

donaldpipowitch commented 7 years ago

Do you want to request a feature or report a bug?

I guess a bug.

What is the current behavior? If the current behavior is a bug, please provide the steps to reproduce.

Say you have the following project structure:

component-foo/
└── package.json
└── index.js

yarn-test/
└── package.json

With the following files:

component-foo/package.json:

{
  "name": "component-foo",
  "version": "1.0.0",
  "private": true,
  "main": "index.js"
}

component-foo/index.js:

console.log('foo');

yarn-test/package.json:

{
  "name": "yarn-test",
  "version": "1.0.0",
  "private": true,
  "dependencies": {
    "component-foo": "file:../component-foo"
  }
}

Now you run $ yarn install inside yarn-test/ and yarn-test/node_modules/component-foo/index.js is:

console.log('foo');

Now remove yarn-test/node_modules/ and yarn-test/yarn.lock and change component-foo/index.js to this:

console.log('bar');

Now you run $ yarn install inside yarn-test/ again and yarn-test/node_modules/component-foo/index.js will be:

console.log('foo');

It uses the cached version of component-foo, but component-foo/index.js has changed.

What is the expected behavior?

I'd expect that yarn-test/node_modules/component-foo/index.js should be this at the end:

console.log('bar');

Maybe packages installed with local paths like file:../ shouldn't be cached at all, if we can't figure out, if they have changed.

(FYI: It looks like npm doesn't cache them.)

Please mention your node.js, yarn and operating system version.

$ node -v
v6.9.1

$ yarn -V
0.18.0

macOS 10.12.1
hantuzun commented 7 years ago

This fooled me as well. There must be a way to get around this without clearing all the cache.

I think there could the three ways to make yarn usable for this case:

  1. @donaldpipowitch 's suggestion of ignoring all local dependencies.
  2. Reinstalling local dependencies if a file in there is modified later than when it's cached. This would require us to keep this metadata. For local dependencies we may insert lines like this to "Resolved" column: file://<path>@<cache_timestamp>
  3. Ignoring packages by package names with commands such as yarn cache rm <package> and yarn cache add <package>. For all dependencies.

I'd like to see the second suggestion to be implemented. Unless, the third option could be useful for other cases as well. For instance, yarn cache add <package> can be used to refresh the cache for already cached dependencies in case something could go wrong while downloading a dependency.

satya164 commented 7 years ago

@hantuzun why cache local dependencies at all? they are local anyways, so it will be fast regardless cached or not.

hantuzun commented 7 years ago

@satya164 you're right. Although, I'm more inclined to the third approach would help if a dependency from network is modified intentionally.

satya164 commented 7 years ago

Something like yarn cache ignore <package> will be useful. But I think they don't have to be mutually exclusive. Ignoring a package is useful, but it requires manual work. File dependencies could be made to work without any additional effort if they were ignored by default.

donaldpipowitch commented 7 years ago

Can someone explain me the internal logic?

My understanding: When a dependency with file: is encountered the file-resolver.js kicks in. It says the dependency should be copied and is not hashed. Doesn't no hash mean it should not be cached already...? But the copy-fetcher.js seems to set the hash to an empty string instead of keeping null... Is this problem?

donaldpipowitch commented 7 years ago

@bestander or @kittens Maybe you could explain this a little bit more...? Would love to get a little help to create a PR ♥

bestander commented 7 years ago

Hash means the md5 hash that is used for tarball-fetcher, for example. This hash is then added to yarn.lock file for future verifications and also appended to folder name when saving the unzipped folder in cache. You are digging in the right direction, thanks for looking into this, a PR is very much welcome. You can probably start with a PR that adds a broken unit tests

donaldpipowitch commented 7 years ago

You can probably start with a PR that adds a broken unit tests

Okay, thank you for your response. Should I ping you as a reviewer in the PR or should I ping someone else (or no one)...?

bestander commented 7 years ago

Yeah, ping me

satya164 commented 7 years ago

@bestander probably this issue shouldn't be closed since it's not fixed yet?

donaldpipowitch commented 7 years ago

Yes, it should be re-opened. It was closed because I wrote " fixes #2165" in the title of my PR. In the beginning I thought it would be an ongoing PR, but to fix this bug we want two PRs: the first one added a unit test (the assertion which would fail is not really active, so CI doesn't blow up) and the second one will actually fix it.

bestander commented 7 years ago

Sorry, github closes issues when PR is merged

moimikey commented 7 years ago

so obviously this is still a problem? it's quite annoying to develop with, to be honest. it's causing a bit of a disruption for me on a personal level at work, where i'm making use of file: to create a modular development environment. the sucky part is that every local package that I edit (using the file: path in package.json), requires doing this, in order to pull down the refreshed contents:

editing the contents of my eslint-config-base-eslint package

yarn cache clean && rm -rf node_modules/eslint-config-base-eslint && yarn install --force && yarn lint
bestander commented 7 years ago

Everyone is welcome to contribute to the project. It could be anything - submitting a broken integration test for this case, a fix or convincing someone to work on the fix. If you want to chat about the best way to approach that, you can find help on the discord channel.

bestander commented 7 years ago

I actually think that the fix should be 10-15 lines of code, it will probably save a lot of time fixing it sooner

donaldpipowitch commented 7 years ago

FYI: It could be that this issue is also relevant for slow linking dependencies steps. See my comment here: https://github.com/yarnpkg/yarn/issues/1496#issuecomment-282688818.

Sorry. I couldn't found the time to create another PR for this issue :( I was (and still am) very busy.

mtraynham commented 7 years ago

@bestander This is a pretty big blocker for me, working on a multi-module project. If I'm reading @donaldpipowitch 's code links and comments correctly, could we have the file-resolver create a new hash (instead of null) every time it tries to resolve so it forces a reinstall? Say a UUID or current timestamp? Forgive me if I'm missing something, I'm not familiar with the way the code works.

bestander commented 7 years ago

A new cache with a time stamp and uuid sounds like a reasonable hack. Ideally we should copy the files directly without the cache but it may be a more complex change. Go ahead send a PR

On Tue, 7 Mar 2017 at 03:38, Matt Traynham notifications@github.com wrote:

@bestander https://github.com/bestander This is a pretty big blocker for me, working on a multi-module project. If I'm reading @donaldpipowitch https://github.com/donaldpipowitch 's code links and comments correctly, could we have the file-resolver create a new hash (instead of null) every time it tries to resolve so it forces a reinstall? Say a UUID or current timestamp? Forgive me if I'm missing something, I'm not familiar with the way the code works.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/yarnpkg/yarn/issues/2165#issuecomment-284612526, or mute the thread https://github.com/notifications/unsubscribe-auth/ACBdWDS3xSr8KNu1o9Zn8sA9xdO8pyHOks5rjNEmgaJpZM4LFbmf .

danrot commented 7 years ago

@bestander Regarding your last comment: Wouldn't it make even more sense to symlink it, instead of copying the files? Or is there any reason not to do that?

aikar commented 7 years ago

@danrot windows seems to require admin to symlink. also it messes up recursing down to find node modules

donaldpipowitch commented 7 years ago

Symlink also ignores .npmignore and stuff like that. (Which currently seems to be ignored, too. Which could be problematic: https://github.com/yarnpkg/yarn/issues/1496#issuecomment-282688818)

rhys-e commented 7 years ago

Is there any workaround for this currently barring clearing the entire cache? Unfortunately there seems to be no yarn cache rm package kind-of command.

chetstone commented 7 years ago

@rhys-e I use this small script:

#!/bin/sh
if [ $# != 1 ]
then
   echo Usage $0 package-name
   exit 1
else
    echo Reinstalling $1
fi

dir="node_modules/$1"

if [ -d $dir ]
then
    rm -fr $dir
fi

cache=`yarn cache dir`/npm-${1}*
#        echo $cache
rm -fr $cache && yarn install --force
robhrt7 commented 7 years ago

Have anybody tried just to yarn link all the local dependencies on postinstall? Looks like appropriate solution until proper fix lands.

lucdew commented 7 years ago

I guess the idea is to not having to update the package's version number on every change for the local dependencies. A yarn link would force you to do that, right ? (did not try)

On my side what I finally did is to invoke a script on preInstall phase that compares what is in the source folder and node_modules folder for local dependencies. If it detects any difference, it just removes the cached dependency and removes the integrity file (to force a reinstall). Thus when there's no change the yarn install is pretty fast (there aren't many local dependencies) and if there is the outdated cached version is not used.

robhrt7 commented 7 years ago

@lucdew link will basically create a symlink under the hood, meaning that latest local version will be used. But I think having a script, that before installation cleans the cache of specific deps is the right solution right now.

fungilation commented 7 years ago

I've tested different combinations in working around changes in the local package not being updated in a parent project's installed version under ./node_modules/. I find that just this would do, without need for deleting the ./node_modules/ first:

yarn cache clean; yarn upgrade file:../<package>

Needless to say, manually forcing global cache clean should not be necessary in order to update/upgrade local packages.

charpeni commented 7 years ago

@fungilation As a workaround, you can also use npm to install local dependencies and thus avoid losing the entire cache every time you want an update.

mtraynham commented 7 years ago

Per #2860 and the subsequent merge commit https://github.com/yarnpkg/yarn/commit/7241de13bb236526fa439a2528fbed319f60ef24, you can now "refresh" file: protocol dependencies with

yarn install --force

Edit Or upgrade the particular package (didn't realize this worked as well). If the dependencies version didn't change, this doesn't modify the lockfile, but it will still copy the newer version down.

yarn upgrade [file protocol package name]

The PR change will invalidate the dependency in the cache & reinstall it locally. yarn install will also work if the dependencies version changed, which invalidates the yarn.lock file. You no longer need to clear the cache, nor delete the module from the local install.

It was also made apparent to me that there is an active RFC for linking dependencies, possibly with a link: protocol. That can be followed here, https://github.com/yarnpkg/rfcs/pull/34.

donaldpipowitch commented 7 years ago

@mtraynham Thank you for your pull request! 👌 This is awesome. Any reasons why --force is needed? I currently don't even know what it does exactly right now without looking it up :) Just asking, because npm doesn't need a --force flag and it would have been nice to behave like npm.

mtraynham commented 7 years ago

Edit Looks like yarn upgrade [dependency] works as well. I want to point out, this does not always change the lockfile, which should only reflect package.json version changes. I've updated my original post, as an upgrade is probably more appropriate.


The short version is, Yarn won't do anything with cache-resolvers if the lockfile didn't change, so we need to skip the lockfile check and ask the cache if there's a newer version. That can be done using upgrade or install --force

Per yarn install --force documentation

"it refetches all packages, even ones that were previously installed".

This really means it will skip the lockfile integrity check. The lockfile integrity check will usually pass if you don't change the package.json file and bail out gracefully. Skipping it asks the cache to check for missing/mismatched dependencies against the lockfile, download them if missing, and copy back any newer/missing dependencies locally. Then it runs npm install/postInstall scripts.

The PR change will now invalidate the file: dependency in the cache and copy the new version down locally. Prior to this, it would never invalidate the file: dependency. For other protocols, if you didn't change your package.json file, they will go unchanged (in the cache and locally).

So what, does this mean for us? I have about 60 dependencies on a project (ranging from Angular to Webpack), with one file: dependency. On a second install --force, where I only want to refresh the local dependency, it takes about ~5 seconds (up from ~1.5 seconds for yarn install). To me this is pretty negligible, and I'm actually very impressed by how little work yarn tries to perform throughout the whole process.

If there is another CLI command to skip the lockfile check and check the cache for only the particular file dependency, that would probably be even faster, but I haven't found one. Actually yarn upgrade might work, I just haven't tried it.


All that being said, I'd still call this a band-aid. This could be replaced by a better solution such as link:; I don't think anyone really cares to cache local dependencies. At the very least, the added overhead of using install --force or upgrade is mostly negligent and you no longer have to yarn cache clean; mv node_modules /tmp/.

donaldpipowitch commented 7 years ago

Great answer. 👏 Thank you for writing this down.

jonschlinkert commented 7 years ago

Does yarn overwrite local symlinked project files with files from yarn's cache? (because it looks like that's what's happening)

donaldpipowitch commented 7 years ago

The PR change will now invalidate the file: dependency in the cache and copy the new version down locally.

Does this means that whenever I call $ yarn inside a package which has "foo": "file:../" as a dependency a new copy of "file:../" will be created?

E.g. I have a package with multiple examples which are also packages:

foo/
foo/examples/
foo/examples/example-1/
foo/examples/example-2/
foo/examples/example-3/
...
foo/examples/example-10/

And it looks I have foo 10 times now in the yarn cache. I also test my examples on every version change of foo (and I have multiple modules configured in that way, not just foo) so it looks like my yarn cache grows really quick currently.

Is this the correct behaviour?

bestander commented 7 years ago

I think it is a better alternative than having a stale version in your cache. With yarn 0.26 you can use link: to create symlinks instead of copying files. Also we are working on workspaces that will automate the symlink creation, https://github.com/yarnpkg/yarn/issues/3294

donaldpipowitch commented 7 years ago

Yeah, looking forward to workpaces 👍

link: does not yet work with npm, right? (Because https://github.com/npm/npm/pull/15900 is still open.)

eTorAken commented 7 years ago

From npm 5 patchnote, files are now automatically symlinked with the file: syntax:

npm install ./packages/subdir will now create a symlink instead of a regular installation. file://path/to/tarball.tgz will not change – only directories are symlinked. (#15900)

So yeah, there is no link with npm.

satya164 commented 7 years ago

npm install ./packages/subdir will now create a symlink instead of a regular installation

Kinda unfortunate. File deps never behaved the same due to copying everything (including node_modules) and not respecting .npmignore or files field. Now it's worse with symlink.

bestander commented 7 years ago

I think file: and link: could be improved further, there are different strategies that have their own PROs and CONs and Yarn should allow people to choose one. For example knit RFC could be implemented as one of the strategies https://github.com/yarnpkg/rfcs/blob/master/accepted/0000-yarn-knit.md

line0 commented 7 years ago

@bestander

I think it is a better alternative than having a stale version in your cache.

Or so you'd believe until you run out of disk space and find that your Yarn cache is using tens of gigabytes for millions of utterly useless files. IMO that makes the whole thing more seriously broken than it was before, because you find out only once you've temporarily broken your development system.

nikdojo commented 6 years ago

Hi guys, any updates on the issue? It's really a huge problem for us, especially when we're working on multiple products at once that depend on same linked multi-dependencies. Gigabytes of cache in a day of development, etc. Could you maybe at least make it optional and disable caching for such packages?

mtraynham commented 6 years ago

@nikdojo Use the yarn link: protocol for dependencies instead of file:, it uses symlinks. It was introduced in 0.26.

Alternatively, start using workspaces if you have many inter-project dependencies.

nikdojo commented 6 years ago

@mtraynham Thx for the hint, I tried to find any info abt link: protocol in official docs, but seems it's not there. Experimenting with workspaces now.

satya164 commented 6 years ago

@bestander btw as you know, react-native doesn't support symlinks, so if we work with rn libs, it's still a big issue.

hoegge commented 6 years ago

This was never solved was it? You have to use linklocal (NPM packages) to link local packages (which should be the standard way yarn operates when consuming packages from local filesystem - using junctions or symlinks on Windows instead of caching). A new yarn install then overwrites everything with old stuff from the cache and you have to start over linking again.

fungilation commented 6 years ago

Can we be less architecture astronaut and simply don't cache local packages? This issue is now 1.5 years old, and I'm tired of running yarn add ../<another-local-package> every time I change something in another-local-package.

hoegge commented 6 years ago

Hi @fungilation I've opened another related issue: #6037

kselax commented 6 years ago

doesn't work I've put in file App.js console.log('here we are'), and it didn't output. then remove all files and it still makes the output from the cache. How to avoid this?

gregjacobs commented 5 years ago

Yarn has really failed me on this issue. It's been great for everything else, except for this.

I've been trying to install a new version of a local package for testing purposes, and I keep ending up with the old package no matter what I do.

I've tried:

  1. Cleaning the yarn cache (yarn cache clean package-name)
  2. yarn add'ing with --force
  3. Removing node_modules/package-name and yarn adding again
  4. Updating the version number and filename of the local package (yarn still installs the contents of the old version, even though it reports installing the newer one)
  5. Combinations of all of the above

This is ridiculous, and I've spent almost 4 hours of my day on this.

We need the ability to develop and reinstall local packages. I am relying on yarn to install a file in the .bin folder, so yarn link is out of the question.

@ Yarn Developers: Are you able to install a local package, make changes to that local package, and then get the changes to be reflected by installing again?

jonathantorley commented 5 years ago

@gregjacobs I had success with yarn install --force