make `--pure-lockfile` default for `install`

bestander commented 8 years ago

Do you want to request a feature or report a bug?

feature

What is the current behavior?

Not passing --pure-lockfile for install command confuses me because it modifies the lock file while installing node_modules. We agreed on semantics that add/upgrade/remove are to change dependencies and install is to consistently rebuild node_modules from lockfile.

Consistency gets lost when lockfile is modified depending on environment (version of yarn currently installed).

What is the expected behavior?

Not write yarn.lock or package.json when doing yarn install. To update yarn.lock use yarn upgrade

Please mention your node.js, yarn and operating system version.

yarn 0.14

eliihen commented 8 years ago

I agree. There should be a discussion on why yarn install writes a lockfile by default in the first place, as it seems to be at odds with the entire lockfile concept. Why have a lockfile if it is not locking versions by default?

There is a case for yarn install creating a lockfile if none is present, i.e. when someone is converting a project to use yarn, but the rationale for always writing it is not clear. I agree with @bestander's opinion that only mutating actions should update the lockfile by default, i.e. add/upgrade/remove.

ide commented 8 years ago

Should there be a way to modify the lockfile without add/remove/upgrade, e.g. in the scenario when you upgrade Yarn and it uses a new lockfile version?

bestander commented 8 years ago

I suppose the option could be inverted

yarn install --save-lockfile

fingermark commented 8 years ago

I am also confused by this. What is the reasoning for the current default behavior?

bestander commented 8 years ago

Afaik, there were no strong reasons for the default behavior. The idea, I suppose, is to keep people's lockfiles "evergreen".

bestander commented 8 years ago

BTW, a PR is welcome

jamiebuilds commented 8 years ago

I think the reason for it was that yarn was originally designed with a single install command which was split into install/add/upgrade

Guuz commented 8 years ago

So, just to check that I understand this correctly:

yarn installs all dependencies and also modifies the lockfile. On a CI server you should use yarn install --pure-lockfile? Why does the lockfile get modified during an install? Since you are not upgrading anything... Yarn should just install the packages as described in the lockfile, right?

Thanks for the explanation!

sebmck commented 8 years ago

The problem is that if the lockfile is pure by default then people are going to forget to update it since it'd be a separate command.

idris commented 8 years ago

@kittens Shouldn't the lock file only be updated upon add/remove/upgrade of any packages? Those should always upgrade the lock file, as well as an initial install.

eliihen commented 8 years ago

The problem is that if the lockfile is pure by default then people are going to forget to update it since it'd be a separate command

That being a problem depends on what you consider to be the main objective of a package manager.

In my opinion, one of the roles a package manager fills is make it as easy as possible to get started with developing on a project. A simple yarn install should get all the packages you need to start developing, without any confusion involved.

With npm I've had many instances of developers joining a project, only to find that the project does not work on their machine. These instances have occurred due to transitive dependencies bumping versions to versions with breaking changes or simply not following semver. I had hoped yarn would solve these issues, but if the takeaway is that all developers on a project should run yarn install --pure-lockfile to be 100% sure that the project is going to build, then that is not the case.

Another role of a package manager is giving projects control of their dependencies. If it is made pure by default, developers are able to have a look at yarn outdated to see the outdated versions and then review the change notes, avoiding any breaking changes. This would give developers full control to only bump versions in a given release timeframe instead of banning developers from doing git commit -a to avoid accidental lockfile commits.

fredva commented 8 years ago

I agree with everything @esphen says. I am surprised the pure behavior is not the default in Yarn – I thought this kind of consistency was the major benefit Yarn had over NPM. This should be the most compelling reason for switching from NPM the way I see it.

adamchainz commented 8 years ago

This totally surprised us by breaking our build a few days after we started using yarn. I honestly thought --pure-lockfile was the default behaviour after reading much of the documentation and about how it's better than npm with shrinkwrap. Please make it the default :)

sebmck commented 8 years ago

@ide Imagine a scenario where someone is just using npm and updates package.json, how is yarn.lock going to be updated?

sebmck commented 8 years ago

Can someone please write up the scenarios that lead to the lockfile being modified unexpectedly? This change is a serious one and does make the lockfile a second class citizen by requiring updates to it to be explicit which means a lot of overhead in remembering what operations result in it being updated etc.

adamchainz commented 8 years ago

More info on the above: our build has coffeescript from Github as a subdependency. coffeescript pushed some commits and we got a modified yarn.lock in our build process from running just yarn install:

diff --git a/foo/yarn.lock b/foo/yarn.lock
index ec667fa..bb1f6ae 100644
--- a/foo/yarn.lock
+++ b/foo/yarn.lock
@@ -930,9 +930,9 @@ code-point-at@^1.0.0:
   version "1.6.3"
   resolved "https://registry.yarnpkg.com/coffee-script/-/coffee-script-1.6.3.tgz#6355d32cf1b04cdff6b484e5e711782b2f0c39be"

-"coffee-script@github:jashkenas/coffeescript":
+coffee-script@jashkenas/coffeescript:
   version "1.11.1"
-  resolved "https://codeload.github.com/jashkenas/coffeescript/tar.gz/887052de079b2f0af9f080031a00bb7544eaca08"
+  resolved "https://codeload.github.com/jashkenas/coffeescript/tar.gz/0d132318ce8f7116a436d97db1f2a5c8b1dedf28"

 colors@0.3.0:
   version "0.3.0"

bestander commented 8 years ago

Can someone please write up the scenarios that lead to the lockfile being modified unexpectedly? This change is a serious one and does make the lockfile a second class citizen by requiring updates to it to be explicit which means a lot of overhead in remembering what operations result in it being updated etc.

I perceive yarn install as a command that builds node_modules for me. It is the opposite of yarn add and yarn remove, which modify package.json and yarn.lock and clean up node_modules. Compared to add and remove, I run install 100 times more often, especially in CI, where I never expect it to have side effects.

Examples when I don't expect things changing:

I am on Yarn 0.15.0, my teammates are on Yarn 0.16.0. Because 0.16.0 added spaces between entries in yarn.lock, every time I run yarn install against a yarn.lock that was generated by my teammates I get a modified yarn.lock file that I need to remember not to commit. And vice versa.
My other build tools depend on yarn.lock as "the source of truth" for the state of node_modules. If it changes unexpectedly I will get non-determinism in my builds.

101100 commented 8 years ago

@kittens

The problem is that if the lockfile is pure by default then people are going to forget to update it since it'd be a separate command.

Imagine a scenario where someone is just using npm and updates package.json, how is yarn.lock going to be updated?

If we assume that yarn install should not update yarn.lock, then it should also fail if yarn.lock is out of sync with package.json to highlight the fact that a yarn install --save-lockfile is needed to bring everything back in sync.

jasonLaster commented 8 years ago

+1 yarn install should not mutate the yarn.lock

The debugger is an OSS app. We want contributors to be able to yarn install and get the good versions. We've had people npm install and say it's breaking because of transitive properties. With yarn install, contributors run yarn install and don't know what to do with the yarn.lock changes.

I'm not worried about updating the lock file. Ideally greenkeeper would do this when deps change and we could merge the lock file change then.

jamiebuilds commented 8 years ago

I want to update this issue with the current thoughts about it. @kittens and I both think that --pure-lockfile should not be the default for a couple of reasons.

It starts with how people add/remove/update dependencies. While there are commands for it, it is common practice to manually update the package.json either by hand or by another tool like Lerna.

Once you have manually modified the package.json the expectation in both Yarn and npm is that when you run another install it syncs with the package.json. In that sense yarn install could almost be renamed yarn sync.

On the topic of syncing, when you run an install with new dependencies you expect the node_modules directory to reflect those changes. Since yarn.lock acts as an assistant to node_modules you should expect it to stay in sync the same way.

Your package.json is the ultimate source of truth, that is your interface to yarn, it's your configuration and it's the only thing you should ever be concerned with. In an ideal world you simply commit your yarn.lock and then never have to think about it again.

On a side note I believe many people who are voicing support for this issue are confused about what's actually being discussed here.

Not using --pure-lockfile by default does not mean that Yarn fails to produce consistent and reliable results. The same package.json will result in the same yarn.lock which will result in the same node_modules 100% of the time.

When you update your package.json your yarn.lock file updates and then your node_modules updates. That is a very natural order to things and we should keep it that way

Regarding CI getting different dependencies when you have updated your package.json but have not run yarn install to sync everything (which I'm sure someone will bring up, although I do not see it as an issue): others and I have been speaking to various CI tools about integrating Yarn, and we can easily push for them to use --pure-lockfile by default if people see it as a big issue.

If we were to make this change it would have a negative impact far more often when changing dependencies. For the reasons I've listed I say we should close this issue.

Pauan commented 8 years ago

@thejameskyle I would appreciate it if you could clarify something:

A developer has a package.json which contains a dependency "foo": "^1.0.0"
The developer runs yarn install. The foo package is currently version 1.0.0, so it creates a yarn.lock file which locks in foo@1.0.0
The developer adds yarn.lock to Git.
The developer runs unit tests on their local copy of the repo, everything works fine.
The developer pushes their repo to CI (e.g. Travis).
CI runs yarn install, but foo has now updated to version 1.1.0, so Yarn installs foo@1.1.0 and overwrites yarn.lock with the new version of foo
The CI breaks because foo had a breaking change in version 1.1.0

Here's a similar situation:

A developer has a package.json which contains a dependency "foo": "^1.0.0", which is locked in as foo@1.0.0, and yarn.lock is saved in Git.
Unit tests work fine on the developer's local copy of the repo.
A contributor clones the repo with the intent of making a modification + pull request.
When the contributor runs yarn install they get version foo@1.1.0 which causes yarn.lock to be updated.
Now the contributor's build is broken because foo had a breaking change in version 1.1.0

I think those are the kind of situations that most people are worried about.

So if you could clarify that the current behavior of yarn install does not have the above problems, I think that would remove most of our fears. :+1:

jamiebuilds commented 8 years ago

Neither of those situations apply. Just because a dependency has updated doesn't mean you'll get it, only if you've made changes to your package.json.

jamiebuilds commented 8 years ago

I'm just going to close this issue because it really seems like this is the only concern people have, which as I said is not a real scenario. This issue is likely causing more confusion.

adamchainz commented 8 years ago

But it does have the bad behaviour if a dependency is being installed from github, as I reported above

jamiebuilds commented 8 years ago

@adamchainz That should be fixed separately, we can easily lock it to the commit

eliihen commented 8 years ago

Neither of those situations apply. Just because a dependency has updated doesn't mean you'll get it, only if you've made changes to your package.json.

@thejameskyle: I'm not sure I understand why this is not a real scenario. Could you please elaborate?

jamiebuilds commented 8 years ago

Imagine a memoize function where the input is a package.json and the output is the yarn.lock.

1. The first time you pass a package.json it creates a yarn.lock and caches the result.
2. The next time you run that same package.json the result will be exactly the same because it is cached.
3. When you change the package.json you've invalidated the cache and now the yarn.lock will be recalculated.

What we're talking about right now is getting rid of #3 and instead treating yarn.lock as if it has not been invalidated by the changed package.json. Which would be really weird for a memoize function to have and would be a really weird behavior for Yarn to have.

What happens to a package in terms of commits and new versions should be irrelevant (if we have a bug with git commits then we should fix that separately, but it is unrelated to this issue).

It's more complex than I've made it out to be (each package version gets effectively "memoized" individually, changing the version of one package doesn't invalidate the rest), but hopefully now everyone gets the point.
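A minimal sketch of that per-package memoization, assuming a simplified lockfile that maps each name@range pattern to one resolved version. The types, the syncLockfile function and the fake registry data are hypothetical stand-ins for illustration, not Yarn's internals.

```typescript
// Hypothetical, simplified shapes; not Yarn's real data model.
type Manifest = Record<string, string>; // dependency name -> semver range
type Lockfile = Record<string, string>; // "name@range" -> resolved version

// Stand-in for a registry lookup: newest version satisfying a pattern.
type Resolver = (depName: string, range: string) => string;

function syncLockfile(manifest: Manifest, previous: Lockfile, resolve: Resolver): Lockfile {
  const next: Lockfile = {};
  for (const depName of Object.keys(manifest)) {
    const range = manifest[depName];
    const key = `${depName}@${range}`;
    // Cache hit: the pattern is unchanged, so keep the locked version even
    // if the registry now offers something newer. Cache miss: resolve it.
    next[key] = key in previous ? previous[key] : resolve(depName, range);
  }
  return next;
}

// Usage with made-up registry data.
const latestVersions: Record<string, string> = { lodash: "4.17.4", "left-pad": "1.1.3" };
const resolveLatest: Resolver = (depName) => latestVersions[depName];

const lockedBefore: Lockfile = { "lodash@^4.17.1": "4.17.2" };
const lockedAfter = syncLockfile(
  { lodash: "^4.17.1", "left-pad": "^1.0.0" },
  lockedBefore,
  resolveLatest
);
// lodash stays at 4.17.2; only the newly added left-pad pattern is resolved.
```

Because the cache key is the whole pattern, editing one dependency's range invalidates only that entry and leaves the others untouched.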

bharley commented 8 years ago

@thejameskyle: For the sake of clarity (and curiosity), let's say I have a project with a yarn.lock file, and someone pulls down the repository. Without running yarn install or npm install, this person adds a new dependency to the package.json file and then runs yarn install. Will the existing yarn.lock file be completely disregarded in this case?

wycats commented 8 years ago

There's a bunch of different things going on here that I wanted to try to unravel (no pun intended).

First, people have raised a number of different requirements that I think are uncontroversial (and which make some of the existing behaviors bugs, which I'll get to soon).

From the original bug report.

Consistency gets lost when lockfile is modified depending on environment (version of yarn currently installed). What is the expected behavior? Not write yarn.lock or package.json when doing yarn install. To update yarn.lock use yarn upgrade

To be more precise, the expected semantics, in my opinion, are:

if package.json has not changed since the last time yarn.lock changed, yarn.lock is the source of truth and should not be updated.
if package.json has changed since the last time yarn.lock changed, update yarn.lock so it satisfies package.json and update node_modules.
if yarn upgrade is run, re-resolve all dependencies and get the latest version of everything that satisfies the package.json.

This means that when a repository is first cloned on a machine, if the yarn.lock was checked in, yarn should always treat it as the source of truth, not generate updates to the yarn.lock, and jump directly to the fetch step.

To the extent that this is not the current behavior of yarn, I believe that would be a bug.

@esphen wrote:

I agree. There should be a discussion on why yarn install writes a lockfile by default in the first place, as it seems to be at odds with the entire lockfile concept. Why have a lockfile if it is not locking versions by default?

I think what this is trying to say is that yarn should not write out a new lockfile if the existing one is still up to date. I agree with that.

I agree with @bestander's opinion that only mutating actions should update the lockfile by default, i.e. add/upgrade/remove.

The main niggle here is whether a change to the package.json should cause the yarn.lock to become updated. In my opinion, if the change to the package.json is not satisfied by the yarn.lock, it must update the yarn.lock.

An important invariant of lockfile systems like yarn is that, using the normal workflow, developers can be sure that the packages that actually get used when they run their app match the versions specified in their package.json. If the package.json is allowed to go out of sync with the yarn.lock, this will not be true, and the only way to know that will be for human readers to carefully read the yarn.lock.

The best way for most users to think about the lockfile is that it's an artifact of running yarn that represents the precise versions of all packages that were used for the current package.json. By checking it in, other collaborators, CI, and production code are assured to use those same versions.

@Guuz said:

So, just to check that I understand this correctly: yarn installs all dependencies and also modifies the lockfile. On a CI server you should use yarn install --pure-lockfile?

This question echoes a sentiment a few people have expressed in this thread.

Cargo has a --locked flag which, translated to Yarn's terms, says "if the package.json is not satisfied by the yarn.lock, it's a hard error". Bundler has a similar flag (--frozen), which was added when Heroku adopted Bundler, to give people a hard error if they made local changes to their Gemfile and forgot to check in the Gemfile.lock.

The idea is that during your normal development, you would like to be able to make changes to the package.json and have yarn ensure that the yarn.lock stays in sync (again, to ensure that the versions specified in package.json always match what gets used in practice).

But when deploying, it's virtually always an error to have diverged, because it means you made a change to package.json, ran a local yarn command, and forgot to check in yarn.lock. This means that the versions in your package.json do not match the actual versions used when the application is run, which we said violates a fundamental invariant of yarn.
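A rough sketch of the check such a flag would perform, assuming (as a simplification) that the lockfile is a map keyed by name@range patterns. The function names and the locked parameter are hypothetical; this is not an existing Yarn API.

```typescript
// Hypothetical, simplified shapes; not Yarn's real data model.
type Manifest = Record<string, string>; // dependency name -> semver range
type Lockfile = Record<string, string>; // "name@range" -> resolved version
type InstallPlan = "use-lockfile" | "update-lockfile";

// Patterns requested by package.json that have no entry in the lockfile.
function missingPatterns(manifest: Manifest, lockfile: Lockfile): string[] {
  return Object.keys(manifest)
    .map((depName) => `${depName}@${manifest[depName]}`)
    .filter((pattern) => !(pattern in lockfile));
}

// `locked` mimics the proposed CI mode: divergence is an error, not a rewrite.
function planInstall(manifest: Manifest, lockfile: Lockfile, locked: boolean): InstallPlan {
  const missing = missingPatterns(manifest, lockfile);
  if (missing.length === 0) {
    return "use-lockfile"; // lockfile is the source of truth; go straight to fetch
  }
  if (locked) {
    throw new Error(`yarn.lock does not satisfy package.json: ${missing.join(", ")}`);
  }
  return "update-lockfile"; // normal development: keep the lockfile in sync
}

// Usage: a developer added "bar" to package.json but did not commit yarn.lock.
const committedLock: Lockfile = { "foo@^1.0.0": "1.0.0" };
const editedManifest: Manifest = { foo: "^1.0.0", bar: "^2.0.0" };

const devPlan = planInstall(editedManifest, committedLock, false); // "update-lockfile"
// planInstall(editedManifest, committedLock, true) would throw on CI.
```

Failing fast in the locked mode surfaces the forgotten lockfile commit at deploy time instead of silently installing versions nobody reviewed.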

@esphen said:

In my opinion, one of the roles a package manager fills is make it as easy as possible to get started with developing on a project. A simple yarn install should get all the packages you need to start developing, without any confusion involved.

I think this is uncontroversial.

With npm I've had many instances of developers joining a project, only to find that the project does not work on their machine. These instances have occurred due to transitive dependencies bumping versions to versions with breaking changes or simply not following semver. I had hoped yarn would solve these issues, but if the takeaway is that all developers on a project should run yarn install --pure-lockfile to be 100% sure that the project is going to build, then that is not the case.

Running yarn install --pure-lockfile will mean that the lockfile will be respected even if versions inside the lockfile conflict with versions specified in the package.json. This should only arise in the first place if a developer forgets to check in their yarn.lock after making changes to the package.json.

Another role of a package manager is giving projects control of their dependencies. If it is made pure by default, developers are able to have a look at yarn outdated to see the outdated versions and then review the change notes, avoiding any breaking changes. This would give developers full control to only bump versions in a given release timeframe instead of banning developers from doing git commit -a to avoid accidental lockfile commits.

If the package.json hasn't changed, in my opinion it's a bug if yarn.lock is getting updated. At least one case of the bug seems to be in the original report:

lockfile is modified depending on environment (version of yarn currently installed).

I think this is a mistake and should be corrected.

Later in the thread, @thejameskyle said:

Imagine a memoize function where the input is a package.json and the output is the yarn.lock.

That's exactly the right mental model, in my view ("yarn.lock can change if and only if package.json changed"), and if the abstraction leaks we should fix it.

@adamchainz said:

More info on the above: our build has coffeescript from Github as a subdependency. coffeescript pushed some commits and we got a modified yarn.lock in our build process from running just yarn install

and later:

But it does have the bad behaviour if a dependency is being installed from github, as I reported above

The problem here is that yarn doesn't treat the git sha as part of the locked version of git dependencies. Cargo and Bundler both have the concept of a "precise" version which is serialized to the lockfile; for git sources, the "precise" version is the SHA. Then, when you make a fresh clone with just a package.json and yarn.lock and run yarn, all of the information needed to get precisely the code you need is there.

I must confess that I missed this interaction when reviewing the original git code; there is some SHA tracking in the code, but yarn install doesn't ensure that the hydrated dependency graph respects it.
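To illustrate the "precise version" idea, here is a small sketch under stated assumptions: the GitLockEntry shape, the lockGitDependency function and the lookup callback are hypothetical, the "master" ref is assumed, and none of this is Yarn's actual lockfile format. The two SHAs are the ones from the diff posted earlier in the thread.

```typescript
// Hypothetical shape of a locked git dependency.
interface GitLockEntry {
  repo: string; // e.g. "jashkenas/coffeescript"
  ref: string; // branch or tag requested in package.json
  commit: string; // the "precise" version: a full commit SHA
}

// Stand-in for asking the remote which commit a ref currently points to.
type RefLookup = (repo: string, ref: string) => string;

function lockGitDependency(
  repo: string,
  ref: string,
  existing: GitLockEntry | undefined,
  lookup: RefLookup
): GitLockEntry {
  // If the request is unchanged, reuse the locked SHA instead of asking the
  // remote again, so new upstream commits do not leak into a plain install.
  if (existing !== undefined && existing.repo === repo && existing.ref === ref) {
    return existing;
  }
  return { repo, ref, commit: lookup(repo, ref) };
}

// Usage: the lockfile recorded the older commit; upstream has since moved on.
const lockedEntry: GitLockEntry = {
  repo: "jashkenas/coffeescript",
  ref: "master", // assumed default branch, for illustration only
  commit: "887052de079b2f0af9f080031a00bb7544eaca08",
};
const remoteHead: RefLookup = () => "0d132318ce8f7116a436d97db1f2a5c8b1dedf28";

const reinstalled = lockGitDependency("jashkenas/coffeescript", "master", lockedEntry, remoteHead);
// reinstalled.commit is still the locked 887052de... SHA.
const firstInstall = lockGitDependency("jashkenas/coffeescript", "master", undefined, remoteHead);
// firstInstall.commit is the current remote head, 0d132318...
```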

TL;DR

I agree with @thejameskyle and @kittens that yarn.lock should be kept in sync with package.json automatically, because I believe that users should be able to assume that versions specified in their package.json line up with what is used when their app is executed.

However, there appear to be a few bugs that are causing inappropriate churn in the yarn.lock even when the package.json has not changed:

changes to the yarn version across machines update the lockfile
git dependencies get updated even if package.json hasn't updated, which then updates the lockfile

We should also consider something like Cargo's --locked flag, which you can use in CI to fast-fail the build if a developer updates the package.json and forgets to check in the updated yarn.lock.

Pauan commented 8 years ago

@thejameskyle Thanks! :heart: I agree with you and @kittens that yarn.lock should be updated after changing package.json

@wycats A very thorough and insightful post as usual. :+1: I agree with you, and I also like the idea of a --locked flag (or similar). We should create a new issue about that.

adamchainz commented 8 years ago

Made #1568 to track the git SHA issue

bestander commented 8 years ago

@wycats, thanks for the unraveling, very insightful overview!

This means that when a repository is first cloned on a machine, if the yarn.lock was checked in, yarn should always treat it as the source of truth, not generate updates to the yarn.lock, and jump directly to the fetch step. To the extent that this is not the current behavior of yarn, I believe that would be a bug.

That is exactly the scenario that led to this issue being opened. We have a few active versions of Yarn in the company and at our scale I don't think we will be able to make atomic updates everywhere. Builds on yarn 0.13, 0.14 and 0.15 introduced slight variations in yarn.lock files even though package.json was in sync. This caused a few issues; for example, Buck builds were slowed down because changes in the source tree invalidate caches. This cost me and a couple of teams a few hours of work.

@thejameskyle, thanks for sharing your opinion. I did not consider the scenario of package.json being out of sync with yarn.lock, to be fair. And you have a valid point.

However as @wycats pointed out, the original bug report is valid. Fixing this is important to have valid builds and I will reopen the issue with the intent to come up with a solution that satisfies all interested parties.

sebmck commented 8 years ago

@wycats

To be more precise, the expected semantics, in my opinion, are:

if package.json has not changed since the last time yarn.lock changed, yarn.lock is the source of truth and should not be updated.

if package.json has changed since the last time yarn.lock changed, update yarn.lock so it satisfies package.json and update node_modules.

if yarn upgrade is run, re-resolve all dependencies and get the latest version of everything that satisfies the package.json.

This means that when a repository is first cloned on a machine, if the yarn.lock was checked in, yarn should always treat it as the source of truth, not generate updates to the yarn.lock, and jump directly to the fetch step.

These are the semantics we follow that I added in #364.

@bestander You were involved in the PR (#364) that put these heuristics in to place. What additional changes are you proposing?

sebmck commented 8 years ago

This issue is extremely broad and we've already agreed that --pure-lockfile won't be the default and we'll follow the heuristics outlined by @wycats. If this issue is to remain open then the title needs to reflect the current issue with this behaviour.

bestander commented 8 years ago

@kittens sounds good, I'll update the issue. Or maybe I should open a new one related to install changing the lockfile when package.json did not change

jamiebuilds commented 8 years ago

Can we move to a new issue? The comments here can just be preserved as an archive

bestander commented 8 years ago

Sounds good, @thejameskyle, I'll create a new issue today and link here

bestander commented 8 years ago

Created the new focused issue https://github.com/yarnpkg/yarn/issues/1576

benjamine commented 7 years ago

It would be interesting to have an option to make yarn install fail if a package in package.json is not in yarn.lock, i.e. fail if any package is not locked

CrabDude commented 7 years ago

Adding clarification that was still ambiguous to me after reading the above:

tldr; When running yarn install, a dependency's yarn.lock entry is altered only when its associated package.json entry is altered. Unrelated changes in package.json will not update a package to the latest version compatible with its unchanged package.json semver.

Based on some of the wording above, it sounded like yarn.lock was cached keyed on a package.json hash, and thus it sounded like yarn.lock would be written to (updated / cache invalidated) on any change to package.json, which would be problematic since an unrelated change (i.e., update to "description" or another dependency) might cause that dependency's yarn.lock version to be updated to a newer version within the same existing package.json semver.

However, I verified that a package's yarn.lock entry is only written to when its corresponding package.json semver is updated (even if the new semver is compatible with the existing yarn.lock version, and consequently would not otherwise necessitate a version update).

For example,

Say yarn add lodash@^4.17.1 installs lodash@4.17.2
Later, lodash@4.17.4 is available.
yarn will continue to install lodash@4.17.2
Unless/Until lodash's version is altered in package.json (or yarn add/upgrade/remove is run specifically against lodash).

Breadcrumb #1576

bestander commented 7 years ago

BTW, if you are willing to contribute to the docs with small articles like this, that would be great for the community. The core team is busy fixing issues and adding new features, so it is expected and appreciated if the community helps keep the documentation up to date

rarkins commented 7 years ago

@CrabDude thanks for sharing your clarification.

Do you mean - in your above example - that only lodash and its own dependencies will have their lock versions updated in yarn.lock? e.g. even if another dependency could have a new lock version then it won't get updated at the same time?

Or a second example: Let's say a yarn.lock is severely outdated, and the user runs yarn add to add a new dependency to package.json. Will all the other outdated packages now be updated in yarn.lock, or will they remain the same?

CrabDude commented 7 years ago

@rarkins

Do you mean - in your above example - that only lodash and its own dependencies will have their lock versions updated in yarn.lock?

Yes. This seems to be upheld in my example.

Will all the other outdated packages now be updated in yarn.lock, or will they remain the same?

It would seem the non-lodash dependency trees / lock entries of packages would not be updated; only lodash's sub-dependencies would be.

From my perspective, each of these is both desirable and expected.

jspiro commented 7 years ago

Preface: I love yarn. But it frustrates me to no end.

At my company, yarn install changes the lockfile constantly across different machines (each running the same version), despite never changing package.json. And when we do, we update using yarn add. This is annoying because CI verifies that git status is clean after a build to be sure we didn't forget to do things like check in a lock file, and it frequently changes.

My expectation of yarn was that it would ensure identical node_modules across all machines by default. Not with extra flags. It would prioritize correctness over convenience. If I wanted uncertainty I could use npm directly. When a file changes, it's a signal to me that something has changed it and I should scrutinize it. It shouldn't change.

Questions

Is it being said that despite the lockfile being changed, the contents of node_modules will always be identical to when it was generated? I don't believe this is the case but if it is, then I understand the confusion in this thread -- it would mean that yarn does the right thing despite the appearance that it does not.
When package.json changes, the lockfile is regenerated. Couldn't that unintentionally change a lot of dependencies depending on the state of that particular programmer's node_modules? Yarn should determine a delta and try to preserve existing locks as best as it can (if it doesn't already).
Why does yarn add specify versions in package.json with a ^? Again, I understood yarn's promise was to freeze dependencies.

Related Bugs

When a random package is deleted in node_modules, yarn install says success without reinstalling it. When a lot of them are gone, it reinstalls them. npm is a bit more thorough in this regard.
The lockfile tends to get regenerated if you delete node_modules and do a clean install (which is literally the opposite of what you would expect -- I expect it to install exactly what's in the lockfile and do absolutely nothing else)
If you delete the lockfile without touching package.json or node_modules after a clean install, yarn regenerates it and it's usually very different from the previous version. This is like a compiler producing different code each time you run it despite changing nothing.

Overall, yarn makes installs faster, but seems to fail at its core (in)competency: Freezing versions, consistently, by default. I don't need conveniences that help me get my project started, I need help maintaining it over a huge team over many years. Programmers are smart and intentional, when they want a change, they'll ask explicitly.

The constantly changing lockfile doesn't instill confidence and is a constant hassle. I'd prefer warnings and errors that package.json doesn't match the lockfile, that the lockfile doesn't match node_modules, that a locked version no longer exists, etc., so it stops my builds dead in their tracks and I can make intentional decisions about my dependencies.

bestander commented 7 years ago

@jspiro, thanks for writing this up. There are a few issues raised here. It would be better to open each issue separately, otherwise they will be lost in the comments.

Are you on the latest version of Yarn? As of 0.18-0.19 we don't see modifications to yarn.lock files between machines.

Questions:

Is it being said that despite the lockfile being changed, the contents of node_modules will always be identical to when it was generated? I don't believe this is the case but if it is, then I understand the confusion in this thread -- it would mean that yarn does the right thing despite the appearance that it does not.

Dev and optional dependencies can be left out for the same lockfile. But for the ones that are being installed, except for platform-specific packages, node_modules should have identical packages in identical places.

When package.json changes, the lockfile is regenerated. Couldn't that unintentionally change a lot of dependencies depending on the state of that particular programmer's node_modules? Yarn should determine a delta and try to preserve existing locks as best as it can (if it doesn't already).

That is a nice feature request, would love to see a PR for that.

Why does yarn add specify versions in package.json with a ^? Again, I understood yarn's promise was to freeze dependencies.

That reflects npm's behavior. You can do yarn add left-pad@1.0.1 or yarn add is-array --exact for an exact version. Maybe at some point we should make exact versions the default; this can be a discussion in an RFC.

When a random package is deleted in node_modules, yarn install says success without reinstalling it. When a lot of them are gone, it reinstalls them. npm is a bit more thorough in this regard.

Yarn runs a quick shallow check by default. A deeper check would be slower, but we are working on it; I have an idea for how we could do a quick deep check. You are not supposed to touch files in node_modules, though: verifying each file for modification would result in a very slow install experience. If you want to skip the shallow check, remove the node_modules/.yarn-integrity file before installation. This is not official and is subject to change. The official way is to run yarn install --force, which forces a full install but rewrites yarn.lock as a side effect.

The lockfile tends to get regenerated if you delete node_modules and do a clean install (which is literally the opposite of what you would expect -- I expect it to install exactly what's in the lockfile and do absolutely nothing else)

Haven't seen this for a while. Open an issue and cc me if this can be reproduced.

If you delete the lockfile without touching package or node_modules after a clean install, yarn regenerates it and it's usually very different from the previous version. This is like a compiler producing different code each time you run it despite changing nothing.

After some time, new versions of transitive dependencies might have been released. As a result, the structure of node_modules can change significantly because of the hoisting logic. That works as designed. There is an import command coming: https://github.com/yarnpkg/yarn/pull/2580. It would allow you to generate a lockfile from an existing node_modules.
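To illustrate why a regenerated lockfile can differ even though package.json is unchanged, here is a minimal sketch of caret-range resolution. It is not Yarn's resolver: the function names and the registry contents are made up for illustration, and pre-release versions are ignored. It only assumes the standard semver meaning of `^` (compatible up to the next bump of the left-most non-zero component).

```python
def parse(version):
    """Turn '1.2.3' into (1, 2, 3); pre-release tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def caret_upper_bound(base):
    """Exclusive upper bound of ^base: bump the left-most non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def resolve_caret(range_spec, published):
    """Pick the highest published version satisfying a '^x.y.z' range, or None."""
    base = parse(range_spec.lstrip("^"))
    upper = caret_upper_bound(base)
    matches = [v for v in published if base <= parse(v) < upper]
    return max(matches, key=parse) if matches else None


# Hypothetical registry contents at two points in time.
registry_then = ["1.0.0", "1.0.1"]
registry_now = registry_then + ["1.1.0", "2.0.0"]

locked_then = resolve_caret("^1.0.0", registry_then)  # what the old lockfile pinned
locked_now = resolve_caret("^1.0.0", registry_now)    # what a regenerated lockfile pins
print(locked_then, locked_now)  # prints "1.0.1 1.1.0"
```

The same `^1.0.0` range resolves to a different version once newer releases exist, which is why deleting yarn.lock and reinstalling is not a no-op.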

@jspiro, Yarn is a young, community-driven project; your PRs to make it work better for you are welcome.

FezVrasta commented 7 years ago

Any chance to get at least an option to set the desired default behavior?

bestander commented 7 years ago

We are currently fixing https://github.com/yarnpkg/yarn/issues/3490: sometimes yarn install may cause the lockfile to be optimized, which is not expected behavior. That might be the reason why you are asking for this change; otherwise the yarn.lock file should change only if you make changes to package.json manually.

You can set --pure-lockfile/--frozen-lockfile to true in .yarnrc and it will be appended to the install command by default:

--install.pure-lockfile true

FezVrasta commented 7 years ago

My problem is that if I don't use pure-lockfile I get the wrong versions of the dependencies installed. It's not related to the unwanted yarn.lock changes.

bestander commented 7 years ago

Can you submit an issue with repro steps? We'll sort it out

k0pernikus commented 7 years ago

I was bitten by this as well when a package.json and yarn.lock got out of sync due to a developer mistakenly adding a dependency via npm install --save instead of yarn add.

I disagree though that pure-lockfile should be the default and argue that rather frozen-lockfile should be the default for yarn install.

frozen-lockfile yields an error message if the yarn.lock and package.json are out of sync. It is therefore very helpful on a build machine (e.g. Jenkins), as it will mark those builds as failures, as should be expected.

It's then up to the developer to decide which version to add in the package.json / yarn.lock.

The unfortunate default of yarn install is to just fetch the most current version of the not-yet-locked dependencies and write an updated yarn.lock, which will never be part of the project. That allows future build breaks due to an unexpected version bump. Preventing that is the very reason we have a lockfile to begin with.

The gist should be though:

Only commands like add, remove, and upgrade should mutate the yarn.lock.

install should just do that, namely either install the dependencies in their locked in version or fail if it detects a mismatch between the package.json and yarn.lock. (The only exception being if there's no yarn.lock in the first place. Then, and only then, it may create one, yet it never, ever should touch it again.)
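The rule proposed above can be sketched as a small decision function. This is only an illustration of the proposal, not how Yarn is implemented: `plan_install`, its return values, and the sample data are hypothetical, and only top-level dependencies are checked. The lockfile is modeled as a mapping from `name@range` to the pinned version, mirroring how yarn.lock keys its entries.

```python
def plan_install(manifest, lockfile):
    """Decide what a strict `install` would do.

    manifest: dict of dependency name -> requested range, as in package.json
    lockfile: dict of "name@range" -> pinned version, or None if no yarn.lock exists
    Returns an (action, detail) tuple.
    """
    wanted = {f"{name}@{range_}" for name, range_ in manifest.items()}

    if lockfile is None:
        # The only case in which install is allowed to write a lockfile.
        return ("create-lockfile", sorted(wanted))

    missing = sorted(wanted - set(lockfile))
    if missing:
        # Out of sync: stop and let the developer run add/upgrade deliberately.
        return ("fail", missing)

    # In sync: install exactly the pinned versions and leave yarn.lock untouched.
    return ("install-locked", {key: lockfile[key] for key in sorted(wanted)})


# A dependency was added to package.json without updating yarn.lock,
# e.g. via `npm install --save` instead of `yarn add`.
manifest = {"left-pad": "^1.0.0", "is-array": "^1.0.1"}
lockfile = {"left-pad@^1.0.0": "1.0.1"}
print(plan_install(manifest, lockfile))
```

Under this sketch the out-of-sync case stops the build instead of silently resolving is-array to whatever is newest, which is the behavior argued for here.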
