yarnpkg / berry

📦🐈 Active development trunk for Yarn ⚒
https://yarnpkg.com
BSD 2-Clause "Simplified" License
7.25k stars 1.08k forks source link

[Bug]: Yarn does not dedupe in-range compatible versions of dependency in child workspace #5948

Open trusktr opened 8 months ago

trusktr commented 8 months ago

Self-service

Describe the bug

I have multiple packages that depend on solid-js in a child workspace. After yarn install at the top of the root, I see the following inside a child workspace:

❯ find . | grep -E "solid-js$"
./node_modules/solid-js
./node_modules/@lume/variable/node_modules/solid-js
./node_modules/@lume/element/node_modules/solid-js

The child workspace has "solid-js": "1.4.x" in package.json, while the dependencies that depend on solid-js have "solid-js": "^1.0.0" in their package.json files, which means that all three version ranges for solid-js are intersecting.

Because the ranges intersect, I expect there to be a single version of solid-js inside node_modules of the child workspace.

Here are the versions I see get installed:

❯ find . | grep -E "solid-js$" | xargs -I F sh -c 'cat F/package.json | grep version'
  "version": "1.4.8",
  "version": "1.8.5",
  "version": "1.8.5",

Because of this issue, I tried to see if there's a way I can fix it using resolutions, with no luck.

I tried putting this in the child workspace:

    "resolutions": {
        "solid-js": "1.4.x"
    },

as well as

    "resolutions": {
        "solid-js": "1.4.8"
    },

but either way the end result is the same and the issue is not fixed.

To reproduce

git clone git@github.com:lume/lume.git
cd lume
git checkout 11f5894239959d021046fe158a92284559c1d7cb

For reference, here's that commit in GitHub UI: https://github.com/lume/lume/tree/11f5894239959d021046fe158a92284559c1d7cb

Then run this to set up git submodules and bootstrap with yarn:

npm run fresh

Finally observe the installed solid-js versions:

cd apps/website/
find . | grep -E "solid-js$" | xargs -I F sh -c 'cat F/package.json | grep version'

I used the find command because neither yarn ls solid-js or npm ls solid-js show the installed versions of solid-js.

See apps/website/package.json, which has the resolutions field in it (even if resolutions worked it shouldn't be needed)

Environment

System:
    OS: macOS 13.4
    CPU: (8) arm64 Apple M2
  Binaries:
    Node: 20.6.1 - /private/var/folders/7y/wy4mhdj114g5xj1ktvh6hdz80000gn/T/xfs-5221ff21/node
    Yarn: 4.0.0 - /private/var/folders/7y/wy4mhdj114g5xj1ktvh6hdz80000gn/T/xfs-5221ff21/yarn
    npm: 10.2.0 - ~/.npm-packages/bin/npm
    pnpm: 7.24.3 - ~/.npm-packages/bin/pnpm

Additional context

clemyan commented 8 months ago

Yarn does not dedupe implicitly, because that could update transitive deps and break things. Instead, dedupe must be done explicitly with the yarn dedupe command. But that does not affect your case either. Let me explain.

A fundamental design choice of Yarn is that within the same project (i.e. across all workspaces and transitive dependencies), the same descriptor (e.g. solid-js@^1.0.0) must resolve to the same reference (e.g. solid-js@npm:1.8.5). In your case, solid-js@^1.0.0 is resolved to solid-js@npm:1.8.5 already because it is used elsewhere in your project. Specifically, via the first-person-shooter workspace, which you can determine with the yarn why command

> yarn why -R solid-js

--- snip ---

├─ first-person-shooter@workspace:apps/first-person-shooter
│  ├─ classy-solid@npm:0.1.2 (via npm:0.1.2)
│  │  └─ solid-js@npm:1.8.5 (via npm:^1.3.13)
│  ├─ lume@npm:0.3.0-alpha.10 (via npm:0.3.0-alpha.10)
│  │  └─ @lume/element@npm:0.6.5 [f5eb3] (via npm:^0.6.0 [f5eb3])
│  │     ├─ @lume/variable@npm:0.7.3 (via npm:^0.7.1)
│  │     │  └─ solid-js@npm:1.8.5 (via npm:^1.0.0)
│  │     └─ solid-js@npm:1.8.5 (via npm:^1.0.0)
│  ├─ solid-js@npm:1.3.14 (via npm:1.3.14)
│  └─ @lume/element@npm:0.6.5 [f5eb3] (via npm:^0.6.0 [f5eb3])

--- snip ---

└─ website@workspace:apps/website
   ├─ lume@npm:0.3.0-alpha.9 (via npm:0.3.0-alpha.9)
   │  └─ @lume/element@npm:0.6.5 [f5eb3] (via npm:^0.6.0 [f5eb3])
   ├─ solid-js@npm:1.4.8 (via npm:1.4.x)
   └─ @lume/element@npm:0.6.5 [f5eb3] (via npm:^0.6.0 [f5eb3])

If you are fine with downgrading all instances of solid-js@^1.0.0 in the project then you can do it with yarn set resolution solid-js@npm:^1.0.0 npm:1.4.8

This is also why your feature request is kinda clunky. A resolution in a workspace package.json either

trusktr commented 8 months ago

I understand the premise and appreciate its value, but I think there might still be a bug (or missing feature) here because:

With these data points, there does not seem to be a reason to avoid deduping packages within website on its own, and within first-person-shooter on its own, just like would happen in standalone projects installed with npm install.

I'm not aware of how yarn.lock works, but maybe it needs to be able to hold a little more information to allow workspaces that are essentially-standalone projects to be deduped on their own automatically.

I think this is very expected by anyone who will encounter this, and people in such cases won't be able to intuitively reason as to why in-range versions are not deduped apart from "Yarn just doesn't dedupe". I.e. there's no real intuitive reason why the packages need to be duped and why manual deduping has to be performed.

Without deduping whenever possible, there's also the risk that projects that depend on singletons (front-end reactive projects, reactive state libs, etc) will simply break, which causes the requirement for manual intervention seem even less of a good out-of-the-box behavior.

I reported this as "bug" because its fairly standard behavior for npm, and in many ways people expect alternative tools to follow some basic semver conventions that npm defined.

I've verified that npm automatically dedupes (but also undesirably hoists) solid-js in the OP example by replacing the reproduction step

npm run fresh

with

rm -rf packages/docsifyjs+docsify yarn.lock && npm i

With Yarn, even if I first delete yarn.lock before yarn install it still does not dedupe.

Over in

various people are stumbling on npm's lack of hoisting control now that Lerna 7+ no longer installs and links workspaces and instead directs people to use npm, yarn, or pnpm for that purpose. Those people that are starting to trickle over from Lerna may hit this issue next.

Should this be converted into an enhancement ticket?

arcanis commented 8 months ago

Should this be converted into an enhancement ticket?

No, this is a core behavior of Yarn that's not going to change as a default. New installs are additive, not destructive. Running yarn install will add the new ranges that were missing from your lockfile, but it won't change those which are already there. It's by design, and has worked successfully for the last 7 years.

If you wish to dedupe your project you can run yarn dedupe, which is here for that. At best we could think of an option that would make Yarn automatically run it post-install, but that'd not be the default.

Additionally, note that dedupe may come in multiple strategies, some more destructive than others, so we definitely want the users to be explicit about what level of risk they can tolerate.

clemyan commented 8 months ago

You understand apps/website and apps/first-person-shooter are two workspaces of the same project and both of them transitively depends on solid-js@npm:^1.0.0, right? Then you can see these two are mutually exclusive:

If we accept that the "same resolution" constraint is valuable and should not be changed, then solid-js@npm:^1.0.0 must resolve to the same reference in both apps/website and apps/first-person-shooter. It can be npm:1.4.18, or npm:1.8.5 or even npm:2.0.0 if you like (you can force it via resolutions or yarn set resolution), but they cannot differ. In Yarn, resolution is a project-wide operation.

Without deduping whenever possible, there's also the risk that projects that depend on singletons (front-end reactive projects, reactive state libs, etc) will simply break, which causes the requirement for manual intervention seem even less of a good out-of-the-box behavior.

If a package depends on singleton packages, it should declare peerDependencies instead of dependencies. The semantics of "dependencies": { "react": "^18.0.0" } only guarantees you will get some instance of react in the 18.x line, while the semantics "peerDependencies": { "react": "^18.0.0" } guarantees that if your parent can resolve react, you will resolve to the same instance of react.

npm can also duplicate packages, but it is much less predictable when and how it happens. Simply upgrading a direct dependency of your project can cause previously deduplicated packages anywhere in the dependency tree to duplicate.

Deduping whenever possible will also break some projects because adding a direct dependency can also upgrade or downgrade packages anywhere in the dependency tree.

The takeaway is that, if packages do not properly declare their dependencies, then deduping implicitly and deduping only explicitly both causes problems. But deduping only explicitly is less problematic because you know what and how it is happening.

its fairly standard behavior for npm

Not saying it is npm's fault -- if everyone follows semver, properly declares their dependencies, and not rely on deduplicated packages when they were never guaranteed, etc. then those problems should not happen even in npm. Yarn just values stability and predictability more than npm within the confines of the semantic guarantees.

By your logic, not having hoisting controls is also a standard behavior of npm so yarn and pnpm should not have those either.

some basic semver conventions that npm defined

The semantic guarantees are like I said above -- that "dependencies": { "react": "^18.0.0" } only guarantees you will get some instance of react in the 18.x line. As far as I know there has never been guarantees/specs regarding deduplication behavior.

trusktr commented 7 months ago

I get it. You're saying it works the way it works. I'm just saying just because it worked that way for 7 years doesn't make it the best way.

I want to rely on semver versioning benefits, and yarn prevents that out of the box (but now I know, so I have to make sure I run dedupe manually to avoid surprises).

trusktr commented 7 months ago

I think there's definitely a bug. There's absolutely no reason for solid-js to be duplicated within the apps/website/ folder (that I'm aware of), and yarn dedupe does not fix it.

Reproduction:

git clone git@github.com:lume/lume.git
cd lume
git checkout 11f5894239959d021046fe158a92284559c1d7cb
npm run fresh # this runs yarn install after cloning git submodules
corepack yarn dedupe # expect this to do the right thing
cd apps/website/
find . | grep -E "solid-js$" | xargs -I F sh -c 'cat F/package.json | grep version'

The output still shows non-deduped solid-js packages.

This should be deduping because:

Even with yarn dedupe (at the top level of the super repo) the solid-js packages inside of apps/website/ are not deduped, and they clearly can be.

apps/first-person-shooter/ has nothing to do with apps/website/ and should not influence how solid-js is deduped inside of apps/website/.

This is what I described above, but now I've described it in a different way.

I'm fairly sure this is a bug (unless there's something inside of apps/website/ we have overlooked, and again there's absolutely no reason anything in apps/first-person-shooter/ should be dictating why apps/website/ packages are not being deduped).

arcanis commented 7 months ago
  • The solid-js version range in apps/website/node_modules/@lume/variable/package.json is ^1.0.0
  • The solid-js version range in apps/website/node_modules/@lume/variable/package.json is ^1.0.0

Which resolves to 1.8.5:

https://github.com/lume/lume/blob/develop/yarn.lock#L24771-L24772

As mentioned in the yarn dedupe documentation (emphasis mine):

This command dedupes dependencies in the current project using different strategies (only one is implemented at the moment):

  • highest: Reuses (where possible) the locators with the highest versions. This means that dependencies can only be upgraded, never downgraded. It's also guaranteed that it never takes more than a single pass to dedupe the entire dependency tree.

By default dedupe never downgrades, it only ever upgrades. In general that's not a problem, except when ^ and ~ ranges are mixed together (which is effectively the case with 1.4.x, which is synonymous with ~1.4).

Ideally we would have a yarn dedupe mode that allows downgrading, but that isn't implemented at the moment.