Open devversion opened 2 years ago
This issue reproduces on master:
Error: expect(received).resolves.toBeTruthy()
Received promise rejected instead of resolved
Rejected to value: [Error: Command failed: /usr/bin/node /github/workspace/scripts/actions/../run-yarn.js install
➤ YN0000: ┌ Resolution step
::group::Resolution step
➤ YN0001: │ Error: my-test-pkg@file:/tmp/tmp-18a6mLA1rPZdRa/my-tar-pkg.tgz::locator=root-workspace-0b6124%40workspace%3A.: Manifest not found
at Function.find (/github/workspace/packages/yarnpkg-core/sources/Manifest.ts:114:13)
at /github/workspace/packages/plugin-file/sources/TarballFileResolver.ts:70:14
at Object.releaseAfterUseAsync (/github/workspace/packages/yarnpkg-core/sources/miscUtils.ts:161:12)
at TarballFileResolver.resolve (/github/workspace/packages/plugin-file/sources/TarballFileResolver.ts:69:22)
at MultiResolver.resolve (/github/workspace/packages/yarnpkg-core/sources/MultiResolver.ts:57:12)
at MultiResolver.resolve (/github/workspace/packages/yarnpkg-core/sources/MultiResolver.ts:57:12)
at /github/workspace/packages/yarnpkg-core/sources/Project.ts:722:18
at Object.prettifyAsyncErrors (/github/workspace/packages/yarnpkg-core/sources/miscUtils.ts:172:12)
at startPackageResolution (/github/workspace/packages/yarnpkg-core/sources/Project.ts:721:29)
::endgroup::
➤ YN0000: └ Completed in 0s 528ms
➤ YN0000: Failed with errors in 0s 541ms
]
at expect (/github/workspace/.yarn/cache/expect-npm-24.8.0-8c7640c562-0c0da74930.zip/node_modules/expect/build/index.js:138:15)
at module.exports (evalmachine.<anonymous>:15:7)
at async /github/workspace/.yarn/cache/@arcanis-sherlock-npm-2.0.3-558f52b79f-ccfa417b92.zip/node_modules/@arcanis/sherlock/lib/executeRepro.js:57:13
at async executeInTempDirectory (/github/workspace/.yarn/cache/@arcanis-sherlock-npm-2.0.3-558f52b79f-ccfa417b92.zip/node_modules/@arcanis/sherlock/lib/executeRepro.js:18:16)
at async executeRepro (/github/workspace/.yarn/cache/@arcanis-sherlock-npm-2.0.3-558f52b79f-ccfa417b92.zip/node_modules/@arcanis/sherlock/lib/executeRepro.js:25:12)
at async ExecCommand.execute (/github/workspace/.yarn/cache/@arcanis-sherlock-npm-2.0.3-558f52b79f-ccfa417b92.zip/node_modules/@arcanis/sherlock/lib/commands/exec.js:26:38)
at async ExecCommand.validateAndExecute (/github/workspace/.yarn/cache/clipanion-npm-2.0.0-rc.16-b9444aaf89-91cf93ba72.zip/node_modules/clipanion/lib/advanced/Command.js:161:26)
at async Cli.run (/github/workspace/.yarn/cache/clipanion-npm-2.0.0-rc.16-b9444aaf89-91cf93ba72.zip/node_modules/clipanion/lib/advanced/Cli.js:74:24)
at async Cli.runExit (/github/workspace/.yarn/cache/clipanion-npm-2.0.0-rc.16-b9444aaf89-91cf93ba72.zip/node_modules/clipanion/lib/advanced/Cli.js:83:28)
Out of curiosity, what tool generates tarballs like this? It sounds a little invalid
@arcanis It's really just the standard GNU tar
command (see example above) that will generate such archive members, depending on whether you pass it absolute filenames, or relative like this (which can happen if you use find
together or something).
Generally, these are valid entry/archive member names for tar files, it's just that for tar component stripping, normalization which would collapse ./blub
to blub
is unexpected. Libraries like tar-fs
does not do this by default either, so this seems like the ecosystem standard with respect to tarballs. Yarn 1.x did not break for such tarballs in that case, especially not with a rather confusing error message as in this issue (see above; it's not clear why this happens and took me quite some time to debug).
Really my case as a concrete example: We are using Bazel as build tool for Angular, and tarballs generated with Bazel through rules_pkg
currently come with ./
as the first component, breaking our integration testing to prepare for Yarn berry/PnP. It's clear that we could somehow tweak the package setup to use something else as first component (like package/
), but the mismatch for tarballs in Yarn 1.x and Yarn 2.x came as a surprise (with a hard-to-debug message)
My problem with that is that paths inside tar archives are paths, so normalizing them shouldn't make a difference 🤔
I understand that it worked in Yarn Classic (and perhaps other package managers), but it sounds like a bug, and I'm not sure I want to follow this behaviour. I myself had to implement a few security-related fixes in Yarn Classic due to how tarball paths could lead to escapes during installs, and the more broken assumptions we have the more likely it is that we leave open an attack surface that can be exploited somehow.
Really my case as a concrete example: We are using Bazel as build tool for Angular, and tarballs generated with Bazel through rules_pkg currently come with ./ as the first component, breaking our integration testing to prepare for Yarn berry/PnP. It's clear that we could somehow tweak the package setup to use something else as first component (like package/), but the mismatch for tarballs in Yarn 1.x and Yarn 2.x came as a surprise (with a hard-to-debug message)
I'd prefer if rules_pkg
was following a more "standard" tarball pattern. If you call tar
manually to generate your tarballs, then it'd be probably for the best to follow the same layout as yarn pack
would generate (although I'm curious - why not call yarn pack
directly?).
My problem with that is that paths inside tar archives are paths, so normalizing them shouldn't make a difference 🤔
Hmm. Why aren't paths not normalized as part of the creation then? It surely makes a difference for the stripping mechanism. The normalization does not occur in other tar
implementations (in creation neither in extraction). That seems like the convention across implementations (including the GNU tar implementation itself). I haven't found anything in the GNU tar spec that elaborates much on the ./<>
case, neither it being necessarily talking about paths, but rather about arbitrary archive member names.
I do acknowledge the security aspect you mention, but that one can be easily solved by normalizing paths just for the jail mechanism, as a side note.
I'm not sure it's a bug for tars because you could pretty easily end up having such archive members, given my basic GNU tar example for Sherlock. The error from Yarn is correct in terms of not finding a manifest but that is still super confusing if you pass a valid tar archive that works in Yarn 1.x, NPM etc. - I personally don't see any strong reason to deviate there, especially with the --strip-components
mechanism (working with ./
) being well established in the ecosystem as it seems?
I'd prefer if rules_pkg was following a more "standard" tarball pattern.
Is it really a standard pattern to not have .
as a leading component? I think that will be hard to answer (as said it's fairly straightforward to end up with ./<XX>
via command line), but generally I agree that the "member convention" from yarn pack
would be obviously better for Yarn. Invoking Yarn from Bazel has the consequence of increasing the build graph a little more than invoking just the system tar
/ or just rules_pkg
(although that is not a good argument; conceptually the deviation in Yarn berry seems more surprising; that's why I submitted the issue)
Thanks for taking the time to look into this issue though, regardless of the outcome!
Lost a prior answer 😅
Invoking Yarn from Bazel has the consequence of increasing the build graph a little more than invoking just the system
tar
/ or justrules_pkg
(although that is not a good argument; conceptually the deviation in Yarn berry seems more surprising; that's why I submitted the issue)
Yes, but yarn pack
(and to a lesser extend similar commands from other package managers, like npm pack
) does more than just call tar
. It also processes the .gitignore
/ files
field, it replaces the main
/ module
/ exports
fields according to the definition in publishConfig
, it validates the consistency of the project†, etc. And of course it guarantees that the output can be installed when referenced via the file:
protocol, making it the most portable thing to use as there's otherwise no formal spec on what a packed package should look like.
† It doesn't do that yet, but I'd like to implement it in the next major
The error from Yarn is correct in terms of not finding a manifest but that is still super confusing if you pass a valid tar archive that works in Yarn 1.x, NPM etc. - I personally don't see any strong reason to deviate there, especially with the
--strip-components
mechanism (working with./
) being well established in the ecosystem as it seems?
It's mostly just a matter of risk assessment. I've seen firsthand the kind of bugs and security risks that this type of "accidentally working" unplanned behaviours can cause in older package managers, and for modern Yarn releases I tend to reevaluate them on their own to see if they're truly necessary. Sometimes they are, sometimes they can be fixed/changed easily enough that dropping them is perhaps a good idea to keep Yarn stable. I have a feeling it's the case here 🤔
Yes, but yarn pack (and to a lesser extend similar commands from other package managers, like npm pack) does more than just call tar. It also processes the .gitignore / files field, it replaces the main / module / exports fields according to the definition in publishConfig, it validates the consistency of the project†, etc. And of course it guarantees that the output can be installed when referenced via the file: protocol, making it the most portable thing to use as there's otherwise no formal spec on what a packed package should look like.
That makes sense. I'd argue that all of these additional things are not strictly necessary depending on the scenario, but generally I'd agree that it definitely makes sense to prefer yarn pack
. It's also fair to say that only the output from yarn pack
is compatible with Yarn's implementation of the file:*.tgz
protocol.
I don't believe there would be any security impact if the security checks would remain conceptually the same, but just the member extraction would match with npm:tar
, npm:tar-fs
, gnu:tar
etc. Definitely do-able, but as said, it's also fair to just say that yarn pack
is the accepted here. I'm good with that! It's just a little unfortunate to see this subtle difference in Yarn 1.x/v2 and with other package managers (we will have integration testing for)
Generally I think the best, given your points, would be to document the requirements of file:
a little more, and also potentially have a warning for such archive members (because the error message itself is insufficient IMO). What do you think?
Self-service
Describe the bug
Yarn Berry/v3 does not properly deal with tarballs generated without
yarn pack
, that previously worked as expected with Yarn 1.x classic.Assume we have a tar archive with entries like this:
Yarn 1.x always strips the first component away. Similarly Yarn berry attempts this but fails due to some normalization logic applied to deal with its portable paths as it seems. The result is that before stripping the first component, Yarn berry accidentally normalizes the path to just
package.json
and therefore the entry is being omitted as it will be considered a "first component".To reproduce
Environment
Additional context
No response