yarnpkg / yarn

The 1.x line is frozen - features and bugfixes now happen on https://github.com/yarnpkg/berry
https://classic.yarnpkg.com

yarn audit --json produces large amounts of data #7404

Open nishils opened 5 years ago

nishils commented 5 years ago

Do you want to request a feature or report a bug? Bug

What is the current behavior? When running yarn audit against a couple of different projects and saving the output to a file, the resulting file is around 30 MB. When running against the same project with the --json flag, the file is around 19 GB. A factor of 10x or even up to 100x seems reasonable, but ~633x seems like a bit much.
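For reference, the ~633x figure follows directly from the two reported sizes. A quick check, assuming decimal units (the report only says "30 MB" and "19 GB", so the exact byte counts are not known):

```python
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
MB = 10**6
GB = 10**9

plain_bytes = 30 * MB   # approximate size of `yarn audit` output
json_bytes = 19 * GB    # approximate size of `yarn audit --json` output

ratio = json_bytes / plain_bytes
print(f"--json output is roughly {ratio:.0f}x larger")
```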

If the current behavior is a bug, please provide the steps to reproduce.

  1. Run yarn audit > sample.txt against this repo.
  2. Run ls -l sample.txt to view the file size.
  3. Run yarn audit --json > sample.json against the same repo.
  4. Run ls -l sample.json to view the file size.

What is the expected behavior? A file size similar to running yarn audit.

Please mention your node.js, yarn and operating system version. yarn v1.16.0, node.js v12.6.0, up-to-date OSX.

pitgrap commented 5 years ago

We're running yarn audit --level=high --json nightly on our CI. It failed with the following error:

00:28:44 [ERROR] /home/ci/workspace/frontend/node/yarn/dist/lib/cli.js:92146
00:28:44 [ERROR]   throw err;
00:28:44 [ERROR]   ^
00:28:44 [ERROR] 
00:28:44 [ERROR] Error: write ENOBUFS
00:28:44 [ERROR]     at afterWriteDispatched (internal/stream_base_commons.js:78:25)
00:28:44 [ERROR]     at writevGeneric (internal/stream_base_commons.js:67:3)
00:28:44 [ERROR]     at Socket._writeGeneric (net.js:702:5)
00:28:44 [ERROR]     at Socket._writev (net.js:711:8)
00:28:44 [ERROR]     at doWrite (_stream_writable.js:408:12)
00:28:44 [ERROR]     at clearBuffer (_stream_writable.js:517:5)
00:28:44 [ERROR]     at onwrite (_stream_writable.js:465:7)
00:28:44 [ERROR]     at WriteWrap.afterWrite [as oncomplete] (net.js:791:19)

Looks similar to this problem, because the output was huge.

tbezman commented 5 years ago

@nishils Step 3 should have --json right?

nishils commented 5 years ago

@tbezman yes, my bad. I have updated the original issue.

kathyn-sm commented 5 years ago

I am seeing heap out of memory errors with yarn audit --json as well. yarn audit works fine on the other hand.

> npm -v
6.9.0
> node -v
v10.16.0
> yarn -v
1.17.3

nishils commented 5 years ago

I think this error is related to this issue (https://github.com/facebook/jest/issues/8682). If other folks can confirm that they have this dependency in their problematic repos, then that would be very helpful.

The solution posted to run npm audit fix does not work for yarn. Yarn audit fix actually doesn't do anything (See this issue: https://github.com/yarnpkg/yarn/issues/7075)

There might be a way to filter test only packages in yarn audit (See issue https://github.com/yarnpkg/yarn/issues/6632)

Possibly yarn audit --groups dependencies. I'll play around with this and report back here for other folks running into the same problem.

Another solution might be to create an npm package-lock.json file and convert it back to yarn.

npm install
npm audit fix --force # breaking changes
rm yarn.lock
yarn import
yarn audit
rm package-lock.json

kathyn-sm commented 5 years ago

Can confirm that the impacted repository with the error has jest.

However wouldn't this impact npm audit as well? npm audit works fine on the same repo that yarn is having issues with when having the --json flag.

nishils commented 5 years ago

I think npm does a better job of filtering paths. Yarn seems to be outputting every single path whereas npm is consolidating them somehow. That is my best guess at the moment.

It seems that running yarn audit --groups dependencies excludes devDependencies and a few other types (documentation here: https://yarnpkg.com/lang/en/docs/dependency-types/).

nishils commented 5 years ago

Just to close the loop on fixing the immediate issue, yarn upgrade jest@24.8.0 should get your scan times back to normal.

This issue is still relevant as the amount of data that yarn can potentially output is still way too high.

lzzluca commented 5 years ago

Having the same problem.

Imho, as part of a bigger fix, it would be nice to group by vulnerability: if two packages have the same dependency at the same version, and that dependency has a vulnerability, it would make sense to see the vulnerability reported only once instead of twice. A list of paths for that vulnerability could then give more details about which packages are vulnerable.

So the idea would be to render the vulnerabilities list grouped by vulnerability and not by package, even though that sounds more like a feature than a bug fix.

I'm suggesting this because, from what I am seeing in a project I am working on, the same vulnerability seems to be reported many times; I guess that could be at least one reason the JSON size keeps increasing. What I am suggesting seems to be related to this: https://github.com/yarnpkg/yarn/issues/6500
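The grouping idea can be approximated today by post-processing the output. The following is only a sketch, not anything yarn itself does: it assumes that yarn audit --json emits one JSON object per line, with records of type "auditAdvisory" carrying data.advisory (including id, module_name, severity) and data.resolution.path. The sample records, the advisory id 1001 and the package names are made up for illustration.

```python
import json


def group_by_advisory(lines):
    """Collapse per-path audit records into one entry per advisory id."""
    grouped = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") != "auditAdvisory":
            continue  # ignore the summary and any other record types
        advisory = record["data"]["advisory"]
        entry = grouped.setdefault(
            advisory["id"],
            {
                "module": advisory.get("module_name"),
                "severity": advisory.get("severity"),
                "paths": [],
            },
        )
        entry["paths"].append(record["data"]["resolution"]["path"])
    return grouped


def make_record(advisory_id, module, severity, path):
    """Build a hypothetical audit line in the assumed record shape."""
    return json.dumps({
        "type": "auditAdvisory",
        "data": {
            "resolution": {"id": advisory_id, "path": path},
            "advisory": {
                "id": advisory_id,
                "module_name": module,
                "severity": severity,
            },
        },
    })


# Made-up sample: one advisory reached through two dependency paths.
sample_lines = [
    make_record(1001, "example-lib", "low", "pkg-a>example-lib"),
    make_record(1001, "example-lib", "low", "pkg-b>example-lib"),
    json.dumps({"type": "auditSummary", "data": {}}),
]

grouped = group_by_advisory(sample_lines)
for advisory_id, entry in grouped.items():
    print(advisory_id, entry["module"], entry["severity"], entry["paths"])
```

For real output, the same function could be fed sys.stdin line by line, so the advisory body is kept once per id and only the paths accumulate.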

PS: for me, yarn upgrade jest@24.8.0 (as suggested above) seems to fix it, but it feels more like a workaround.

nishils commented 5 years ago

I totally agree, but from statements made by the current maintainers, it seems that work on yarn audit is being abandoned in favor of supporting it via plugins in the next generation of yarn. I'm guessing it will be a while before this gets resolved.

Honestly, the easiest solution is to switch to npm, as npm seems to handle this more intelligently and has more audit features.

sarahLardeau commented 4 years ago

I ended up on this issue after getting a JavaScript heap out of memory error on the yarn audit --json command. After some digging, I found that a newly published vulnerability (https://www.npmjs.com/advisories/1490) caused more than 30000 "Known vulnerabilities" of low severity on my project. In that case, yarn audit --json > output.json was generating a huge output! I stopped the process when the file was bigger than 200 GB... Yeah, 200 GB, I double checked, believe me... However, when running npm audit --json I did not get the same result; in fact, everything went fine, and this is how I found out about the vulnerability. I looked a little closer at what was logged in the yarn audit output and found that there was a lot of duplicate data. I guess this is why the output has become so big 🤔
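One rough way to quantify that duplication is to stream the output and count how many advisory bodies are actually distinct. This is a sketch under the same assumption as above about the output shape (one JSON object per line, with "auditAdvisory" records holding a data.advisory object); the sample records and ids are invented.

```python
import hashlib
import json


def count_duplicates(lines):
    """Return (total advisory records, distinct advisory bodies)."""
    seen = set()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") != "auditAdvisory":
            continue
        total += 1
        # Hash a canonical form of the advisory so identical bodies collide.
        body = json.dumps(record["data"]["advisory"], sort_keys=True)
        seen.add(hashlib.sha256(body.encode("utf-8")).hexdigest())
    return total, len(seen)


# Made-up sample: the same advisory repeated for two paths, plus one other.
records = [
    {"type": "auditAdvisory",
     "data": {"resolution": {"path": "a>lib"}, "advisory": {"id": 1, "severity": "low"}}},
    {"type": "auditAdvisory",
     "data": {"resolution": {"path": "b>lib"}, "advisory": {"id": 1, "severity": "low"}}},
    {"type": "auditAdvisory",
     "data": {"resolution": {"path": "c>other"}, "advisory": {"id": 2, "severity": "high"}}},
]
audit_lines = [json.dumps(r) for r in records]

total, distinct = count_duplicates(audit_lines)
print(f"{total} advisory records, {distinct} distinct advisory bodies")
```

Only the hashes are kept in memory, so this approach should stay small even when the raw output does not.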

Of course I could add some parameters to yarn audit to filter out low-severity vulnerabilities or to check only the dependencies and not devDependencies. These would prevent the JavaScript heap out of memory error in my case. But that is not a real solution ^^ As there is no roadmap to improve the audit functionality, I will use npm audit instead to do the job 😢

captrespect commented 3 years ago

Adding --groups dependencies greatly reduces the size. This allowed me to pipe the output to yarn-audit-html, which was failing my Jenkins build without it.

yarn audit --level high --groups dependencies --json | yarn-audit-html

From 5.3 GB to 346 KB.