Open remorses opened 4 years ago
I just did on a very small scale example, and there is already a significant impact of yarn berry that I did not expect.
https://github.com/crubier/yarn-benchmarks
For example, on a simple create-react-app + storybook, build storybook takes 23s with yarn berry, VS 12s with npm.
I think we suffer from this internally, on our big monorepo with many storybooks, the storybooks take a total of 90 min to build.... reducing to 45min would already help a lot...
@crubier Have you tried to check if the impact in Yarn berry due to zip compression, i.e. compessionLevel
Yarn berry option
@larixer no I didn't know about this option. Let me try it now..
BTW What is the reason for compressing the cache? Is it only for saving space on the hard drive?
@crubier Yep, to save space on the hard drive or in the Git repo
Putting compessionLevel: 0
improves performance by around 0-10%, not much.
For example storybook goes from 21s-25s to 19s-24s
build storybook takes 23s with yarn berry, VS 12s with npm.
That's not a good comparison, what you actually want to compare is PnP vs node_modules in V2
@crubier Yep, to save space on the hard drive or in the Git repo
I think the issue raised by @remorses here is a very good idea. I think the runtime performance impact of yarn on real-life benchmark has been very overlooked, and now might be the time to start thinking about it a bit more. On my part for example, I dont care about saving a few GB on my disk if this meant getting faster performance.
An example of the overlooking of runtime performance was https://github.com/yarnpkg/berry/issues/973 , which was mostly resolved thankfully (on our monorepo there is now "only" a delay of 7s before each yarn command, instead of 5minutes...)
What surprised me in this little benchmark is that yarn berry seems to have a 150-200% performance impact on many real life scripts.
That's not a good comparison, what you actually want to compare is PnP vs node_modules in V2
And yes, I include PnP when I say Yarn v2. The point of this issue is mostly about PnP I think.
I could add yarn v1 with pnp and yarn v2 without pnp in the benchmark.
Your install benchmark isn't useful the way it's currently done, since it doesn't take into account if the archives are already in the cache or if the lockfile is already generated. If you need an example of a good install benchmark, here is ours; the results are here.
I could add yarn v1 with pnp
That's wouldn't be useful, as Yarn 1's PnP was just an experiment and not production ready.
yarn v2 without pnp in the benchmark.
Yes, Yarn 2 with the node-modules
linker would be useful.
@paul-soporan
Your install benchmark isn't useful the way it's currently done, since it doesn't take into account if the archives are already in the cache or if the lockfile is already generated.
I know about that of course, I ran the benchmark only after installing in each folder (And the benchmark is ran several times in a row without cleanup). So it's a "reinstall" benchmark, not a "first install" benchmark.
npm and Yarn v1 are irrelevant imo, what we're interested in is the performance overhead of Yarn V2's PnP mode vs node_modules. npm and Yarn v1 has hoisting bugs which can give them unfair advantages (v2 as far as we know, doesn't)
I think the runtime performance impact of yarn on real-life benchmark has been very overlooked
That's a false statement, i've done optimizations on real world usecases, for example:
You can see all the performance related PRs i've done here https://github.com/yarnpkg/berry/pulls?q=is%3Apr+author%3Amerceyz+perf+is%3Aclosed
I know about that of course, I ran the benchmark only after installing in each folder. So it's a "reinstall" benchmark, not a "first install" benchmark.
The only case our benchmarks doesn't cover is runtime overhead of PnP vs node_modules, the rest we cover already https://p.datadoghq.eu/sb/d2wdprp9uki7gfks-c562c42f4dfd0ade4885690fa719c818
@merceyz "overlooked" does not mean "completely forgotten":
In the past 3 months you made at least 2 PRs that each improved by 100% the performance on certain use cases, this is very good and going in the right direction, but it also shows that there are still wide margins for improvement.
As I mentioned I also made a PR 7 months ago ( https://github.com/yarnpkg/berry/pull/998 ) which improved by 2500% the runtime performance on large monorepos. There again, there was large margin for improvement of the runtime performance.
My only point is that we should indeed start measuring this runtime performance improvement because this is now becoming the limiting factor for us internally.
@merceyz
Comparing npm with yarn berry without pnp gives indeed smilar results.
So yes, as expected, we can compare yarn pnp vs yarn node_modules, and we get:
berry
build-storybook ok
real 0m24.025s
user 0m30.199s
sys 0m2.798s
berry-no-pnp
build-storybook ok
real 0m13.496s
user 0m13.646s
sys 0m2.200s
I have just run yarn run --inspect start-storybook
on a workspace with 250 packages and made a profile, it turns out traverseDependencyTree
takes 72% of the time
traverseDependencyTree
's time complexity increases with number of dependencies, this is not a good idea, we should find another way
@remorses Out of curiosity, where does traverseDependencyTree
come from? We don't have such a function inside the PnP hook and it doesn't appear when I profile https://github.com/crubier/yarn-benchmarks/tree/master/single-berry. Is it from the fsevents
patch? Is it from another dependency?
It comes from .yarn/unplugged/fsevents-patch-5aa22a6c1e/node_modules/fsevents/vfs.js
It is in plugin-compat
in the source code
@remorses Indeed this algorithm is very inefficient because of seen.delete
, it can run for ages on some dependency graphs. It can be rewritten in much more efficient manner
Regarding the initial topic, yep, we currently track install times but not run times. It would be a good idea to have a dashboard for runtimes as well 👍 Also, we should put the dashboard link somewhere on the website 🤔
I updated my benchmark repo to start performing benchmark on monorepos at runtime.
https://github.com/crubier/yarn-benchmarks
Here are some quick results:
That's a x4 performance penalty by doing the same thing, but in a monorepo (without importing dependencies from the monorepo, just by being in the monorepo), and using PnP (because it doesn't work without it anyway, probably storybooks fault again.)
This seems to match with what @remorses says, the 4x performance hit seems to match the 72% of the time spent in traverseDependencyTree
.
So the good news is that this particular problem seems fixable! I think I can help, but the file in question lacks documentation so it's hard to know where to start
@larixer Do you intend to open a PR improving the performance of traverseDependencyTree
from the fsevents patch? (Asking you since you're the one that discovered the problem of that algorithm)
@paul-soporan To be fair I'm busy now with nohoist
so I don't plan to hack traverseDependencyTree
anytime soon. I think removing this line seen.delete
should be enough, but someone needs to double check that of course.
@crubier
Storybook with Yarn v2 + Node Module resolution + Monorepo with 200 packages = FAIL to build 🔴
This worries me, a bit :) NM_DEBUG_LEVEL=1 yarn
on that Monorepo throws or not?
NM_DEBUG_LEVEL=1
performs self-checking if produced node_modules install tree is correct
NM_DEBUG_LEVEL=2
performs self-checking and also outputs the reasons why packages were not hoisted to the top node_modules
➜ NM_DEBUG_LEVEL=1 yarn
➤ YN0000: ┌ Resolution step
➤ YN0002: │ multi@workspace:. doesn't provide @testing-library/dom@>=5 requested by @testing-library/user-event@npm:7.2.1
➤ YN0002: │ multi@workspace:. doesn't provide webpack@>=2 requested by babel-loader@npm:8.1.0
➤ YN0002: │ react-color@npm:2.18.1 doesn't provide react@* requested by @icons/material@npm:0.2.4
➤ YN0002: │ @mdx-js/loader@npm:1.6.16 doesn't provide react@^16.13.1 requested by @mdx-js/react@npm:1.6.16
➤ YN0002: │ @storybook/addon-essentials@npm:6.0.21 [3c62c] doesn't provide @babel/core@^7.0.0-0 requested by @storybook/addon-docs@npm:6.0.21
➤ YN0002: │ @storybook/preset-create-react-app@npm:3.1.4 [3c62c] doesn't provide typescript@>= 3.x requested by react-docgen-typescript-plugin@npm:0.5.2
➤ YN0002: │ react-docgen-typescript-loader@npm:3.7.2 [cbc10] doesn't provide webpack@^3.0.0 || ^4.0.0 requested by @webpack-contrib/schema-utils@npm:1.0.0-beta.0
➤ YN0002: │ @storybook/react@npm:6.0.21 [3c62c] doesn't provide typescript@>= 3.x requested by react-docgen-typescript-plugin@npm:0.5.2
➤ YN0002: │ foobar@workspace:packages/single-yarn doesn't provide @testing-library/dom@>=5 requested by @testing-library/user-event@npm:7.2.1
➤ YN0002: │ foobar@workspace:packages/single-yarn doesn't provide webpack@>=2 requested by babel-loader@npm:8.1.0
➤ YN0000: └ Completed in 2.16s
➤ YN0000: ┌ Fetch step
➤ YN0000: └ Completed in 1.49s
➤ YN0000: ┌ Link step
hoist: 954.502ms
➤ YN0000: └ Completed in 7.16s
➤ YN0000: Done with warnings in 11.25s
Then
➜ cd package/the-package-to-build-storybook-on
➜ yarn build-storybook
info @storybook/react v6.0.21
info
info clean outputDir..
info => Copying static files from: public
info => Copying prebuild dll's..
info => Building manager..
info => Loading manager config..
info => Loading presets
info => Compiling manager..
info => manager built (5.03 s)
info => Building preview..
info => Loading preview config..
info => Loading presets
info => Loading config/preview file in "./.storybook".
info => Loading config/preview file in "./.storybook".
info => Adding stories defined in ".storybook/main.js".
ERR! Failed to resolve a `react-scripts` package.
info => Using default Webpack setup.
info => Compiling preview..
ERR! => Failed to build the preview
ERR! ./src/stories/assets/code-brackets.svg 1:0
ERR! Module parse failed: Unexpected token (1:0)
ERR! You may need an appropriate loader to handle this file type, currently no loaders are configured to process this file. See https://webpack.js.org/concepts#loaders
ERR! > <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="48" height="48" version="1.1" viewBox="0 0 48 48"><title>illustration/code-brackets</title><g id="illustration/code-brackets" fill="none" fill-rule="evenodd" stroke="none" stroke-width="1"><path id="Combined-Shape" fill="#87E6E5" d="M11.4139325,12 C11.7605938,12 12,12.5059743 12,13.3779712 L12,17.4951758 L6.43502246,23.3839989 C5.85499251,23.9978337 5.85499251,25.0021663 6.43502246,25.6160011 L12,31.5048242 L12,35.6220288 C12,36.4939606 11.7605228,37 11.4139325,37 C11.2725831,37 11.1134406,36.9158987 10.9453839,36.7379973 L0.435022463,25.6160011 C-0.145007488,25.0021663 -0.145007488,23.9978337 0.435022463,23.3839989 L10.9453839,12.2620027 C11.1134051,12.0841663 11.2725831,12 11.4139325,12 Z M36.5860675,12 C36.7274169,12 36.8865594,12.0841013 37.0546161,12.2620027 L47.5649775,23.3839989 C48.1450075,23.9978337 48.1450075,25.0021663 47.5649775,25.6160011 L37.0546161,36.7379973 C36.8865949,36.9158337 36.7274169,37 36.5860675,37 C36.2394062,37 36,36.4940257 36,35.6220288 L36,31.5048242 L41.5649775,25.6160011 C42.1450075,25.0021663 42.1450075,23.9978337 41.5649775,23.3839989 L36,17.4951758 L36,13.3779712 C36,12.5060394 36.2394772,12 36.5860675,12 Z"/><rect id="Rectangle-7-Copy-5" width="35.57" height="4" x="5.009" y="22.662" fill="#A0DB77" rx="2" transform="translate(22.793959, 24.662305) rotate(-75.000000) translate(-22.793959, -24.662305)"/></g></svg>
ERR! @ ./src/stories/Introduction.stories.mdx 9:0-46 204:9-13
ERR! @ ./src sync ^\.(?:(?:^|\/|(?:(?:(?!(?:^|\/)\.).)*?)\/)(?!\.)(?=.)[^\/]*?\.stories\.mdx)$
ERR! @ ./.storybook/generated-stories-entry.js
... And 1000s of similar errors
The output of NM_DEBUG_LEVEL=2 yarn
is 6700 lines long so I will spare you this. You can check it on my repo in the folder multi-yarn-nm
Storybook doesn't always work in monorepos, it depends on the hoisting layout you get. PR here https://github.com/storybookjs/storybook/pull/11753
@larixer indeed removing seen.delete
seems to make the situation better without breaking anything as far as I know.
After removing seen.delete()
the situation is:
So to recap: removing seen.delete()
makes the situation better for monorepos, by removing a large part of the monorepo-induced overhead spent in traverseDependencyTree
as hown by @remorses. But pnp (monorepo or not) still has a ~2x performance penality as compared to node module)
@crubier Looks like we have another bottleneck then. seen.delete
affects only large dependency tree, but there is no effect on small dependency trees which is expectable
I opened a PR for the seen.delete
part of the problem @larixer , I'll run tests locally, is there anything else I can do to gain confidence that it does not break things ?
@crubier I think it should be all fine, This method just registers all the locations, so visiting graph nodes exactly once should be enough. You should also run yarn version check -i
on your PR and select patch
for plugin-compat
and for yarnpkg-cli
and possibly add an entry to CHANGELOG.md
@crubier One observation that bothers me is that from the same set of source files the storybook produces output files that significantly differ in size! Compare output files for node_modules
and for pnp
I'm curious of why fsevents
is involved in a build, where file watching shouldn't happen
@larixer
One observation that bothers me is that from the same set of source files the storybook produces output files that significantly differ in size! Compare output files for node_modules and for pnp
This is something I have noticed on my internal monorepo as well.
Here is the webpack bundle analyzer output:
For yarn-node-modules
For yarn-pnp
The problem seems to be that yarn/cache
ends up in the bundle, while only yarn/$$virtual
should actually end up there.
@merceyz since you seem to know a bit about storybook maybe you have an idea? Again this is default create-react-app + default storybook
@crubier I see a problem with Storybook/Webpack not been able to delegate to sb_dll/*_dl.js files under PnP and thus recompiling all the stuff it needs from node_modules again and again. To check this you can create .storybook/yarn-preset.js
:
async function yarn2Config(config, options) {
const newConfig = {
...(config || {})
};
newConfig.optimization = {minimize: false, splitChunks: false, runtimeChunk: false};
newConfig.mode = 'development';
return newConfig;
}
module.exports = { managerWebpack: yarn2Config, webpack: yarn2Config, webpackFinal: yarn2Config };
basically turn off all optimizations, switch to development mode and generate single bundle.
You need to modify .storybook/main.js
and add "presets": [require.resolve("./yarn-preset")]
Run with this preset both NM and Yarn PnP, then diff produced bundles. You will see in the diff something like this:
-/***/ "./node_modules/@emotion/core/dist/core.browser.esm.js":
-/*!************************************************************************************************!*\
- !*** delegated ./@emotion/core/dist/core.browser.esm.js from dll-reference storybook_docs_dll ***!
- \************************************************************************************************/
+/***/ "./.yarn/$$virtual/@emotion-core-virtual-283d648f44/0/cache/@emotion-core-npm-10.0.35-565218757c-5
+/*!*****************************************************************************************************
+ !*** ./.yarn/$$virtual/@emotion-core-virtual-283d648f44/0/cache/@emotion-core-npm-10.0.35-565218757c-5
+ \*****************************************************************************************************
e.g. NM bundle delegates many functions to DLLs, but PNP bundle compiles all of them into bundle
@larixer Oh, the storybook is distributing prebuilt DLLs:
info => Copying prebuild dll's..
inside DLLs you can see:
!*** /Users/shilman/projects/baseline/storybook/node_modules/@storybook/semver/internal/debug.js ***!
This is a suboptimal decision, because this way the DLLs are dependent on hoisting layout, so they will be reused to a less degree by the package manager that is different from the one used to build the DLLs, even if package manager is the same, but only the version is different the hoisting layout will be different. And of course node_modules prebuilt DLLs won't be used by PnP at all.
They should at least have an option to build dll on a user side. But the functionality like this will not be added anytime soon I think.
You can try disabling DLL usage via --no-dll
and configure Webpack to use some compilation caching plugin like hard-source-webpack-plugin
instead. I think this way the effect will be more or less the same as with DLLs, but more reliable
@larixer I just tried, and using --no-dll
succeeds in making both yarn-nm
and yarn-pnp
similar in terms of bundle size, but probably not as you (or at least I) expected.
It does not make the yarn-pnp
bundle smaller (it does not change anything for pnp basically), but by it make the yarn-nm
bundle larger... So it's not really a solution..
It's an original caching, I wonder if it could be made compatible with PnP, perhaps some with extra Webpack config 🤔 Probably worth opening an issue on the Storybook repository rather than here, since it's not actionable on our side (let's keep this issue here about building a runtime perf dashboard).
cc @gaetanmaisse
@crubier --no-dll
places both NM and PnP into the same conditions, after that their perfs are the same for me. The "solution" would be to use build caching that is not hardcoded to node_modules, for example hard-source-webpack-plugin
or Webpack 5 or ... some other caching solution that works both for PnP and for NM
@arcanis to go back to the idea of a runtime benchmark, here are a few requirements for a good benchmark in my opinion, which I started to implement on my repo https://github.com/crubier/yarn-benchmarks :
yarn echo
)yarn build
)yarn jest
at the root, with many jest "projects")yarn lerna run build
)The tests above should allow to benchmark different kinds of overheads that we have seen:
yarn echo "ok"
on 200 packages, it takes 20 minutes)yarn build
a complex package, it takes 4 times the time it would take without pnp)plugin-compat
or other "hacks" like this. For example rewriting fs
or glob
has a significant performance impact in some casesBesides the pnp resolution, I would also love to see a benchmark for the patched fs layer
Describe the user story
I saw that the yarn
pnp.js
file extends some nodejs native functionality likerequire
,path
andfs
functionsTo prevent performance degradation we should make benchmark and stress tests that measure the runtime overhead, for example creating a bundle with webpack of a big project with at least 50 packages
Other performance degradations could be in glob searches
I am also scared that some of these functions reimplementations have time complexity that grows with the workspace packages you have, some of us have workspaces with 200 packages
Describe the solution you'd like
Implement a benchmark to measure the runtime overhead
Describe the drawbacks of your solution
Describe alternatives you've considered
Additional context
It is still not very clear to me why
yarn
reimplementsfs
, it would be cool to read more about it