vuejs / vitepress

Vite & Vue powered static site generator.
https://vitepress.dev
MIT License
11.47k stars 1.86k forks

feat: implement multithreading for rendering and local search indexing, use JSDOM for better section split performance and reliability. #3386

Open zhangyx1998 opened 4 months ago

zhangyx1998 commented 4 months ago

Before

  vitepress v1.0.0-rc.33

  ✓ building client + server bundles - 9.43 seconds
- ✓ rendering pages - 1 minute, 0.95 seconds
  ✓ generating sitemap - 0 seconds
  build complete in 1 minute, 10.56 seconds.
[Image: vp-single-thread-render]

After

buildConcurrency = 10

  vitepress v1.0.0-rc.33

  ✓ building client + server bundles - 9.4 seconds
+ ✓ rendering pages - 42.26 seconds
  ✓ generating sitemap - 0.01 seconds
  build complete in 51.85 seconds.
[Image: vp-multi-thread-render]

Challenges

In addition to vuetifyjs/vite-ssg@1f1663a666b3c5fdc30a543b5cf42737fc4b4722, this PR also needs to handle the problem of passing siteConfig to worker threads. This is challenging because we cannot simply import userConfig once per worker: userConfig may have side effects or internal state dependencies.

After some research, I could not find an existing package that addresses this problem, so I created one (rpc-magic-proxy). This utility (1) serializes siteConfig into a pure static object so it can be sent through an RPC channel, and (2) deserializes it on the worker side as a proxy that forwards all function calls back to the main thread. The only side effect is that all function calls have to be awaited.
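To illustrate the serialize/deserialize idea described above, here is a minimal sketch. It is not the rpc-magic-proxy API: `serialize`, `deserialize`, `Invoke`, the `$fn` marker, and the sample `siteConfig` are all hypothetical names chosen for illustration, and the `invoke` function stands in for a real message round trip between threads.

```typescript
// Hypothetical sketch only; none of these names come from rpc-magic-proxy.
type AnyFn = (...args: unknown[]) => unknown;
type Invoke = (id: number, args: unknown[]) => Promise<unknown>;

// Main-thread side: replace every function with a numeric handle so the
// result is a plain object that can be sent over a message channel.
function serialize(value: unknown, registry: AnyFn[]): unknown {
  if (typeof value === 'function') {
    registry.push(value as AnyFn);
    return { $fn: registry.length - 1 };
  }
  if (Array.isArray(value)) return value.map((v) => serialize(v, registry));
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, serialize(v, registry)])
    );
  }
  return value;
}

// Worker side: turn each handle back into a stub that forwards the call.
// Every stub returns a Promise, which is why callers have to await.
function deserialize(wire: unknown, invoke: Invoke): unknown {
  if (Array.isArray(wire)) return wire.map((v) => deserialize(v, invoke));
  if (wire !== null && typeof wire === 'object') {
    const obj = wire as Record<string, unknown>;
    const id = obj['$fn'];
    if (typeof id === 'number') {
      return (...args: unknown[]) => invoke(id, args);
    }
    return Object.fromEntries(
      Object.entries(obj).map(([k, v]) => [k, deserialize(v, invoke)])
    );
  }
  return wire;
}

// Usage: the config is imported once, and only handles cross the boundary.
const registry: AnyFn[] = [];
const siteConfig = {
  title: 'Docs',
  transformTitle: (t: string) => t.toUpperCase()
};
const wire = serialize(siteConfig, registry);

// Stand-in for a postMessage round trip back to the main thread.
const invoke: Invoke = async (id, args) => registry[id](...args);

const remote = deserialize(wire, invoke) as {
  title: string;
  transformTitle: (t: string) => Promise<string>;
};
```

Plain values such as `remote.title` are available synchronously, while `remote.transformTitle(...)` resolves asynchronously because the original function still lives on the main-thread side.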

Thanks to the asynchronous coding paradigm this project has already adopted, I did not encounter any issues plugging this proxy into the render function.

One thing worth mentioning is that, due to the RPC proxy overhead, the speedup is not as significant as seen in #3374. It still brings a 30% speedup when building large sites, so it can be a nice-to-have option.

Comments and insights needed!

Please feel free to comment on possible problems and/or ways to improve parallelism!

zhangyx1998 commented 4 months ago

For general discussion of parallelism, please go to #3183.

zhangyx1998 commented 4 months ago

@brc-dd Do you have time to look at this?

brc-dd commented 4 months ago

Ah, I'm currently looking into other PRs. Will get back to you on this in a few days.

zhangyx1998 commented 4 months ago

> Ah, I'm currently looking into other PRs. Will get back to you on this in a few days.

Sure, thanks!

zhangyx1998 commented 4 months ago

Latest perf:

  vitepress v1.0.0-rc.33

  ✓ building client + server bundles - 9.05 seconds
+ ✓ rendering pages - 25.24 seconds
  ✓ generating sitemap - 0.01 seconds
  build complete in 34.5 seconds.

This is a more than 50% speedup in page rendering compared to the single-threaded run (60.95 s). I think this is good enough to be included as an experimental feature (not enabled by default).

brc-dd commented 4 months ago

The bundle step https://github.com/vuejs/vitepress/blob/09e48db355f530c7a138437004659b61239f4b75/src/node/build/bundle.ts#L149-L156 can be made async too. We should do something about the cache in https://github.com/vuejs/vitepress/blob/09e48db355f530c7a138437004659b61239f4b75/src/node/markdownToVue.ts#L23 though. In parallel client and server bundles, it doesn't get hit much. Also, LRU doesn't seem like a good idea there, as even in sync bundles, it won't get hit at all if there are 2048 or more pages.
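The point about the LRU cache can be illustrated with a small sketch. It assumes the pages are compiled in the same order on each pass and that the number of pages exceeds the cache capacity; `TinyLRU`, `countHits`, and the capacities used are illustrative and are not taken from the VitePress source.

```typescript
// Minimal LRU built on Map insertion order (illustrative, not the real cache).
class TinyLRU<V> {
  private map = new Map<string, V>();
  private max: number;

  constructor(max: number) {
    this.max = max;
  }

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }
}

// Scan the same list of pages `passes` times and count cache hits.
function countHits(capacity: number, pageCount: number, passes: number): number {
  const cache = new TinyLRU<string>(capacity);
  let hits = 0;
  for (let pass = 0; pass < passes; pass++) {
    for (let i = 0; i < pageCount; i++) {
      const key = `page-${i}.md`;
      if (cache.get(key) !== undefined) hits++;
      else cache.set(key, `rendered ${key}`);
    }
  }
  return hits;
}

const hitsWhenTooSmall = countHits(4, 5, 2); // 0: each page is evicted before it is reused
const hitsWhenLargeEnough = countHits(4, 4, 2); // 4: the second pass is fully cached
```

With a sequential scan, each page is evicted just before the second pass reaches it, so a cache that is even one entry too small yields no hits at all.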

zhangyx1998 commented 4 months ago

Is there a repo I can use for testing? My own project does not have heavy deps, so I have no way to test it.

zhangyx1998 commented 4 months ago

One low-hanging fruit could be running the two builds in two workers, but I do not know if it would really help.

zhangyx1998 commented 4 months ago

I figured out that the creation of workers (i.e. new threads) is also very expensive.

For example, the section splitter in local search took 4 min 40 s when it forked a worker for each page, but takes only 10 s when it forks N workers and reuses them across pages. (That's insane compared to the 4 hour 40 min build before this.)

So the new plan is to create N workers once siteConfig is resolved, each initialized with a copy of the config, and then use an API to dispatch different types of tasks to them: for example, building client/server bundles, indexing pages, and rendering SSR results.
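The reuse pattern described above can be sketched roughly as follows. This is not the PR's implementation: `FixedPool`, `PoolWorker`, and the sample task are hypothetical, and the worker is abstracted as an object with an async `run` method, where a real version would presumably wrap a `worker_threads` Worker.

```typescript
// A "worker" here is any long-lived object that runs one task at a time.
// In a real build this would wrap a thread; that part is assumed, not shown.
interface PoolWorker<T, R> {
  run(task: T): Promise<R>;
}

class FixedPool<T, R> {
  private idle: PoolWorker<T, R>[] = [];
  private waiting: Array<(worker: PoolWorker<T, R>) => void> = [];

  // Pay the creation cost exactly `size` times, up front.
  constructor(size: number, create: () => PoolWorker<T, R>) {
    for (let i = 0; i < size; i++) this.idle.push(create());
  }

  async dispatch(task: T): Promise<R> {
    // Take an idle worker, or wait until a busy one is handed over.
    const worker =
      this.idle.pop() ??
      (await new Promise<PoolWorker<T, R>>((resolve) =>
        this.waiting.push(resolve)
      ));
    try {
      return await worker.run(task);
    } finally {
      // Hand the worker to the next waiting task, or park it as idle.
      const next = this.waiting.shift();
      if (next) next(worker);
      else this.idle.push(worker);
    }
  }
}

// Usage: 100 pages share 4 workers instead of creating 100 of them.
let created = 0;
const pool = new FixedPool<string, number>(4, () => {
  created++;
  return { run: async (page: string) => page.length };
});

const pages = Array.from({ length: 100 }, (_, i) => `page-${i}`);
const results = Promise.all(pages.map((page) => pool.dispatch(page)));
```

The expensive step (creating a worker) happens N times regardless of how many pages are dispatched, which is the property that matters when worker creation dominates the cost.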

I've got a working version in my local workspace, and will clean it up for review shortly.

zhangyx1998 commented 4 months ago

@brc-dd Moving Vite build() to workers produces the following error:

⠏ building client + server bundles...
[vite]: Rollup failed to resolve import "vitepress" from "/Users/Yuxuan/Lab/vitepress/dist/client/app/components/Content.js".
This is most likely unintended because it can break your application at runtime.
If you do want to externalize this module explicitly add it to
`build.rollupOptions.external`

The code looks like this (build/bundle.ts):

import { build /* ... */ } from 'vite'
import { registerWorkload, dispatchWork } from '../worker'

registerWorkload('vite:build', (config: ViteInlineConfig) => build(config))

export async function bundle(...) {
  /* ... */
  let [serverResult, clientResult]: [
    Rollup.RollupOutput,
    Rollup.RollupOutput | null
  ] = await task('building client + server bundles', () =>
    Promise.all([
      resolveViteConfig(true).then(
        config.parallel ? (cfg) => dispatchWork('vite:build', cfg) : build
      ) as Promise<Rollup.RollupOutput>,
      config.mpa
        ? null
        : (resolveViteConfig(false).then(
            config.parallel ? (cfg) => dispatchWork('vite:build', cfg) : build
          ) as Promise<Rollup.RollupOutput>)
    ])
  )
}

This looks weird to me. It seems that build has some external state that is not included in its argument (ViteInlineConfig) but is initialized by other exported functions. (All functions inside cfg are proxied back to the main thread and should not be the cause of this problem.) Can you comment on this?

Another approach would be to parallelize only createMarkdownToVueRenderFn, which should be helpful enough. Regarding the LRU cache, I would suggest we consider discarding it after we refactor the md2vue renderer, since the lookup time might not be worth it if rendering becomes very fast. 👀

zhangyx1998 commented 4 months ago

Never mind, I found the cause: a global variable is used to pass around site config.

That is not exactly the cause of the problem above, though. I've also had to use many other tricks to make Vite happy in a worker thread.

zhangyx1998 commented 4 months ago

@brc-dd I've reverted changes related to parallel bundling. Everything else is stable and ready for review.

Regarding parallel bundling, I've tried several different approaches (as you can see in the commit history), but each has its own problems.

  1. Re-importing the user config in every worker is not an option, because the user config may have side effects or internal state. Therefore, userConfig can only be imported once, in the main thread, and exposed to workers through rpc-magic-proxy.

  2. markdown-it does not support async hooks, but all functions in the user config, including user-defined markdown hooks, are converted to async (as explained in 1). This breaks markdown-it when it runs in worker threads with user-provided hooks. Therefore, markdown-it can only live in the main thread.

  3. Vite/Rollup has some issues resolving node imports in worker threads. Although this can be patched with a helper plugin, I am not sure whether the patch actually fixes the problem or just hides it.
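Point 2 can be illustrated without markdown-it itself. In this sketch, `renderSync` stands in for any synchronous renderer that calls a user hook, and `proxiedHook` mimics a hook that has become async after being proxied; both names are hypothetical.

```typescript
// A hook as a synchronous renderer expects it: string in, string out.
type SyncHook = (src: string) => string;

// Stand-in for a synchronous renderer that uses the hook's return value
// immediately, with no opportunity to await it.
function renderSync(src: string, hook: SyncHook): string {
  return `<p>${hook(src)}</p>`;
}

// The hook as the user wrote it, running in the same thread.
const localHook: SyncHook = (src) => src.toUpperCase();

// The same hook after being proxied across threads: it now returns a Promise.
const proxiedHook = (src: string): Promise<string> =>
  Promise.resolve(src.toUpperCase());

const ok = renderSync('hi', localHook);
// "<p>HI</p>"

const broken = renderSync('hi', proxiedHook as unknown as SyncHook);
// "<p>[object Promise]</p>": the Promise is stringified instead of awaited
```

Because the renderer cannot await, the Promise itself ends up in the output, which is why a synchronous renderer with user hooks has to run where the hooks are still ordinary functions.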

I've developed some tricks to get around the problems above, but they are hacky and unstable. Also, forcing the markdown renderer to stay in the main thread produces heavy RPC overhead, making the build even slower than the single-threaded one.

I think it's better to create another PR dedicated to parallel bundling once someone actually complains about bundling performance.

zhangyx1998 commented 4 months ago

BTW, local search indexing now only takes about 10 seconds for 1700 pages. It previously took 4.7 hours.

Machineric commented 2 weeks ago

This PR looks very promising. Any updates?

zhangyx1998 commented 3 days ago

It's been a while since this PR was created. Some problems remain unsolved due to limitations of true multithreading. That said, I am still awaiting reviews from maintainers to see if at least some of its features could be merged.