facebook / docusaurus

Easy to maintain open source documentation websites.
https://docusaurus.io
MIT License

Build fails on MacOS for large site because max proc limit is exceeded #10348

Open t1m0thyj opened 3 months ago

t1m0thyj commented 3 months ago

Description

After updating from Docusaurus v3.1 to v3.4, building a large site fails with the following error on macOS:

[ERROR] Error: Unable to build website for locale en.
    at tryToBuildLocale (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:54:19)
    at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:65:9
    at async mapAsyncSequential (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/jsUtils.js:20:24)
    at async Command.build (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:63:5) {
  [cause]: Error: Can't process doc metadata for doc at path path=/Users/timothy/Projects/zowe/docs-site/versioned_docs/version-v2.11.x/user-guide/cli-using-formatting-environment-variables.md in version name=v2.11.x
      at processDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:146:15)
      at async Promise.all (index 85)
      ... 4 lines matching cause stack trace ...
      at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:38:23
      at async Promise.all (index 0)
      at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:139:25
      at async loadSite (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/site.js:135:45) {
    [cause]: Error: An error occurred when trying to get the last update date
        at getGitLastUpdate (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/lastUpdateUtils.js:43:19)
        at async readLastUpdateData (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/lastUpdateUtils.js:80:36)
        at async doProcessDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:48:24)
        at async processDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:143:16)
        at async Promise.all (index 85)
        at async doLoadVersion (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:121:34)
        at async loadVersion (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:162:28)
        at async Promise.all (index 6)
        at async Object.loadContent (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:170:33)
        at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:38:23 {
      [cause]: Error [ShellJSInternalError]: spawn EBADF
          at ChildProcess.spawn (node:internal/child_process:421:11)
          at spawn (node:child_process:761:9)
          at Object.execFile (node:child_process:351:17)
          at Object.exec (node:child_process:234:25)
          at execAsync (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/exec.js:136:17)
          at Object._exec (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/exec.js:221:12)
          at Object.exec (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/common.js:335:23)
          at result (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/gitUtils.js:46:27)
          at new Promise (<anonymous>)
          at getFileCommitDate (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/gitUtils.js:45:26) {
        errno: -9,
        code: 'EBADF',
        syscall: 'spawn'
      }
    }
  }
}

Reproducible demo

No response

Steps to reproduce

  1. Upgrade to Docusaurus v3.2 or later (e.g. https://github.com/zowe/docs-site/pull/3785/files#diff-7ae45ad102eab3b6d7e7896acd08c427a9b25b346470d7bc6507b6481575d519)
  2. Run docusaurus build

Expected behavior

The build succeeds

Actual behavior

There is an EBADF error like the one above. The same build works on other operating systems such as Ubuntu, so it seems to be related to the maximum number of processes supported by macOS.

I was able to work around the issue by replacing the following line to limit the number of concurrent git processes to 100: https://github.com/facebook/docusaurus/blob/95990c6105fb7aea6f840cd5f38905da3a528b65/packages/docusaurus-plugin-content-docs/src/index.ts#L194

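  // Process docs in batches of 100 so that at most 100 git processes run at once.
  const results = [];
  while (docFiles.length) {
    // Note: splice() mutates docFiles by removing the batch being processed.
    results.push(...await Promise.all(docFiles.splice(0, 100).map(processVersionDoc)));
  }
  return results;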

I'm willing to open a PR if the fix is simple, but I'm not sure what the preferred solution would be. Is a hardcoded limit like 100 OK, or should it be configurable in the Docusaurus config?

Josh-Cena commented 3 months ago

I think a hardcoded limit is OK; at most we could add an env variable to customize it.
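
A minimal sketch of how a hardcoded default with an env variable override might look. The variable name `DOCUSAURUS_GIT_CONCURRENCY` and the helper `resolveGitConcurrency` are hypothetical and only for illustration; they are not existing Docusaurus options. The default of 100 is the value from the workaround in the issue description.

```typescript
// Default taken from the batch-size workaround in the issue description.
const DEFAULT_GIT_CONCURRENCY = 100;

// Hypothetical helper: reads a positive integer from an env-like object,
// falling back to the default when the value is missing or invalid.
function resolveGitConcurrency(
  env: Record<string, string | undefined>,
  fallback: number = DEFAULT_GIT_CONCURRENCY,
): number {
  // Hypothetical variable name, not an existing Docusaurus setting.
  const raw = env["DOCUSAURUS_GIT_CONCURRENCY"];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = Number(raw);
  // Reject non-numeric, fractional, zero, and negative values.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

In a Node.js build this would be called as `resolveGitConcurrency(process.env)`. Taking the environment as a parameter keeps the helper easy to test.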

slorber commented 3 months ago

Left my initial review here: https://github.com/facebook/docusaurus/pull/10354#pullrequestreview-2207965245

To be honest, although the proposed solution might fix the problem, and we might still want to limit git concurrency, I believe it only hides the real bug here: shelljs (an unmaintained dependency) probably doesn't handle file descriptors well under concurrent access?

I'd like to explore switching to alternatives first, before introducing IO queueing:

Josh-Cena commented 3 months ago

I agree with that. Let's get rid of shelljs completely in favor of execa?
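
As a rough illustration of reading a file's last commit date without shelljs, here is a sketch that uses Node's built-in `child_process.execFile` rather than execa, so that it stays self-contained. The helper names `parseCommitTimestamp` and `getLastCommitDate` and the exact `git log` arguments are assumptions for illustration, not the Docusaurus implementation. Spawning this way does not limit concurrency by itself.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Parses the output of `git log --format=%ct`, which is a Unix
// timestamp in seconds (committer date).
function parseCommitTimestamp(stdout: string): Date {
  const seconds = Number.parseInt(stdout.trim(), 10);
  if (Number.isNaN(seconds)) {
    // Empty output typically means the file has no commits (e.g. untracked).
    throw new Error(`Unexpected git output: ${JSON.stringify(stdout)}`);
  }
  return new Date(seconds * 1000);
}

// Hypothetical helper: date of the most recent commit touching `file`.
// Arguments are passed as an array, so no shell is involved.
async function getLastCommitDate(file: string, cwd: string): Promise<Date> {
  const { stdout } = await execFileAsync(
    "git",
    ["log", "--max-count=1", "--format=%ct", "--", file],
    { cwd },
  );
  return parseCommitTimestamp(stdout);
}
```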

However I don't know why file descriptors are related. The problem here is hitting the process limit, not the FD limit, no?

slorber commented 3 months ago

I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
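
One way to check this is to map the `errno: -9` from the stack trace back to its symbolic name with Node's `util.getSystemErrorName`. This is a small sketch, assuming it runs on Linux or macOS, where the POSIX value of EBADF ("bad file descriptor") is 9.

```typescript
import * as os from "node:os";
import { getSystemErrorName } from "node:util";

// `errno: -9` in the stack trace is the negated POSIX errno value.
const reportedErrno = -9;

// Symbolic name that Node associates with this errno.
const errorName = getSystemErrorName(reportedErrno);

// Numeric errno value the current platform uses for EBADF.
const ebadfValue = os.constants.errno.EBADF;

console.log(errorName, ebadfValue);
```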

I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?

t1m0thyj commented 3 months ago

> I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
>
> I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?

If shelljs is unmaintained then I'm all for getting rid of it. Although the error code does seem related to file descriptors, I don't think the max file descriptor limit is being hit because I tried increasing it with ulimit.

Regardless of what dependency is being used to spawn processes, IMO it's not good practice to launch hundreds or potentially thousands of processes at once, without a promise queue in place to limit the number of concurrent processes.
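
A minimal sketch of such a promise queue, assuming nothing about how Docusaurus would implement it. The name `mapWithConcurrency` is hypothetical. Unlike the batch workaround in the issue description, which waits for a whole batch to finish before starting the next one, this keeps up to `limit` tasks in flight continuously.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each runner claims the next unclaimed index until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  // Start no more runners than there are items.
  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With this helper, the workaround could be written as `mapWithConcurrency(docFiles, 100, processVersionDoc)`. If any call rejects, the returned promise rejects with that error, while tasks that have already started still run to completion.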

slorber commented 3 months ago

> > I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
> >
> > I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?
>
> If shelljs is unmaintained then I'm all for getting rid of it. Although the error code does seem related to file descriptors, I don't think the max file descriptor limit is being hit because I tried increasing it with ulimit.

Thanks, was about to ask.

> Regardless of what dependency is being used to spawn processes, IMO it's not good practice to launch hundreds or potentially thousands of processes at once, without a promise queue in place to limit the number of concurrent processes.

Agree, but this is a general problem we have, not just limited to Git commands but to all IOs in general.


I tried building your branch locally https://github.com/zowe/docs-site/pull/3785

I got many warnings (broken links, admonitions) but was able to build without such EBADF error (I'm on MacOS M3 Sonoma)

Only you will be able to confirm whether moving to Execa fixes it. You can try to apply this change locally:

https://github.com/facebook/docusaurus/pull/10358/files#diff-cb75564637c0cca6a5c3d3eceb846ae064def7c90424de1df0bd110c3fc23b14R133

t1m0thyj commented 3 months ago

> I tried building your branch locally zowe/docs-site#3785
>
> I got many warnings (broken links, admonitions) but was able to build without such EBADF error (I'm on MacOS M3 Sonoma)
>
> Only you will be able to confirm whether moving to Execa fixes it. You can try to apply this change locally:
>
> https://github.com/facebook/docusaurus/pull/10358/files#diff-cb75564637c0cca6a5c3d3eceb846ae064def7c90424de1df0bd110c3fc23b14R133

We archived some old doc versions to reduce the number of files in the repo and work around the issue. I'll test the changes next week on a branch that still has the old files.

Thanks for the reminder about broken links, there is WIP to fix them 🙂