vercel / turborepo

Build system optimized for JavaScript and TypeScript, written in Rust
https://turbo.build/repo/docs
MIT License
26.02k stars 1.79k forks

Slow performance at scale with larger project #7515

Closed (jakeleventhal closed 6 months ago)

jakeleventhal commented 6 months ago

Verify canary release

Link to code that reproduces this issue

A bit tricky given the nature of this being an issue in a large repo

What package manager are you using / does the bug impact?

pnpm

What operating system are you using?

Mac

Which canary version will you have in your reproduction?

Turborepo v1.12.5-canary.0

Describe the Bug

My repo structure is like this:

apps/
   app1Name/
      package.json
      api/
      client/
      lambda/
   app2Name/
      package.json
      api/
      client/
      lambda/
   app3Name/
      package.json
      api/
      client/
      lambda/
packages/
   app1Name/
      package1/
      package2/
      package3/
   app2Name/
      package1/
      package2/
      package3/
   app3Name/
      package1/
      package2/
      package3/

My pnpm-workspace.yaml file looks like this:

packages:
  - 'apps/*/*'
  - 'packages/*/*'

Note that in my repo, only the app subfolders have a package.json (used for scripts like starting multiple apps in dev servers, etc.). If I run turbo tsc from packages/app1Name, it runs tsc for all subpackages. If I run turbo tsc from apps/app1Name, I get this error:

  × could not resolve workspaces: We did not find a package manager specified in your root package.json. Please set the "packageManager" property in your root package.json (https://nodejs.org/api/
  │ packages.html#packagemanager) or run `npx @turbo/codemod add-package-manager` in the root of your monorepo.
  ╰─▶ We did not find a package manager specified in your root package.json. Please set the "packageManager" property in your root package.json (https://nodejs.org/api/packages.html#packagemanager) or run `npx
      @turbo/codemod add-package-manager` in the root of your monorepo.

This is likely due to the package.json in the apps/app1Name/ folder. If I modify my pnpm-workspace.yaml file to include:

- 'apps/app1Name'
- 'apps/app2Name'
- 'apps/app3Name'
- 'apps/*/*'
- 'packages/*/*'

This then works and I get what I am looking for. However, the startup time to query the graph (the time between starting turbo tsc and output starting to log) goes from about 1s to 7s, which is a huge slowdown.
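To illustrate why apps/app1Name is not covered by the original patterns, here is a minimal Python sketch. It assumes each `*` stands for exactly one path segment, which is how workspace globs of this shape usually behave. The helper `matches_workspace_glob` is hypothetical and is not how pnpm or turbo resolve workspaces internally.

```python
from fnmatch import fnmatchcase


def matches_workspace_glob(directory: str, pattern: str) -> bool:
    """Return True if `directory` matches `pattern` segment by segment.

    Simplification (assumption): `*` matches within a single path segment,
    and `**` is not handled at all.
    """
    dir_parts = directory.split("/")
    pattern_parts = pattern.split("/")
    if len(dir_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(d, p) for d, p in zip(dir_parts, pattern_parts))


# The two patterns from the original pnpm-workspace.yaml in this issue.
original_patterns = ["apps/*/*", "packages/*/*"]

for directory in ["apps/app1Name", "apps/app1Name/api", "packages/app1Name/package1"]:
    covered = any(matches_workspace_glob(directory, p) for p in original_patterns)
    print(f"{directory}: {covered}")
```

Under that assumption, apps/app1Name is one level too shallow for 'apps/*/*', so it is only picked up once it is listed explicitly.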

Expected Behavior

This should be faster (or, ideally, there is a better way to set up my workspace file).

To Reproduce

See instructions

Additional context

No response

anthonyshew commented 6 months ago

If you could provide us with traces, that would be phenomenal. To do so, run the same commands with --profile=profile.json -vvv at the end: https://turbo.build/repo/docs/reference/command-line-reference/run#--profile

You'll get a file called profile.json in the root of your repository. Everything is anonymized so you can attach that file to your comment here or, if you'd prefer, send it to me at anthony.shew@vercel.com and I can forward it to the rest of the team.

jakeleventhal commented 6 months ago

Sent to email, thanks

chris-olszewski commented 6 months ago

Would you mind sharing your turbo.json(s) or specifically the definition for tsc?

jakeleventhal commented 6 months ago

{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "//#package-deps": {
      "inputs": [
        ".npmignore",
        ".nvmrc",
        "apps/*/*/package.json",
        "apps/*/package.json",
        "packages/*/*/package.json",
        "pnpm-lock.yaml",
        "pnpm-workspace.yaml",
        "tools/*/package.json"
      ],
      "outputMode": "new-only",
      "outputs": ["pnpm-lock.yaml"]
    },
    "tsc": {
      "dependsOn": ["//#package-deps", "^tsc"],
      "inputs": [
        "!.next/**",
        "!.vercel/**",
        "!coverage/**",
        "!dist/**",
        "!next-env.d.ts",
        "!node_modules/**",
        ".env*",
        "**/*.js",
        "**/*.jsx",
        "**/*.ts",
        "**/*.tsx",
        "**/*.ttf",
        "**/*.svg",
        "**/*.png",
        "**/*.jp*g",
        "**/*.webp",
        "**/dockerfiles/**",
        "tsconfig*.json"
      ],
      "outputMode": "new-only",
      "outputs": ["dist/**"]
    }
  }
}

jakeleventhal commented 6 months ago

tsc is just an npm script in each of my apps/packages that runs tsc

jakeleventhal commented 6 months ago

I suppose the problem is that checking the deps takes a lot longer, but really there isn't any "source code" in my apps/app1 folder. It's just an organizational folder really for other packages.

chris-olszewski commented 6 months ago

Discussed over email, but syncing here.

The issue was with one package's structure: the exclusion glob !.vercel/** wasn't excluding as expected because the package has subdirectories, e.g. package-a/package-1/.vercel. This caused 14.5k files to be considered as task inputs. It was fixed by updating the exclusion globs to match the structure of this package.
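For anyone hitting the same thing, here is a rough Python sketch of why a root-anchored exclusion can miss a nested directory. The matcher below is a simplified stand-in, not turbo's actual glob implementation, and the file paths and the `**/.vercel/**` pattern are assumptions for illustration (the thread does not state the exact globs used in the fix).

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified glob into a regex, relative to the package root.

    Supported (assumed semantics): `**/` is zero or more directories,
    `**` is anything, `*` is anything within one path segment.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_excluded(path: str, exclusions: list) -> bool:
    """Return True if `path` fully matches any exclusion glob (written without the leading `!`)."""
    return any(glob_to_regex(g).fullmatch(path) for g in exclusions)


# Hypothetical files, relative to the package root (package-a in the comment above).
top_level = ".vercel/output/config.json"
nested = "package-1/.vercel/output/config.json"

for globs in ([".vercel/**"], ["**/.vercel/**"]):
    print(globs, is_excluded(top_level, globs), is_excluded(nested, globs))
```

In this sketch `.vercel/**` only covers a `.vercel` directory at the package root, so the nested one still counts toward task inputs. A pattern that allows leading directories covers both.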