vercel / turborepo

Build system optimized for JavaScript and TypeScript, written in Rust
https://turbo.build/repo/docs
MIT License
26.21k stars 1.81k forks source link

Turbo runs fewer tasks than it should and reports questionable summaries #8088

Closed JavaScriptBach closed 5 months ago

JavaScriptBach commented 5 months ago

Verify canary release

Link to code that reproduces this issue

N/A

What package manager are you using / does the bug impact?

Yarn v2/v3/v4 (node_modules linker only)

What operating system are you using?

Linux

Which canary version will you have in your reproduction?

1.13.2

Describe the Bug

I'm using Turbo in my CI pipeline and I just had a build where Turbo ran 50 fewer tasks than it was supposed to, reported some questionable numbers, and exited successfully.

Command I ran: npx turbo run typecheck knip lint circular --continue --concurrency=16 --cache-dir=.turbo-cache

The output I got:

• Packages in scope: <List of 61 packages, omitted for brevity>
• Running typecheck, knip, lint, circular in 61 packages
• Remote caching disabled
<Output of 162 tasks, omitted for brevity>

 Tasks:    71 successful, 162 total
Cached:    23 cached, 162 total
  Time:    1m9.917s 

Expected Behavior

When Turbo is behaving correctly, it gives a summary like this:

• Packages in scope: <List of 61 packages, omitted for brevity>
• Running typecheck, knip, lint, circular in 61 packages
• Remote caching disabled
<Output of 212 tasks, omitted for brevity>

 Tasks:    212 successful, 212 total
Cached:    32 cached, 212 total
  Time:    11m41.11s 

To Reproduce

None yet, but I can upload a summary the next time it happens. Let me know if there's anything else that would be useful to include.

Additional context

No response

Isokaeder commented 5 months ago

I just noticed a similar issue in our project.

In my case:

OS

MacOS

package manager

yarn v1

canary version

v1.13.4-canary.0

Description:

I only have our actual repo and branch where the issue occurred. It is not a minimal example by any means, but won't have time soon to come up with one.

On the branch in question a eslint-config project is introduced. It has a turbo.json as all of our packages. It looks like this:

{
    "$schema": "https://turbo.build/schema.json",
    "extends": ["//"],
    "pipeline": {
        "build": {
            "inputs": ["./scripts/**", "./source/**", "tsconfig.*"],
            "outputMode": "new-only",
            "outputs": ["./dist/**"]
        }
    }
}

However, changes to ./source/index.ts would not cause a cache miss, the package did not build without passing a --force flag.

After investigating this a little bit, the one noteworthy thing is that I ran yarn run turbo run build --filter=eslint-config --summarize before and after changing the file (by commenting out notable chunks of the code). For both json files the hashes for index.ts file are the same. That reads to me that for whatever reason turbo did not pick up on the file change.


If I can help in any way to investigate this further, I am glad to do so :-)

Isokaeder commented 5 months ago

My issue seems to be fixed by omitting all leading ./s for globs in the config file. These are my changes:

{
    "$schema": "https://turbo.build/schema.json",
    "extends": ["//"],
    "pipeline": {
        "build": {
-           "inputs": ["./scripts/**", "./source/**", "tsconfig.*"],
+           "inputs": ["scripts/**", "source/**", "tsconfig.*"],
            "outputMode": "new-only",
-           "outputs": ["./dist/**"]
+           "outputs": ["dist/**"]
        }
    }
}

Here is the commit

That being said, it is weird that the summaries list the same files before and after this change.


I should add that the only reason I tried this is that I noticed all globs in the config are formatted like this. IMO this goes against the intuition that the docs transport when they state:

inputs globs must be specified as relative paths rooted at the package's directory.

anthonyshew commented 5 months ago

@Isokaeder Looking at the repo you've posted, it appears the @3yourmind/eslint-config is missing in the dependency list for other packages. For instance, I don't see it here or here. Without those dependency relationships, Turborepo won't know that it should miss cache since those packages don't have any relationship with the ESLint configuration package.

There also could be a genuine bug here from the original report but there's no reproduction for us to look at there. But, as far as the repro that's linked from @Isokaeder goes, I'm not sure that we have a genuine globbing bug based on what I'm seeing so far.

Re: The docs globbing paths description @Isokaeder has quoted above

That does look to be unclear when thinking about Package Configurations and I'll give that more clarity in our upcoming docs rewrite. Thank you for mentioning that. 👍

FlorianWendelborn commented 5 months ago

@anthonyshew Thanks for looking into this. This package isn’t really a dependency of the other packages per se. We basically use it like this for our yarn run/turbo run check:eslint task: https://github.com/3YOURMIND/kotti/blob/1930ed35f838feecc39f880d4f22892ca40d30c3/turbo.json#L18-L23

For this task, turbo also correctly resolves the dependencies:

$ turbo run check:eslint --graph
digraph {
        compound = "true"
        newrank = "true"
        subgraph "root" {
                "[root] @3yourmind/documentation#check:eslint" -> "[root] @3yourmind/eslint-config#build"
                "[root] @3yourmind/documentation#check:eslint" -> "[root] @3yourmind/kotti-ui#build"
                "[root] @3yourmind/documentation#check:eslint" -> "[root] @3yourmind/yoco#build"
                "[root] @3yourmind/eslint-config#build" -> "[root] ___ROOT___"
                "[root] @3yourmind/eslint-config#check:eslint" -> "[root] @3yourmind/eslint-config#build"
                "[root] @3yourmind/kotti-ui#build" -> "[root] @3yourmind/kotti-ui#build:tokens"
                "[root] @3yourmind/kotti-ui#build" -> "[root] @3yourmind/kotti-ui#build:vite"
                "[root] @3yourmind/kotti-ui#build" -> "[root] @3yourmind/kotti-ui#build:vue-tsc"
                "[root] @3yourmind/kotti-ui#build" -> "[root] @3yourmind/vue-use-tippy#build"
                "[root] @3yourmind/kotti-ui#build" -> "[root] @3yourmind/yoco#build"
                "[root] @3yourmind/kotti-ui#build:tokens" -> "[root] ___ROOT___"
                "[root] @3yourmind/kotti-ui#build:vite" -> "[root] @3yourmind/vue-use-tippy#build"
                "[root] @3yourmind/kotti-ui#build:vite" -> "[root] @3yourmind/yoco#build"
                "[root] @3yourmind/kotti-ui#build:vue-tsc" -> "[root] @3yourmind/vue-use-tippy#build"
                "[root] @3yourmind/kotti-ui#build:vue-tsc" -> "[root] @3yourmind/yoco#build"
                "[root] @3yourmind/kotti-ui#check:eslint" -> "[root] @3yourmind/eslint-config#build"
                "[root] @3yourmind/kotti-ui#check:eslint" -> "[root] @3yourmind/vue-use-tippy#build"
                "[root] @3yourmind/kotti-ui#check:eslint" -> "[root] @3yourmind/yoco#build"
                "[root] @3yourmind/vue-use-tippy#build" -> "[root] ___ROOT___"
                "[root] @3yourmind/vue-use-tippy#check:eslint" -> "[root] @3yourmind/eslint-config#build"
                "[root] @3yourmind/yoco#build" -> "[root] ___ROOT___"
                "[root] @3yourmind/yoco#check:eslint" -> "[root] @3yourmind/eslint-config#build"
                "[root] root#check:eslint" -> "[root] @3yourmind/eslint-config#build"
        }
}

graphviz


Here I am on the commit before our switch away from ./ (https://github.com/3YOURMIND/kotti/commit/1930ed35f838feecc39f880d4f22892ca40d30c3):

image

As you can see, it did not understand that the eslint-config changed (I added a comment at the end) and assumed it’s supposed to be cached/already on disk

Switching back to the version without ./ it works as expected:

image

JavaScriptBach commented 5 months ago

My issue is separate from @Isokaeder and @FlorianWendelborn's, which seems to be about caching. Can we move that discussion into a separate thread?

The issue that I'm running into is that Turbo sometimes runs fewer tasks than it should. It's not a question of caching; those tasks just aren't even run, and no output from them is listed.

JavaScriptBach commented 5 months ago

@anthonyshew OK, I got a Turbo run summary from a failed run. Most of the task IDs are sensitive because they contain the names of our services, but here's a small snippet to help illustrate the problem: sanitized.json

In it, we can see that

  1. I ran turbo run typecheck find-unused lint circular --continue, so I expect this to run the typecheck job in every project.
  2. server-common#typecheck was listed in the dependents field of several jobs that were ran.
  3. server-common#typecheck itself was never ran. It never shows up as a taskId in the summary.

Looking at the execution field, I also see interesting things:

  "execution": {
    "command": "turbo run typecheck find-unused lint circular --continue",
    "repoPath": "",
    "success": 103,
    "failed": 0,
    "cached": 32,
    "attempted": 165,
    "startTime": 1715029673168,
    "endTime": 1715029831790,
    "exitCode": 0
  },
server-common:lint
  cache miss, executing 8140116d8569dc58

  Tasks:    135 successful, 165 total
 Cached:    32 cached, 165 total
   Time:    2m38.622s 

However, server-common#lint was also not one of the tasks run in the summary.

To summarize, it seems like there are 2 issues:

  1. Turbo is not finding all the tasks to run (e.g. server-common#typecheck). This results in it reporting an incorrect number for attempted.
  2. Turbo claims to run some tasks, but exits successfully before running them (e.g. server-common#lint). This results in success + failed + cached being less than attempted.

Let me know 1) if you agree that this is a bug in Turbo and 2) if more information would help to triage and fix this.

JavaScriptBach commented 5 months ago

Re-opening, closed by accident.

Isokaeder commented 5 months ago

@JavaScriptBach just my two cents as an external, I looked into this a little bit out of curiosity and it seems that the numbers not adding up is somewhat expected.

Look at this unit test that basically tests that those numbers are created correctly.

So I would assume that the tasks that are not run are cancelled for some reason, a "turbo internal reason", perhaps. In case you can reproduce this reliably, maybe looking at internal logs via --verbosity will be interesting.

JavaScriptBach commented 5 months ago

I got a debug log of a failure and I see these lines at the end, suggesting that Turbo is interrupting jobs instead of waiting for them to finish:

2024-05-07T22:12:12.966+0000 [DEBUG] turborepo_lib::process::child: waiting for child 12207
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: stopping child process
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: starting shutdown
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: sending SIGINT to child 10417
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: waiting for child 10417
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: stopping child process
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: starting shutdown
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: sending SIGINT to child 11262
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: stopping child process
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: starting shutdown
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: sending SIGINT to child 11817
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: waiting for child 11817
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: stopping child process
2024-05-07T22:12:12.967+0000 [DEBUG] turborepo_lib::process::child: starting shutdown

Under what circumstances is Turbo expected to do this?

anthonyshew commented 5 months ago

@FlorianWendelborn @Isokaeder The graph you've provided shows what I'm describing. You have a misconfiguration since your ESLint configuration package is not a part of the dependency tree for the rest of the packages you want to invalidate cache for. If you'd like, please make a Discussion and the community may be able to provide some help.

@JavaScriptBach That's looking suspicious. I'm interested to see what more we can learn. A couple questions as we start looking at this:

JavaScriptBach commented 5 months ago

@anthonyshew I'm using Yarn 4.2.1. I'll work on getting a repro in canary.

edit: I was able to reproduce this issue in Turbo 1.13.3, though.

chris-olszewski commented 5 months ago

@JavaScriptBach could you try 1.13.4-canary.2 and see if the problem persists?

JavaScriptBach commented 5 months ago

Thanks @chris-olszewski. I installed this on my CI, and a small percentage of the time, I now get:

 x internal errors encountered: external process killed a task

Followed by Turbo exiting unsuccessfully.

chris-olszewski commented 5 months ago

@JavaScriptBach

Okay, this indicates that something outside of turbo sent a kill signal to one of the child processes. Previously, I've seen this when there's some OOM killer on a machine that kills processes that use up too much resources.

I'm sorry I can't offer much more help for debugging, but we don't get any information from the process exit status about what killed it.

chris-olszewski commented 5 months ago

Closing as original issue @JavaScriptBach brought up has been resolved.

@Isokaeder Please open a new issue for your problem if needed.