vercel / turborepo

Build system optimized for JavaScript and TypeScript, written in Rust
https://turbo.build/repo/docs
MIT License
26k stars · 1.79k forks

discovery failed: bad grpc status code: The operation was cancelled #8491

Open dBianchii opened 3 months ago

dBianchii commented 3 months ago

Verify canary release

Link to code that reproduces this issue

https://github.com/dBianchii/kodix-turbo

What package manager are you using / does the bug impact?

pnpm

What operating system are you using?

Linux

Which canary version will you have in your reproduction?

turbo 2.0.3

Describe the Bug

When trying to run turbo watch dev -F @kdx/kdx... for my project, I get this error:

gabriel@gabriel-ubuntu:~/Documents/Github/kodix-turbo$ turbo watch dev -F @kdx/kdx...
• Packages in scope: @kdx/api, @kdx/auth, @kdx/date-fns, @kdx/dayjs, @kdx/db, @kdx/eslint-config, @kdx/kdx, @kdx/locales, @kdx/prettier-config, @kdx/react-email, @kdx/shared, @kdx/tailwind-config, @kdx/tsconfig, @kdx/ui, @kdx/upstash-dev, @kdx/validators
• Running dev in 16 packages
• Remote caching disabled
  × failed to connect to daemon
  ╰─▶ server is unavailable: channel closed

Expected Behavior

Should run the project in watch mode without any problems

To Reproduce

Clone https://github.com/dBianchii/kodix-turbo and run pnpm dev:kdx after running pnpm i

Additional context

I tested it on 2.0.4-canary.4

dBianchii commented 3 months ago

I also get this error sometimes:

× failed to connect to daemon
  ╰─▶ server is unavailable: channel closed

But after running turbo daemon clean, the error goes back to "bad grpc status code".

aaronadamsCA commented 2 months ago

Same version, same issue:

$ pnpm exec turbo watch typecheck lint:fix
• Packages in scope: //, ...
• Running typecheck, lint:fix in 27 packages
• Remote caching enabled
  × discovery failed: bad grpc status code: The operation was cancelled

Apparently it's not reliably reproducible, as I'm currently running watch mode without issue. I have no idea what changed.

NicholasLYang commented 2 months ago

Interesting. Could you run watch mode with debug verbosity, i.e. add -vv to the invocation? And also if you could send over your daemon logs (run turbo daemon logs), that'd be helpful too!

ericanderson commented 2 months ago

I see the same thing in my repo. Some notes:

Hope this helps. Thanks for your hard work

ericanderson commented 2 months ago

Also with many -v's:

2024-06-18T10:19:58.448-0400 [DEBUG] turborepo_lib::daemon::connector: looking for pid in lockfile: AbsoluteSystemPathBuf("/var/folders/qx/[redact]/T/turbod/[redact]/turbod.pid")
2024-06-18T10:19:58.453-0400 [TRACE] log: deregistering event source from poller
  × failed to connect to daemon
  ╰─▶ server is unavailable: channel closed

ericanderson commented 2 months ago

I tried with the Kodix repo from the OP and it worked on the same drive as /var, but failed on my dev drive. That's your repro, I think.

gegenschall commented 2 months ago

I have the very same issue with watch. I'm running turbo inside a Docker container. Logs I can see:

2024-06-21T10:53:34.000+0000 [DEBUG] turborepo_lib::daemon::connector: looking for pid in lockfile: AbsoluteSystemPathBuf("/tmp/turbod/f53b52ad6d21cceb/turbod.pid")
2024-06-21T10:53:34.000+0000 [DEBUG] turborepo_lib::daemon::connector: found pid: 55
2024-06-21T10:53:34.000+0000 [DEBUG] turborepo_lib::daemon::connector: got daemon with pid: 55
2024-06-21T10:53:34.001+0000 [DEBUG] turborepo_lib::daemon::connector: creating AbsoluteSystemPath("/tmp/turbod/f53b52ad6d21cceb")
2024-06-21T10:53:34.001+0000 [DEBUG] turborepo_lib::daemon::connector: watching AbsoluteSystemPath("/tmp/turbod/f53b52ad6d21cceb")
2024-06-21T10:53:34.001+0000 [DEBUG] turborepo_lib::daemon::connector: creating AbsoluteSystemPath("/tmp/turbod/f53b52ad6d21cceb")
2024-06-21T10:53:34.001+0000 [DEBUG] turborepo_lib::daemon::connector: watching AbsoluteSystemPath("/tmp/turbod/f53b52ad6d21cceb")
2024-06-21T10:53:34.001+0000 [DEBUG] turborepo_lib::daemon::connector: connecting to socket: /tmp/turbod/f53b52ad6d21cceb/turbod.sock
2024-06-21T10:53:34.000+0000 [DEBUG] turborepo_lib::daemon::connector: socket error: transport error

Cypher1 commented 2 months ago

I'll add that even with --no-daemon, the channel is always reported as 'closed' for me:

turbo --no-daemon watch typecheck
• Packages in scope: @<org>/<pkg1>, @<org>/<pkg2>, @<org>/<pkg3>, @<org>/<pkg4>, @<org>/<pkg5>
• Running typecheck in 5 packages
• Remote caching disabled
  × failed to connect to daemon
  ╰─▶ server is unavailable: channel closed
❯ turbo daemon logs
2024-06-22T04:05:32.604752Z ERROR turborepo_lib::daemon::server: package changes stream closed: channel closed
2024-06-22T04:27:16.227724Z  WARN turborepo_filewatch: encountered error watching filesystem enumerating recursive watch: IO error for operation on /home/jaypratt/Code/skfl/.git/.gitstatus.kKDIEm/a/1: No such file or directory (os error 2)
2024-06-22T04:27:18.363161Z ERROR turborepo_lib::daemon::server: package changes stream closed: channel closed
2024-06-22T04:27:40.715940Z ERROR turborepo_lib::daemon::server: package changes stream closed: channel closed
2024-06-22T04:27:51.961664Z ERROR turborepo_lib::daemon::server: package changes stream closed: channel closed

And then, if I manually clean

❯ turbo daemon clean                 
Done
❯ turbo watch typecheck            
• Packages in scope: ...
• Running typecheck in 5 packages
• Remote caching disabled
  × discovery failed: bad grpc status code: The operation was cancelled

FWIW, non-watch commands succeed:

❯ turbo typecheck
....successful logs....

❯ echo "$?"
0

Cypher1 commented 2 months ago

Okay! Progress:

It looks like turbo watch <job> is quitting if any of the jobs fail!

So, if I add extra jobs of the form

// in package.json
"typecheckTrue": "npm run typecheck || true",

// in turbo.json
"typecheckTrue": { ... copy of the definition for "typecheck" },

I get the expected behaviour! (except that I get ticks for every item in the watch view)
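The workaround above leans on shell short-circuiting: `|| true` runs only when the wrapped command fails, and `true` always exits 0, so turbo watch never sees the failure. A minimal sketch of the mechanism (nothing here is turbo-specific; the failing command is a stand-in for a failing task):

```shell
# A failing command normally propagates its non-zero exit code.
# With `|| true`, the right-hand side runs on failure and exits 0,
# so the pipeline as a whole reports success.
sh -c 'exit 1' || true
echo "wrapped exit code: $?"   # prints "wrapped exit code: 0"
```

The trade-off, as noted above, is that the task always shows as successful in the watch view, even when it actually failed.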

aaronadamsCA commented 2 months ago

I believe I may have now worked around the issue by installing the Turborepo LSP extension for Visual Studio Code. I can't say for sure that's why turbo watch has begun to work reliably for me, but it seems like the most likely answer, and worth a shot for anyone else who uses Code.

Mati365 commented 2 months ago

I have a similar issue, but it always happens after the first run of the watch command. I can't reproduce it on a second run, for example.

NicholasLYang commented 2 months ago

Gotcha. I'll try to reproduce and fix these issues. I will note that turbo watch is not compatible with --no-daemon, since we use the daemon to watch your files and produce change events.

If I have it correct, there are a few issues at hand here:

Is that an accurate summary?

ericanderson commented 2 months ago

For mine, I am running on macOS. When I said my "dev" drive, I meant my literal other drive/partition, mounted at /Volumes/git. For me this is important because it's case-sensitive (and a faster volume than my normal drive). I have no problems if I run from my home directory, which is on the same volume/partition as /var. I get the errors when running from /Volumes/git, which is a different partition/volume/mount point.
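Since several reports here hinge on the repo living on a different volume than the daemon's socket/temp directory, a quick hypothetical diagnostic (not part of turbo) is to compare device IDs of the two paths; differing IDs mean different filesystems:

```shell
#!/bin/sh
# Hypothetical diagnostic, not part of turbo: report whether the repo and
# the daemon's temp dir live on the same filesystem by comparing device IDs.
# GNU stat (Linux) uses -c, BSD stat (macOS) uses -f.
dev_of() { stat -c %d "$1" 2>/dev/null || stat -f %d "$1"; }

repo_dev=$(dev_of ".")
tmp_dev=$(dev_of "${TMPDIR:-/tmp}")

if [ "$repo_dev" = "$tmp_dev" ]; then
  echo "same filesystem"
else
  echo "different filesystems"
fi
```

Run it from the repo root; "different filesystems" would match the failing setups described in this thread (repo on /Volumes/git, daemon state under /var or /tmp).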

uncvrd commented 2 months ago

I've run into this fairly reliably (though not consistently enough for a repro) when I leave turbo watch [cmds] running in my terminal and my computer goes to sleep. If I wake up my computer and try to make a change to my code, I receive

unable to connect to daemon after 3 retries

My command is turbo watch dev pkg-dev

Which does the following in my monorepo:

If I try to run my turbo watch [cmd] again after this fails, all my processes in the terminal just hang. I have to go to Activity Monitor, where I normally see 2 node processes still active that I have to force quit. I can then run my command again and it works.

The turbo daemon logs don't seem too helpful at the moment, but here you go:

2024-06-25T08:23:41.397334Z  WARN turborepo_lib::package_changes_watcher: hashes are the same, no need to rerun
2024-06-25T08:23:41.551849Z  WARN turborepo_lib::package_changes_watcher: changed_files: {AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/20.pack.gz_")}
2024-06-25T08:23:41.551857Z  WARN turborepo_lib::package_changes_watcher: changed_packages: Ok(Some({WorkspacePackage { name: Other("foundry-client"), path: AnchoredSystemPathBuf("apps/web") }}))
2024-06-25T08:23:41.551987Z  WARN turborepo_lib::package_changes_watcher: hashes are the same, no need to rerun
2024-06-25T08:23:42.552668Z  WARN turborepo_lib::package_changes_watcher: changed_files: {AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/20.pack.gz"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/8.pack.gz"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/12.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/8.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/index.pack.gz.old"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/index.pack.gz"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/20.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/5.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/0.pack.gz"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/0.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/12.pack.gz"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/index.pack.gz_"), AnchoredSystemPathBuf("apps/web/.next/cache/webpack/client-development/5.pack.gz")}
2024-06-25T08:23:42.552677Z  WARN turborepo_lib::package_changes_watcher: changed_packages: Ok(Some({WorkspacePackage { name: Other("foundry-client"), path: AnchoredSystemPathBuf("apps/web") }}))
2024-06-25T08:23:42.565069Z  WARN turborepo_lib::package_changes_watcher: hashes are the same, no need to rerun
2024-06-25T17:14:53.505513Z  WARN turborepo_lib::package_changes_watcher: changed_files: {AnchoredSystemPathBuf("packages/trpc/src/scripts/migrate.ts")}
2024-06-25T17:14:53.505556Z  WARN turborepo_lib::package_changes_watcher: changed_packages: Ok(Some({WorkspacePackage { name: Other("@foundrydev/trpc"), path: AnchoredSystemPathBuf("packages/trpc") }}))
2024-06-25T17:14:53.954880Z ERROR turborepo_lib::daemon::server: package changes stream closed: channel closed

Hopefully this helps a little bit

warjiang commented 2 months ago

turbo daemon clean

@Cypher1 You're right. Running npx turbo daemon logs, I also see similar logs (screenshot omitted).

Then run npx turbo daemon clean to clean the daemon manually. After that, watch mode works! 🚀

amarpatel commented 2 months ago

[...] npx turbo daemon clean to clean the daemon manually. After that, watch mode works! 🚀

Just experienced this; running npx turbo daemon clean allowed me to start a turbo watch job again 🤝

NicholasLYang commented 2 months ago

@uncvrd, how long are you letting your computer sleep? We do have an inactivity timeout for the daemon, so if you're leaving watch mode open for multiple hours, it's expected that the daemon should stop.

uncvrd commented 2 months ago

@NicholasLYang it would be overnight. In v1 versions of turbo I could just leave dev mode running and continue where I left off the next morning.

I've updated to the latest version of turbo, v2.0.6, and this morning I keep randomly getting the same error (screenshot omitted).

The logs from the corresponding file change show

2024-07-03T19:02:30.688061Z  WARN turborepo_lib::package_changes_watcher: changed_files: {AnchoredSystemPathBuf("apps/web/components/Table/FileNameCell.tsx")}
2024-07-03T19:02:30.688125Z  WARN turborepo_lib::package_changes_watcher: changed_packages: Ok(Some({WorkspacePackage { name: Other("foundry-client"), path: AnchoredSystemPathBuf("apps/web") }}))
2024-07-03T19:02:32.986589Z  WARN turborepo_lib::package_changes_watcher: changed_files: {AnchoredSystemPathBuf("apps/web/.next/trace")}
2024-07-03T19:02:32.986599Z  WARN turborepo_lib::package_changes_watcher: changed_packages: Ok(Some({WorkspacePackage { name: Other("foundry-client"), path: AnchoredSystemPathBuf("apps/web") }}))
2024-07-03T19:02:32.986796Z  WARN turborepo_lib::package_changes_watcher: hashes are the same, no need to rerun

Not sure if helpful though.

I really wish I could understand what triggers it, but it has happened twice in the last few hours. Normally my changes are hot reloaded when I update a file, but other times I get this cancelled error. If I can reliably repro it, I'll share a CodeSandbox, but nothing yet... :/

This is the turbo dev command I run for my Next.js app:

"dev": {
  "cache": false,
  "env": ["VAULT_ADDR", "VAULT_ENV", "VAULT_TOKEN"],
  "persistent": true
},

Which runs this npm script:

"dev": "NODE_OPTIONS=\"--max_old_space_size=6144\" next dev -H foundry-dev.ac",

Okay, a hypothesis that might help... maybe?

Since my turbo task is defined as persistent, I find it interesting that the turbo daemon logs record file changes at all. Shouldn't file changes be ignored by turbo watch if persistent is true?

NicholasLYang commented 2 months ago

So the way turbo watch works is that it separates out your persistent tasks and runs them normally. File changes are only used to restart non-persistent tasks. If you're only running persistent tasks, i.e. dev, then you might not need watch mode. Regular turbo run should work just fine.

Nonetheless, there does seem to be an issue with the daemon either crashing or ending up in an invalid state. Continuing to investigate that...

uncvrd commented 2 months ago

Hey @NicholasLYang, thanks for the response! I have a previous comment above that mentions I use tsup to watch and rebuild my packages, so I do need watch mode:

https://github.com/vercel/turbo/issues/8491#issuecomment-2189803724

No worries though; let me know if there's anything I can test on my end to help.

NicholasLYang commented 2 months ago

Ah gotcha, sorry for missing that. I'm investigating the watch issues right now. Do you happen to have an open source repository that I can use to test?

uncvrd commented 2 months ago

All good! Since I haven't been able to repro consistently, I unfortunately have nothing to provide. Sorry :/ If I can lock it down, then yes, I'd be happy to provide a repro.

JDaoMothership commented 1 month ago

+1 for this happening intermittently as well. Hard to get a true repro since it is pretty random 😭

dobrac commented 1 month ago

+1 from me as well. I don’t know how to reproduce it either :/ For now, I might just add the state when it happens:

knowlesc commented 1 month ago

+1 from us also. Sadly unable to find a way to reliably reproduce, but it is causing significant disruptions in our development workflow.

Similar constraints to what @dobrac mentioned above. If we find out anything else, I'll add it here, but for now we aren't sure what to do other than just deal with it.

sanjeethboddi commented 1 month ago

Any updates on this?

nimeshmaharjan1 commented 1 month ago

man this is just boring

PatrykKuniczak commented 4 weeks ago

It's so freaking annoying, let's do something about this, please.

sttuartt commented 4 weeks ago

We are seeing this issue 100% of the time we run turbo watch dev on a Mac case-sensitive volume.

This only happens for us on a Mac case-sensitive volume.

If the repo is copied to a non case-sensitive volume, this error does not occur.

Looking forward to a patch for this.

ericanderson commented 4 weeks ago

My issue is on a Mac with a case-sensitive volume as well.

PatrykKuniczak commented 4 weeks ago

I have it on Windows 11.

njbair commented 3 weeks ago

+1 on Mac on case-sensitive volume; works fine when copied to the system drive.

sanjeethboddi commented 3 weeks ago

+1 on Mac on case-sensitive volume; works fine when copied to the system drive.

polvallverdu commented 1 week ago

+1. Same result on Windows with WSL and on Linux (Linux Mint 22), using the same project on both OSes.


Edit: I managed to fix it. I'm using a Docker Compose file to spin up a DB locally, and I was saving the data to ./data in the same folder as the monorepo. After running docker compose up, everything broke. Running docker compose down -v to delete everything (including volumes), removing the entire data folder (as root), cleaning the project (deleting all .turbo/ and node_modules directories), and cleaning the turbo daemon seems to have solved it for me.

PatrykKuniczak commented 1 week ago

Another month with this error, hmm.