vercel / turborepo

Build system optimized for JavaScript and TypeScript, written in Rust
https://turbo.build/repo/docs
MIT License

[turborepo] cache hits after sending node_modules and .turbo folders to trash #5087

Closed dipunm closed 8 months ago

dipunm commented 1 year ago

What version of Turborepo are you using?

1.9.9

What package manager are you using / does the bug impact?

pnpm

What operating system are you using?

Mac

Describe the Bug

If node_modules and .turbo are deleted using macOS Finder, turbo will somehow still match the cache and skip execution of tasks. The same happens when deleting via VSCode.

If they are deleted from the command line with rm -rf node_modules .turbo, everything works as expected.

Expected Behavior

I expect that after deleting the files, the cache is empty and needs to be recreated.

To Reproduce

I created a very simple, very empty project with no dependencies or source files:

Files:

.gitignore
CODEOWNERS
package.json
pnpm-lock.yaml
README.md
turbo.json

turbo.json:

{
    "$schema": "https://turbo.build/schema.json",
    "pipeline": {
        "build": {
            "outputs": ["dist/**"]
        },
        "lint": {},
        "test": {}
    }
}

package.json scripts:

  "scripts": {
    "build": "echo 'turbo::build'",
    "test": "echo 'turbo::test'",
    "lint": "echo 'turbo::lint'"
  }

Steps:

  1. turbo lint test
  2. Cache miss, new cache files are created
  3. rm -rf node_modules .turbo
  4. turbo lint test
  5. Cache miss, new cache files are created
  6. Delete node_modules and .turbo using Finder (using macOS Ventura 13.3.1 (22E261))
  7. turbo lint test
  8. Cache hit!
  9. Empty trash using Finder
  10. Cache hit!

The easiest way to get out of this weird situation is to change the turbo.json file to invalidate the cache, and from then on to delete only with rm -rf node_modules .turbo from the command line.

Another note: When the cache hits after deleting the files, since the original logs are deleted, the output is empty:

otjs-libraries (main) ✗ turbo lint test
• Running lint, test
• Remote caching disabled
lint: Skipping cache check for //#lint, outputs have not changed since previous run.
lint: cache hit, replaying output 255ca524bc7cd422
test: Skipping cache check for //#test, outputs have not changed since previous run.
test: cache hit, replaying output ab2c5da517caa23e
test: 
lint: 
lint: > otjs-libraries@1.0.0 lint /Users/dmistry/Projects/opentable/otjs-libraries/otjs-libraries
lint: > echo 'turbo::lint'
lint: 
lint: turbo::lint
test: > otjs-libraries@1.0.0 test /Users/dmistry/Projects/opentable/otjs-libraries/otjs-libraries
test: > echo 'turbo::test'
test: 
test: turbo::test

 Tasks:    2 successful, 2 total
Cached:    2 cached, 2 total
  Time:    101ms >>> FULL TURBO

 otjs-libraries (main) ✗ turbo lint test
• Running lint, test
• Remote caching disabled
test: Skipping cache check for //#test, outputs have not changed since previous run.
test: cache hit, replaying output ab2c5da517caa23e
lint: Skipping cache check for //#lint, outputs have not changed since previous run.
lint: cache hit, replaying output 255ca524bc7cd422

 Tasks:    2 successful, 2 total
Cached:    2 cached, 2 total
  Time:    115ms >>> FULL TURBO

Reproduction Repo

No response

tknickman commented 1 year ago

While this may seem incorrect, this is actually expected behavior, and we do a lot of work behind the scenes to make sure that's true.

Before turbo even checks if it should restore from cache it checks two things (this is slightly simplified):

  1. Have the inputs changed?
  2. Do the outputs still exist?

If the inputs haven't changed, and the outputs still exist, turbo knows that it doesn't need to do anything, as re-running with the same inputs should produce the same outputs. So, it will output a cache hit with the message: outputs have not changed since previous run.

This is because turbo knows that your inputs have stayed the same, and the outputs still exist (even though the cache is gone).
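As a rough illustration of the two checks described above, here is a minimal Python sketch. It is a simplified model, not turbo's implementation: `TaskState`, its fields, and `decide` are hypothetical names, and the "restore from cache" and "cache miss" branches are assumed from the surrounding discussion.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical snapshot of what the check looks at; not a real turbo type."""
    inputs_changed: bool   # have the task's inputs changed since the last run?
    outputs_exist: bool    # does turbo believe the outputs are still on disk?
    cache_has_entry: bool  # does the cache hold an artifact for the current inputs?


def decide(state: TaskState) -> str:
    """Return the action for one task under the simplified two-question model."""
    if not state.inputs_changed and state.outputs_exist:
        # Same inputs and outputs still present: nothing to do. The cache is
        # never consulted on this path, so it may already have been deleted.
        return "skip cache check (reported as a cache hit)"
    if state.cache_has_entry:
        return "restore outputs from cache"
    return "run task (cache miss)"
```

In this model, `decide(TaskState(inputs_changed=False, outputs_exist=True, cache_has_entry=False))` takes the first branch, which matches the "outputs have not changed since previous run" message even though the cache directory is gone.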

To force turbo to get a cache miss:

  1. Remove the local cache
  2. Make sure remote cache is disabled
  3. Remove the outputs for the task in your workspaces (if the task doesn't have outputs, make sure the .turbo directory in your workspace is removed)
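
For steps 1 and 3, a small cleanup helper could look like the following Python sketch. The two directory locations are the ones mentioned in this thread and are an assumption for any other setup; `remove_turbo_state` and `CACHE_DIRS` are hypothetical names. It does not touch remote caching (step 2) or task outputs such as dist.

```python
import shutil
from pathlib import Path

# Locations taken from this thread; other turbo versions or configs may differ.
CACHE_DIRS = ("node_modules/.cache/turbo", ".turbo")


def remove_turbo_state(workspace: Path) -> list[Path]:
    """Delete the local cache and .turbo directory under one workspace root.

    Returns the directories that were actually removed.
    """
    removed = []
    for relative in CACHE_DIRS:
        target = workspace / relative
        if target.is_dir():
            # Removes the tree in place, like rm -rf (nothing is moved to the Trash).
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Calling `remove_turbo_state(Path("."))` from the repository root removes both directories if they exist and returns an empty list on a second call.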

dipunm commented 1 year ago

Just to focus on the last part - To force turbo to get a cache miss:

  1. local cache is stored in node_modules afaik, this was sent to trash
  2. Remote caching was never turned on but I will look further into this to see if I need to take any extra steps for this.
  3. My tasks have no outputs and the .turbo directory was sent to trash.

I observed the same results even after emptying the trash. It kind of feels like a macOS bug, but I'd like to see if anyone else can reproduce this issue and/or if there are any known reasons for it.

tknickman commented 1 year ago

Apologies for the preemptive close! That does sound like something else might be going on; let me take a deeper look.

tknickman commented 1 year ago

Ok yea this is a weird one! I can reproduce this when deleting the files from my editor, but when using rm on the command line it works as expected. This is likely an issue with our file watcher that monitors inputs / outputs.

  1. rm -rf node_modules/.cache/turbo/ .turbo/ - works (cache miss)
  2. manually removing the same directories in my editor (vscode) results in a cache hit with missing outputs.

tknickman commented 1 year ago

cc @arlyon this looks to be daemon / file watching related, could you help take a look?

selbyk commented 1 year ago

I've experienced the same trying to write a build clean script for our monorepo. Thought I was going insane trying to find what else turbo was using in addition to **/node_modules/.turbo

jamesg1 commented 1 year ago

> Ok yea this is a weird one! I can reproduce this when deleting the files from my editor, but when using rm on the command line it works as expected. This is likely an issue with our file watcher that monitors inputs / outputs.
>
>   1. rm -rf node_modules/.cache/turbo/ .turbo/ - works (cache miss)
>   2. manually removing the same directories in my editor (vscode) results in a cache hit with missing outputs.

I'm getting the same. I cleared it by changing an output in turbo.json to something else, running build and exiting straight away, then changing it back; after that the cache is cleared.

It would probably be good to have an option to disable this extra layer of cache checks and just read off the .turbo folder cache.

gsoltis commented 1 year ago

@jamesg1 what version of turbo are you using?

By way of explanation, with a few too many details: macOS, when deleting from some applications including VSCode and Finder, treats the delete as a move (presumably to the Trash). One implementation of our file watching didn't have the appropriate flag set on macOS to be notified of moves along the path to the directory we were watching. This should not be the case with the latest patch (1.10.7 as of now), but if it is, please flag it.
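
To make the failure mode concrete, here is a toy Python model of it. This is only a sketch under stated assumptions: `ToyWatcher`, `notify_on_move`, and the event kinds are hypothetical, and nothing here is turbo's code or the macOS file-event API.

```python
from dataclasses import dataclass


@dataclass
class ToyWatcher:
    """Toy stand-in for a file watcher that tracks one directory."""
    watched_dir: str
    notify_on_move: bool  # hypothetical flag: are move events delivered at all?
    outputs_believed_present: bool = True

    def handle(self, kind: str, path: str) -> None:
        """Process one event; kind is "removed" or "moved"."""
        if kind == "moved" and not self.notify_on_move:
            return  # the move is never reported, so the state goes stale
        # An event on the watched directory or on any ancestor invalidates it.
        if self.watched_dir == path or self.watched_dir.startswith(path + "/"):
            self.outputs_believed_present = False
```

With `notify_on_move=False`, a "moved" event on the watched directory (the Finder or VSCode case) leaves `outputs_believed_present` set, which corresponds to the stale cache hit reported here, while a "removed" event (the rm -rf case) clears it. With the flag set, a move of an ancestor such as node_modules clears it as well.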

For other turbo options that may help with what you're trying to accomplish:

Lastly, if you are implementing a cache-cleaning script, would you mind describing what you are trying to accomplish with it? If it's something that seems like Turborepo should help with, we may be able to make it easier and/or implement one that stays up-to-date.

jamesg1 commented 1 year ago

@gsoltis currently on 1.9.9. Will upgrade and give it a go, thanks for giving that daemon option. I've got caching working fine on my CI pipeline; originally I just had force enabled until caching was complete there.

jamesg1 commented 1 year ago

@dipunm did you try the latest turborepo version? There was a fix for the issue.

RossMcMillan92 commented 1 year ago

Seeing similar here. I've run rm -rf node_modules **/.turbo **/node_modules, then pnpm i, and I'm still getting cache hits when I run scripts. Is there anywhere else that could be coming from?

arlyon commented 8 months ago

Hey folks, we have released the Rust version with v1.12, which came with a number of fixes. If this is still an issue with that version, feel free to open another issue, since the underlying code has been completely replaced.

Cheers!