microsoft / rushstack

Monorepo for tools developed by the Rush Stack community
https://rushstack.io/
Other
5.95k stars 602 forks source link

[heft] "heft clean" does not delete the .heft folder, sometimes producing an unfixable Git enlistment #2660

Open elliot-nelson opened 3 years ago

elliot-nelson commented 3 years ago

Summary

While working in the rushstack repo, I seem to get into a state where I cannot fix my workspace... I continually get what appears to be buggy typescript behavior in the rush-lib project. The only solution I can find is to delete the entire folder and re-clone from GitHub.

Repro steps

# Start in an existing workspace, or clone a fresh one
git checkout @microsoft/rush_v5.42.4
rush install
rush build

# Everything is good so far! Now let's return to master (in this case, a specific commit)
git checkout b8c4591c0da518becf1584a9d564bb2799f0e83d
rush install
rush build

# Everything still looks good! But, ALL of the projects rebuilt from cache.
# Let's force an actual compilation...
echo "//hello" >> apps/rush-lib/src/start.ts
rush build

==[ FAILURE: 1 project ]=======================================================

--[ FAILURE: @microsoft/rush-lib ]--------------------------[ 25.43 seconds ]--

  ● Git › normalizeGitUrlToHttps › correctly normalizes URLs

    TypeError: Git_1.Git.normalizeGitUrlForComparison is not a function

       7 |   describe('normalizeGitUrlToHttps', () => {
       8 |     it('correctly normalizes URLs', () => {
    >  9 |       expect(Git.normalizeGitUrlForComparison('invalid.git')).toEqual('invalid');
         |                  ^
      10 |       expect(Git.normalizeGitUrlForComparison('git@github.com:ExampleOrg/ExampleProject.git')).toEqual(
      11 |         'https://github.com/ExampleOrg/ExampleProject'
  ...122 lines omitted...
         |                        ^
      58 |     }
      59 |   });
      60 | });

      at Object.<anonymous> (src/cli/test/CommandLineHelp.test.ts:57:24)

[jest] Error: 2 Jest tests failed
Encountered 1 errors:
  [jest] 2 Jest tests failed

Projects failed to build.

At this point, your workspace is now "stuck" -- all attempts at building rush-lib seem to produce a bad, broken version of lib/logic/Git.js that is missing the mentioned method (normalizeGitUrlToHttps).

Ways I attempted to fix this problem:

Ways that did fix this problem:

Details

I am not sure what causes the problem or why it's broken. Because it seems to be misbehaving typescript, I wonder if moving between these specific versions in the repo cause a version of TypeScript to be installed that is incompatible with rushstack... (Possibly replace "TypeScript" with some other npm dependency.)

There may not be a way to prevent this type of bug, but I am interested in suggestions on possible ways to get out of this situation once you are in it, short of the nuclear option.

Standard questions

Please answer these questions to help us investigate your issue more quickly:

Question Answer
@microsoft/rush globally installed version? 5.44.0
rushVersion from rush.json? 5.44.0 (in master anyway)
useWorkspaces from rush.json? true
Operating system? Mac
Would you consider contributing a PR? Yes, if there's a bug to fix
Node.js version (node -v)? 12.17.0
octogonz commented 3 years ago

This did not repro on Windows.

octogonz commented 3 years ago

Does not repro on Linux either. When I get a chance I will dust off the MacBook and try it there.

octogonz commented 3 years ago

It also did not repro on Mac.

Of note, I did get this warning (which @iclanton has been investigating):

==[ @rushstack/eslint-patch ]====================================[ 3 of 105 ]==
"tar" exited with code 1 while attempting to create the cache entry. Rush will attempt to create the cache entry with a JavaScript implementation of tar.

Also this step did a full rebuild rather than restoring from the build cache:

# Everything is good so far! Now let's return to master (in this case, a specific commit)
git checkout b8c4591c0da518becf1584a9d564bb2799f0e83d
rush install
rush build

And thus this step did not encounter errors:

# Everything still looks good! But, ALL of the projects rebuilt from cache.
# Let's force an actual compilation...
echo "//hello" >> apps/rush-lib/src/start.ts
rush build
octogonz commented 3 years ago

@elliot-nelson I wonder if your problem is related to the Heft build cache. In the Rush Stack configuration, the <project-folder>/.heft folder is NOT cleaned by heft clean. For example after running heft clean you will find that the .heft/build-cache folder still exists.

This design assumes that every task which writes into the .heft folder is perfectly transactional and interoperable with every historical version of the tooling, which I've argued is wildly optimistic.

Could you test this theory by doing something like this:

# BE CAREFUL WITH COMMANDS THAT DELETE FOLDERS
$  find . -type d -name ".heft" -exec rm -Rf {} \;

(If that fixes your problem, maybe this is sufficient evidence to argue for heft clean to delete the entire .heft folder by default.)

elliot-nelson commented 3 years ago

Interesting...

In fact, that command did fix the problem. It deleted .heft in every subfolder, then rush build once again restored almost every project from cache (except for rush-lib, since it hadn't built successfully yet), and then rush-lib built successfully without the weird Git.js error.

I'm not sure why you didn't repro it on your Mac, but it does seem like something in apps/rush-lib/.heft is left there in version 5.42.4 and then breaking in master.

octogonz commented 3 years ago

Awesome, thanks.

octogonz commented 3 years ago

CC @iclanton @dmichon-msft

Elliot is pretty experienced with Node.js tools. If the build can get into a state that even he cannot figure out how to undo, that seems like a high priority bug.

octogonz commented 3 years ago

Fundamentally this seems like a design problem due to conflicting goals:

Proposed redesign

Maybe we should introduce a built-in command rush clean that is expected to do a pristine clean. Usage would be like this:

package.json

"scripts": {
  "build": "heft test",
  "clean": "heft clean"
}

For people who want to roll the dice and keep their .heft folder for a faster build, we would provide a rush clean --keep-caches parameter, which invokes heft clean --keep-caches. (We'd stipulate that the "clean" script is expected to accept this parameter, whether or not it is used.)

These 3 commands would be equivalent shorthands:

Also rush purge should always perform rush clean.

To ensure backwards compatibility with the old design, rush clean will delete package-deps.json (invalidating the incremental build state) and allows a missing "clean" script (for the old model where "build" was expected to clean).

With this redesign, Elliot's bad state would be fixed by doing rush install --purge and/or by doing rush clean.

elliot-nelson commented 3 years ago

To confirm the diagnosis, I started over from the top again and followed the steps (up to the point where I get the build error).

At that point, I can immediately fix the problem by running:

cd apps/rush-lib
heft clean --clear-cache
rush build

So it does seem like it's the heft cache. It also looks like heft itself does not have any kind of logic in it to clear the cache conditionally (maybe because the version that created it has changed, or the typescript compiler that produced the cached files has changed). That might be logic that's interesting to pursue, regardless of whether there's a "rush clean" feature.

dmichon-msft commented 3 years ago

There seem to be issues with failing to update the output files if they don't get cleaned during the build process, which is, for whatever reason, currently not default behavior of Heft.

iclanton commented 3 years ago

It'd be pretty easy to add the .heft/build-cache folder the list of default clean paths.

@dmichon-msft thinks that removing hardlinking to the .heft/build-cache folder may fix the problem. David - you mind taking a look into hardening this code a bit?