bitrise-steplib / bitrise-step-restore-npm-cache

0 stars 1 forks source link

Fallback for cache key can lead to unpredictable results and build failures #8

Open gcorne opened 3 months ago

gcorne commented 3 months ago

Troubleshooting

Useful information

Issue description

The restore NPM step falls back to any key that matches {{ .OS }}-{{ .Arch }}-npm-cache-, which can lead to unexpected behavior in NPM module resolution because extra dependencies may be in the node_modules tree.

Bitrise info

While not obvious, this build failed because node_modules/@shopify/eslint-plugin/node_modules/@typescript-eslint/scope-manager was present with an incompatible version that cause eslint to fail due to a cache hit.

ofalvai commented 3 months ago

Hello @gcorne, I see that your project uses yarn, version 1 specifically (I assume because that's what is installed on the stack by default).

We probably don't handle the Yarn 1 behavior correctly because we should cache the output of the yarn cache dir. Note to self: there is a different syntax in Yarn 2+.

In the meantime, I want to mention that we have corepack installed and enabled on the Ubuntu 22 stack, so if your package.json pins a specific Yarn version, corepack will automatically activate it behind the scenes.

gcorne commented 3 months ago

We probably don't handle the Yarn 1 behavior correctly because we should cache the output of the yarn cache dir. Note to self: there is a different syntax in Yarn 2+.

I don't think caching the global directory is super critical as its purpose is to speed up subsequent yarn install runs through leveraging the local copies of packages to avoid network overhead. That said, it could be valuable as a separate cache from the node_modules cache for a project.

The issue here isn't really related to yarn. It has to do with how node_modules is structured and module resolution works. In the case of the errors that I was seeing, I think the issue was that a build used the fallback key and ended up using the cache for a dependency update PR that resulted in a newer and incompatible version of a package showing up in the directory tree. In our case, there are multiple versions of the same dependency in yarn.lock.

node_modules/@shopify/eslint-plugin/node_modules/@typescript/scope-manager - v7.x (not in yarn.lock) node_modules/@typescript/scope-manager v6.x (in yarn.lock)

Due to how module resolution works, Node.js loaded v7.x not v6.x which was incompatible so exceptions were thrown and linting failed. This is why it is not safe to have a fallback that uses any NPM cache key.

ofalvai commented 3 months ago

Thank you for the example, I see what you mean.

You probably have more experience with Yarn than I do, so I'm curious if the following is how Yarn should work (at least in theory):

  1. Build VM starts with an empty state
  2. git-clone
  3. restore-npm-cache: restores a partially correct node_modules because of a fallback key
  4. yarn-install

At this point, shouldn't Yarn add the missing dependencies, as well as remove any dangling ones? I'd like to think about cache restoration like switching branches locally, where doing an npm install fixes my outdated state. And I know that sometimes I have to nuke the entire node_modules folder, but at least in theory, shouldn't the dependency manager handle any existing state and turn it into the one defined by the lockfile?

gcorne commented 3 months ago

At this point, shouldn't Yarn add the missing dependencies, as well as remove any dangling ones?

It would be nice if it worked that way reliably, but it doesn't with at least yarn@1.22.22. This is why rm -Rf node_modules is a not so uncommon solution to unexpected failures in the JavaScript world. Some of the more recent tools like pnpm are supposed to be better, but React Native until recently didn't support symlinks.