[Feature] Allow `changesetIgnoreOverrides` option for yarn workspaces list/foreach

zargold commented 2 years ago

[x] I'd be willing to implement this feature (contributing guide)
[x] This feature is important to have in this repository; a contrib plugin wouldn't do

Describe the user story

My current workflow is to run test on all files considered recursively changed by the yarn workspaces list command. In other words, I have workspaceA which is the grandparent to workspaceC (through workspaceB) if workspaceA changes I'd like to run tests on workspaceC and workspaceB. The following command identifies those files perfectly: yarn workspaces list -vR --since=HEAD~. In other words, this yields a list of workspace names which I can then send to yarn workspaces foreach.

In addition to wanting to run tests in all packages that depend on workspaceA if some super important global file changes like yarn.lock (because that means I'm using a new version of browserslist or jest or React or lodash etc) I would want to run yarn workspaces foreach on all files. Unfortunately, due to this (IMO undesirable line of code) changes to yarn.lock are ignored without me asking for it. If I change the README.md then yarn workspaces list -vR --since=HEAD~ correctly outputs . meaning the root changed, but if I change something way more important to any and all workspaces the yarn.lock the same behavior, unexpectedly, DOES NOT occur. (If the only file changed is yarn.lock the output is blank as in nothing changed since HEAD~ which is inaccurate but possibly serves someone a purpose.)

So while I'd love to say that this is a bug potentially many people would find my proposed change unnecessarily breaking: https://github.com/yarnpkg/berry/blob/20612e82d26ead5928cc27bf482bb8d62dde87d3/packages/plugin-workspace-tools/sources/commands/foreach.ts#L141 looks for potential root changes (yarn.lock is not in a workspace but is certainly part of the root).

// Note: yarn artifacts are excluded from workspace change detection
// as they can be modified by changes to any workspace manifest file.

The comment refers to arbitrarily ignoring yarn artifacts as they're not in a specific workspace manifest, but they should not actually be ignored as changed root files/rootCandidate:

const rootCandidates = this.since
      ? Array.from(await gitUtils.fetchChangedWorkspaces({ref: this.since, project}))

In fact we know that yarn.lock (or another yarn artifact) is a Root file because it is not in a workspace. But that doesn't necessarily make it less "changed" (aka less important for someone who just ran yarn workspaces list --since=HEAD~).

As a result of this feature to ignore changes in root files when those root files are yarn.lock, I will now have to manually determine if yarn.lock changed and if so consider that a root change that results in me wanting to run ALL tests.

Describe the solution you'd like

I would really want to remove

ppath.resolve(project.cwd, project.configuration.get(`lockfileFilename`)),

because the yarn.lock file is actually very important and if it changed then people probably want to consider this a root change, but I think that may be a breaking change to many reliant scripts.

It would be great to have an flag option added to yarn workspaces list and/or yarn workspaces foreach that can ignore the ignore list for changed files completely like --overrideIgnoreList will consider yarn artifacts to be root files which changed. Or even better, --overrideInclude={yarn.lock}. In other words, if you use the flag --overrideInclude={yarn.lock} you should expect that it will consider the root to have changed if yarn.lock changed. If you use --overrideIgnoreList you should expect that it will consider the "root" to have changed in foreach/list if any yarn artifact changed. Ideally, we could even add a flag like --exclude={} which would mean don't bother checking if this file changed (which would help when someone has let's say a CHANGELOG.md or README.md if those files changed despite being in the root or even in a specific workspace-- I may not need to re-run yarn run test within a repo if only the README changed or retest the whole repository because a global/root trivial file changed (as opposed to potentially breaking changes resulting from yarn.lock).

Describe the drawbacks of your solution

If anyone relied on ppath.resolve(project.cwd, project.configuration.get(lockfileFilename)), and we remove it that would be a breaking change they need to account for (potentially even yarn does rely on that in some base layer that I'm unaware of). But just adding a feature flag should be safe we can check the --overrideIgnoreList after the ignore list and add back files that were ignored and it won't affect any existing users.

Describe alternatives you've considered

I found it very misleading that yarn workspaces list --since=HEAD~ lyingly excluded root when I had in fact changed the root file and this was arbitrarily excluded from the command. It had changed since HEAD~ yet despite running it dozens of times it didn't work. One other alternative is to at least warn users about this behavior in the README for yarn workspaces list and foreach.

jdanil commented 2 years ago

Hi @zargold, I just thought I'd try to explain my thinking here when this was implemented. I was trying to optimise for what I thought would be the most common workflows when using change detection. Previously I'd used lerna in a similar way, but one thing I found frustrating was when I modified a single manifest file in the project (and ran install to update the lockfile) it would be detected as a change in the root workspace and consider all workspaces as having been affected. I felt this reduced the usefulness of the change detection when dependencies were being modified.

I think for most cases ignoring the lockfile should result in accurate change detection. Strictly (and especially when using node linkers like pnp or pnpm) dependencies should be declared in the workspaces they're used in. So workspaces impacted by a change should have a respective change in their own manifest files.

The situations I think this may not cover are projects that use the node modules linker and rely on hoisting to share dependencies. Although I'm not quite sure we should be reducing the efficiency of the change detection to handle these cases of unsafe access. Another situation I can think of is where a dependency might be declared in one workspace, but run via a global script in another (as this dependency is somewhat implicit). Lastly, if the lockfile was updated without changes to a manifest file (e.g. regenerating the lockfile or modifying it directly).

All that said, I wouldn't be opposed to making this configurable. As long as the default behaviour is still most useful for the majority.

Ideally, we could even add a flag like --exclude={} which would mean don't bother checking if this file changed (which would help when someone has let's say a CHANGELOG.md or README.md if those files changed despite being in the root or even in a specific workspace-- I may not need to re-run yarn run test within a repo if only the README changed or retest the whole repository because a global/root trivial file changed (as opposed to potentially breaking changes resulting from yarn.lock).

Not sure if you are aware, but you should be able to achieve this via changesetIgnorePatterns. Its probably not documented very well, but the workspaces list and workspaces foreach commands also look at this config to determine if a file change should be ignored or not.

zargold commented 2 years ago

OK hi @jdanil thank you for your very specific feedback and consideration--whenever I open a GitHub issue I always worry that it will just get lost in the sea, but you answered quickly so 🥇 to you/yarn team.

Totally agree that most of the yarn artifacts should have their changeset ignored by default, but yarn.lock is too risky for me to be there. I won't get too deep into the specifics, but there are many legitimate instances where you can update yarn.lock only (without updating a specific workspace or package.json) and have that destroy workspaces. For example, one can ran yarn up browserslist and it will only update your root yarn.lock (without changing any package.json for packages that depend on caniuse in their babel) but break webpack builds since webpack like Internet Explorer (who's death day we recently celebrated 🎂 🥳) doesn't support all of the modern ES features without specific babel plugins.

I'm fine with having a separate option to override default or ignored files. Not sure which option you'd choose but I think the simplest would be something like changesetRegardOverride: ["yarn.lock", "special.json"]. Using something like that you can let's say have changesetIgnorePatterns: "**/*.lock,**/*.json" to ignore all .json and .lock files but then specifically override the ignore list (from any sources--hardcoded or within your config) by including them in the proposed changesetRegardOverride field in .yarnrc.yml.

Not sure if you are aware, but you should be able to achieve this via changesetIgnorePatterns. Its probably not documented very well, but the workspaces list and workspaces foreach commands also look at this config to determine if a file change should be ignored or not.

I do see how that option is critical and would be happy to add to the language of https://yarnpkg.com/cli/workspaces/list and foreach to indicate something like "NOTE: these commands are eligible for .yarnrc.yml configuration options like changesetIgnorePatterns which would allow you to ignore changes to specific files". It wasn't critical for my workflow but I know that at many companies they have a global CHANGELOG.md that has to get updated upon each change and that would wreak havoc for them for sure.

Updating to more closely align the name to the exisiting option.

yarnpkg / berry

[Feature] Allow `changesetIgnoreOverrides` option for yarn workspaces list/foreach #4610