Automattic / a8c-ci-toolkit-buildkite-plugin

A caching plugin that can be invoked from your build script.
22 stars 5 forks source link

Add `pr_only_includes_changed_files_from` utility — to make it possible to skip CI jobs if e.g. no code change #131

Open AliSoftware opened 3 weeks ago

AliSoftware commented 3 weeks ago

Internal References: pe5sF9-38W-p2#comment-3981 , paaHJt-6pg-p2#comment-8788

The Problem

Some CI jobs only makes sense to run on code changes (e.g. UI Tests), yet they run even if the changes in the PR are only about a CHANGELOG.md or release-notes.txt or similar things.

Because some of those jobs might take quite a long time to run by nature (e.g UI Tests), it would be useful to be able to skip them if we detect that no code file were changed.

This is an annoyance that has been reported a couple of times, especially by developers providing feedback after doing their Release Rotation (e.g. @itsmeichigo in pe5sF9-38W-p2#comment-3981)

Another use case would be for monorepos (e.g. buildkite-ci) for which some steps only make sense to run if the corresponding subproject / subdirectory has changed.

Proposed Solution

Avoid false-negative for use case of skipping jobs

❌ At first one could think that we could implement a logic of "only run UI Tests if app or test files changed". For example only if the PR contains changes in *.swift files.

But this would risk leading to false-negative and skip UI Tests for cases where they should actually have run—e.g. if one have updated image assets in the codebase that ends up make a screen higher than before and thus making a UI test fail to scroll enough; or if a project uses wiremock or similar and updated a .json file describing network calls… And it would be cumbersome to list all the possible files that should led to a run of UI Tests, and easy to forget some cases.

👍 It would be less risky to approach this kind of logic the other way around, i.e. "only skip UI Tests if only non-app-nor-test files changed. (e.g. only skip UI Tests if the only files that changed are amongst README.md, RELEASE-NOTES.txt, …)

This is because it's less problematic to have the UI Tests run when they could have been skipped than the other way around.

So with that approach, even if we forget to list a file (e.g. Config/Version.xcconfig) for which the UI Tests could be skipped and UI Tests end up running, it's only a waste of CI time that could have been shaved, as opposed to UI Tests not being run when they should have and us failing to detect that the PR would break tests.

Implementation Suggestions

Add a a8c-ci-toolkit utility named something like pr_changed_files, which could take a flag of either --only-amongst[^1] or --contains [^2] + a list of file patterns, and return true or false accordingly.

[^1]: --only-amongst operator would be ideal for the use case of skipping steps like UI Tests [^2]: --contains operator would be ideal for use cases of a monorepo wanting to only run a step if files from a subproject/subdir changed

Here are a couple of git tricks we can use to get the list of changed files from the PR

MERGE_BASE=$(git merge-base origin/"${BUILDKITE_PULL_REQUEST_BASE_BRANCH}" HEAD)
CHANGED_FILES=$(git --no-pager diff --name-only "$MERGE_BASE" HEAD)

Or more succintly:

CHANGED_FILES=$(git --no-pager diff --name-only --merge-base "$BUILDKITE_PULL_REQUEST_BASE_BRANCH" HEAD)

Then we can loop through the list of CHANGED_FILES to check if we find any in that list that is not in the list provided by"$@".

Or if we want to avoid a double-nested-loop, we might be able to use diff to compare the two lists. Something like this maybe? (not tested)

diff <(git diff --name-only --merge-base "$BUILDKITE_PULL_REQUEST_BASE_BRANCH" HEAD | sort) <(printf '%s\n' "$@" | sort | uniq) | grep -q '^< '

And check if the command returned 0 (i.e. it found at least one line in the diff that was on the CHANGED_FILES list but was not in the "$@" list) then we have extra files not in the list and we should make the script exit with non-zero, while if the command returned 1 (i.e. it did not find any diff line starting with < so all the files listed on the left list were also present in the right list) we should make the script exit 0 and succeed.

AliSoftware commented 6 days ago

Proposed solution TL;DR:

# Use case 1: Skip UI Tests if the only changed files are README.md and/or RELEASE-NOTES.txt
# We could use this on most of our app repos that have UI Tests. The `--only-in` behavior avoids risk of false-negatives.
SKIP_UI_TESTS=$(pr_changed_files --only-amongst README.md RELEASE-NOTES.txt)
# Then use `if: !build.env('SKIP_UI_TESTS')` on the step to only run if not detected it can be skipped

# Use case 2: On a monorepo, only run some steps if files that are under the folder of a given subproject are changed
# We could use this for e.g. https://github.com/Automattic/buildkite-ci/blob/trunk/.buildkite/pipeline.yml#L57-L69
# The `--contains` behavior is a better fit for those kind of use cases
PROJECT1_CHANGED=$(pr_changed_files --contains project-1/)
PROJECT2_CHANGED=$(pr_changed_files --contains project-2/)
# Then use `if: build.env('PROJECT1_CHANGED')` on the steps for building `project-1`, to only run those if files for that project changed, and skip them otherwise

💡 Alternate name ideas for the operators:


Also, one could ponder about if the command should return:

  1. Always exit 0 and return a String—so that it's easy to assign to a variable with VAR=$(pr_changed_files). In that case it could be debatable if that string should be "true"/"false", or 1/0
  2. Or report its result using exit code—which would better follow POSIX conventions (and allow the command result to be used in if conditions in bash more natively); but that would require the call site to look more complicated when we want to assign to an env var, e.g. SKIP_UI_TESTS=$(pr_changed_file --only-amongst … && echo "true" || echo "false") and would be a trick easy to forget to add, leading to the shared-pipeline-vars script to fail early instead

The first solution seems better and less error-prone for the call site, but would really not be POSIX-compliant…