cirrus.yml: implement skips based on source changes

Luap99 commented 3 months ago

We do not have to test everything for each PR, we can know based on the source if we changed (i.e. machine code) and only run the tests then.

This implements it as skip conditions, due to the nature of yaml files we unfortunately cannot deduplicate everything, i.e. the is PR check and danger files apply to everything but as skip is only a single yaml string we cannot deduplicate parts of that string. If anyone knows a way to achieve this I like to hear it.

For now I implemented this for int, system, bud and machine tests. Once we are more comfortable with this I plan on adding it to other tests as well.

This will replace the current _bail_if_test_can_be_skipped logic as it covers more, marks tasks actually skipped in the github UI and works even for the windows/macos machine tests.

Does this PR introduce a user-facing change?

None

openshift-ci[bot] commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/containers/podman/blob/main/OWNERS)~~ [Luap99] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment

Luap99 commented 3 months ago

@cevich @edsantiago PTAL

The duplication is annoying for sure but I cannot think of something better, open for ideas.

My current idea to test this would be cherry-pick this commit into a new PR, remove .cirrus.yml from the danger files and then add a new commit on top with "real" changes to see that the task runs when we expected it too.

Luap99 commented 3 months ago

Ok some basic checks, not the full matrix but I think this is good enough:

Looks like it works as I expect and it clearly marks tests as skipped to the PR author/reviewer so it should be obvious what is actually being run.

cevich commented 3 months ago

Non-blocking / nice-to-have: Remove contrib/cirrus/CIModes.md and all the supporting only_if: logic in this PR or as a followup. It's too hard for humans to figure out what the combination of complex only_if and skip means. Let's just have one please :nerd_face:

Otherwise, this is nice work so far @Luap99

Luap99 commented 3 months ago

Non-blocking / nice-to-have: Remove contrib/cirrus/CIModes.md and all the supporting only_if: logic in this PR or as a followup. It's too hard for humans to figure out what the combination of complex only_if and skip means. Let's just have one please 🤓

Absolutely, I like to remove all of the special magic CI modes. However first I needed to be sure this here even works. Second I like to go incremental on this. Currently we cannot replace CI:DOCS as other tests such as compose or APIv2 are not yet covered by file based rules. So I like to keep them until I migrated all task over.

Luap99 commented 3 months ago

I think there might be instances where we don't want to skip anything, ie release PR's, even if certain files are not changed. In this case, it could be done by checking for edits in the rawversion.go, but there might be other instances where running everything would make sense. Maybe consider a override label? I'd be okay with this in a followup PR or something.

Yes thanks, I totally forget about the version file. I need to look into what options we have for a manual overwrite. Ideally nobody should need that but I agree that having such option might be useful just in case.

cevich commented 3 months ago

first I needed to be sure this here even works.

Yes, testing it (and changesInclude() in general) is hard since .cirrus.yml needs to be in the list :disappointed: I haven't been able to figure out a good way to get around that. The only thing I can think of is to leave that file out initially, do your testing, then stick it back in for final PR review.

There's also containers/automation_sandbox you're welcome to play in. Perhaps exchange all the gce_instance and ec2_instance with containers, and mock out all the scripts to /bin/true. That will save you from needing to mess with re-generating all the secrets and iterate quickly.

Luap99 commented 3 months ago

Well I already tested it here, see https://github.com/containers/podman/pull/23030#issuecomment-2176393337

Luap99 commented 3 months ago

Pushed a new version:

Fixed comment style and added test as requested by @edsantiago
added missing test/buildah-bud to the bud test file list in case someone only touches the diff files there.
added version/rawversion/* to the danger files as pointed out by @ashley-cui , this should make sure all release PRs run all tests. We never want to release something where tests could have told us something is broken.

Still missing is manual overwrite option, the best I can think right now is to use a special title like CI:ALL like what we do with CI:DOCS today, opinions on that?

cevich commented 3 months ago

opinions on that?

:+1:

edsantiago commented 3 months ago

Reading the conditions makes my brain hurt, but that should (fingers crossed) only be a problem now at review time, not in the future. Logicwise, I can't find any cases this won't handle, so I'm close to LGTM. I'd still like to think about it some more.

As for CI:ALL, it's either that or define a trigger file like force-ci. I think CI:ALL is slightly cleaner, and consistent with the CI:XXX habits we've trained ourselves on.

edsantiago commented 3 months ago

Possible suggestion for the sanity check:

diff --git a/contrib/cirrus/cirrus_yaml_test.py b/contrib/cirrus/cirrus_yaml_test.py
index 49d35d1ff..5ab43ce5d 100755
--- a/contrib/cirrus/cirrus_yaml_test.py
+++ b/contrib/cirrus/cirrus_yaml_test.py
@@ -63,14 +63,17 @@ class TestDependsOn(TestCaseBase):
     def test_skips(self):
         """2024-06 PR#23030: ugly but necessary duplication in skip conditions. Prevent typos or unwanted changes."""
         beginning = "$CIRRUS_PR != '' && !changesInclude('.cirrus.yml', 'Makefile', 'contrib/cirrus/**', 'vendor/**', 'hack/**', 'version/rawversion/*') && "
+        real_source_changes = " && !(changesInclude('**/*.go', '**/*.c') && !changesIncludeOnly('test/**', 'pkg/machine/e2e/**'))"

         for task_name in self.ALL_TASK_NAMES:
             task = self.CIRRUS_YAML[task_name + '_task']
             if 'skip' in task:
                 skip = task['skip']
                 if 'changesInclude' in skip:
-                    msg = ('{0}: invalid skip "{1}"'.format(task_name, skip))
-                    self.assertEqual(skip[:len(beginning)], beginning, msg=msg)
+                    msg = ('{0}: invalid skip'.format(task_name))
+                    self.assertEqual(skip[:len(beginning)], beginning, msg=msg+": beginning part is wrong")
+                    if 'changesIncludeOnly' in skip:
+                        self.assertEqual(skip[len(skip)-len(real_source_changes):], real_source_changes, msg=msg+": changesIncludeOnly() part is wrong")

     def not_task(self):
         """Ensure no task is named 'task'"""

Reasons: (1) check the tail part, it's too hard to check by hand; and (2) remove clutter from message, it makes the failure message unreadable.

ashley-cui commented 3 months ago

LGTM, I like CI:ALL

cevich commented 3 months ago

I'm not against CI:ALL, I think that's fine for now. I'm slightly more attracted to Ed's suggestion of a force-ci file though. Simply for the reason that it allows removing the CIModes.md docs, and it greatly simplifies the brain-work required to reconcile complex only_if and skip attributes. Maybe that could be considered in a follow-up PR?

Luap99 commented 3 months ago

As for CI:ALL, it's either that or define a trigger file like force-ci

Well I mean technically one can already just add a newline in Makefile/cirrus.yml or some other file from the danger zone so I rather not add another file for it.

I think I go CI:ALL for now and then we can reconsider once I am at the point where we can remove the other special titles such as CI:DOCS.

cevich commented 3 months ago

I think I go CI:ALL for now

Cool with me. I'm okay too with followup improvements to cirrus_yaml_test.py. No reason to make everyone wait for test improvements, nobody else is working on this skip-logic stuff.

In case it's not been said: Really nice work here Paul :clap: It's really important (to me at least) that the team is well engaged with our own automation systems. You've really "stepped up to the plate" recently and it's both noted and appreciated.

Luap99 commented 3 months ago

Pushed the test change from @edsantiago and now another commit to add CI:ALL support. This should be ready.

edsantiago commented 3 months ago

/lgtm

cevich commented 3 months ago

LGTM

containers / podman

cirrus.yml: implement skips based on source changes #23030

Does this PR introduce a user-facing change?