ability to `waive` non-blocking failures in tests

lsm5 commented 1 month ago

Description

In podman and related tools, we're running build and test jobs on both RHEL and CentOS Stream. Often either RHEL or CentOS Stream tests may fail depending on underlying issues like

golang / rust compiler version differences
new changes in CentOS Stream that are not yet in RHEL
upstream fixed to account for new changes in CentOS Stream which haven't yet landed in RHEL

Many of these failures are non-blocking and often unrelated to the PR submitted.

What I'm looking for: /packit waive foo-test, where foo-test could be a value for either a test identifier or the packages key.

Example: https://github.com/containers/buildah/pull/5514 where centos stream tests are failing because of changes in golang that haven't yet landed in RHEL. Over there, I'd love to be able to comment /packit waive buildah-centos and then have the red flag on those tests changed to a neutral color maybe with a Waived status against it.

Benefit

This feature request follows the Fedora bodhi model of running all tests and then waiving known non-blocking issues.

I suspect depending on the label approach with manual triggers might lead to tests not being run because people forgot or were simply not aware.

Importance

It's important in the sense that it will:

reduce annoyance for maintainers who are not fully onboard with packit
not get in the way of repos blocking on certain tests to pass before PRs can be merged (hope github allows this).

What is the impacted category (job)?

General

Workaround

[ ] There is an existing workaround that can be used until this feature is implemented.

Participation

[x] I am willing to submit a pull request for this issue. (Packit team is happy to help!)

lachmanfrantisek commented 1 month ago

Hi @lsm5 !

We've shortly discussed this within a team and have a few questions to properly understand the use case:

Is it ok to do a new test run or should we try to just "fix" the statuses? (Changing the state from the final state is not possible on GitLab.)
Just to check, you don't know this beforehand, right? We were discussing that it's possible on tmt level to define result interpretation.
Are you interested in the result of the hidden/waived job or skip is what you really want?
Do you need to do this for (Packit) jobs or ideally at a more granular level?

lsm5 commented 1 month ago

Hi @lsm5 !

We've shortly discussed this within a team and have a few questions to properly understand the use case:

Thanks!

Is it ok to do a new test run or should we try to just "fix" the statuses?

Not sure what you mean by new test run and fix status in this particular case. The tests in question will consistently fail so I don't think the result will be any different.

(Changing the state from the final state is not possible on GitLab.)

Ack, is it possible to mark a feature as GitHub-only?

Just to check, you don't know this beforehand, right? We were discussing that it's possible on tmt level to define result interpretation.

Thanks for sharing that about TMT. Didn't know this.

Are you interested in the result of the hidden/waived job or skip is what you really want?

Hmm, maybe both. Let's say I review 10 PRs in a day and I see the same failure in the first 2-3 and decide to waive them. It would be convenient to skip the failures on other PRs. Though if I have to choose, I think seeing the failure and manually waiving would be more important than skipping something altogether. Let me know if I misread your question.

Do you need to do this for (Packit) jobs or ideally at a more granular level?

Now that you mention it, more granular level is probably also desirable. In the PR I linked as example, all CentOS Stream jobs are failing so that's why I had the entire job in mind, but if it's only centos-stream-9 failing and centos-stream-10 passing, then waiving for each target would be preferable.

lachmanfrantisek commented 1 month ago

Is it ok to do a new test run or should we try to just "fix" the statuses?

Not sure what you mean by new test run and fix status in this particular case. The tests in question will consistently fail so I don't think the result will be any different.

By new test run I meant to run all the test jobs again as a reaction to the comment command and mark the result of this specific job with a neutral state. By fix the statuses I meant to not trigger any new test run in TF after the comment and just re-set an existing status to have a neutral state.

The first option is a bit more wasteful but probably easier to do. The second might have some issues on GitLab.

Regarding GitLab, we tried to be forge-independent as much as possible but sometimes, it's just not possible -- e.g. check runs are GitHub specific.

lachmanfrantisek commented 1 month ago

Are you interested in the result of the hidden/waived job or skip is what you really want?

Hmm, maybe both. Let's say I review 10 PRs in a day and I see the same failure in the first 2-3 and decide to waive them. It would be convenient to skip the failures on other PRs. Though if I have to choose, I think seeing the failure and manually waiving would be more important than skipping something altogether. Let me know if I misread your question.

Thanks! My idea was that if we do a new run as a reaction to the comment, we could just not run the job that is waived but it won't work because the waived result will be preserved on the GitHub UI. (We need to reset this status to get rid of the failure.)

lachmanfrantisek commented 1 month ago

Just to check, you don't know this beforehand, right? We were discussing that it's possible on tmt level to define result interpretation.

Thanks for sharing that about TMT. Didn't know this.

If we were able to specify the tests to waive statically, we could define this just on the tmt level.

Alternatively, we can think about a generic way how to (re)define tmt context when running a comment command but it would be hard to provide a nice syntax. Especially if you would need this for a more granular level.

lsm5 commented 1 month ago

Is it ok to do a new test run or should we try to just "fix" the statuses?

Not sure what you mean by new test run and fix status in this particular case. The tests in question will consistently fail so I don't think the result will be any different.

By new test run I meant to run all the test jobs again as a reaction to the comment command and mark the result of this specific job with a neutral state. By fix the statuses I meant to not trigger any new test run in TF after the comment and just re-set an existing status to have a neutral state.

The first option is a bit more wasteful but probably easier to do. The second might have some issues on GitLab.

Regarding GitLab, we tried to be forge-independent as much as possible but sometimes, it's just not possible -- e.g. check runs are GitHub specific.

How complicated would it be to prefer the second and default to first depending on github / gitlab ?

lachmanfrantisek commented 1 month ago

How complicated would it be to prefer the second and default to first depending on github / gitlab ?

@lsm5 Do you mean to re-run the test suite on GitLab and change the status to neutral on GitHub? I think it's doable but maybe I would slightly prefer to behave in the same way.

lsm5 commented 1 month ago

@lsm5 Do you mean to re-run the test suite on GitLab and change the status to neutral on GitHub?

Yes

I think it's doable but maybe I would slightly prefer to behave in the same way.

SGTM. Thanks @lachmanfrantisek

packit / packit-service