CI is double-running on PRs

pkienzle commented 2 years ago

Describe the bug All CI tests, installers, and doc builds are running twice on commits to branches once a PR is created. Simple origin, most of our GH actions scripts use a trigger of [push,pull_request]

I propose that we run tests on [push], installers on [pull_request] and docs probably also on [push].

To Reproduce PR commits have 24 run tests instead of the proper 10.

Expected behavior Tests should not be duplicated.

Originally posted by @pbeaucage in https://github.com/SasView/sasview/issues/2307

llimeht commented 2 years ago

I'm not seeing an actual problem being identified here.

Over the years I have made several PRs against all the core sasview modules to increase the amount of testing, because artificial (and quite unusual) limitations to when CI was run were hindering my contributions.

I'll note that much of what suddenly became "off topic" in the previous discussion was rationale for why all these tests were needed in different situations.

Strong -1 on this from me.

butlerpd commented 2 years ago

Sounds like a vote for "close as won't do" and leave as discussion topic until some agreement is achieved on what is needed and how to achieve it? which I think was the reason for the move in the first place?

pbeaucage commented 2 years ago

There's a really simple solution that got buried in the lengthy discussion of git minutiae and the prevailing git usage style of the project: simply add an if-else block to the GH actions script that skips the "PR" test targets if the PR's origin is a branch within the main sasview/sasview repo (such that "push" already covered all needed tests). People that want to use the private fork workflow get the tests, but we don't double run the tests against the overwhelming majority of project contributions that come in via branches of the main sasview/sasview repo.

Yes, GH actions is free, but if people use it moronically, for example, running the exact same test workload twice on the same code, it's set up to go the way of Travis and all the other CI services before it in gutting support for small, open source projects without sponsorship. If you aren't familiar, look at how many different CI providers conda-forge needs to use to do their significant workload for free, that could well be the future if the resource isn't used responsibly. Heck, even if we assume Microsoft will keep it free forever there is still some impact to double running the tests. Our fixed number of runner slots sets up a long queue for runs during busy times like code camps which holds up reviews. And at some level we're burning roughly a node-hour of energy on 100% wasted, double-run tests and installer builds for each commit to a branch of sasview/sasview that's in a PR. I don't have a conversion from Azure cloud hours to kg of CO2 but it's a real resource being wasted for absolutely no benefit whatsoever. That's the 'actual problem being identified'.

lucas-wilkins commented 2 years ago

I, for one, would like to see a simple solution to this implemented.

butlerpd commented 2 years ago

In this "discussion vs issue" discussion, my understanding from @lucas was that "issues" should be for agreed actionable items and everything else is a discussion until there is an agreement on an path forward? If so the question I have is whether @pbeaucage solution is acceptable to all the current needs. I think so?

llimeht commented 2 years ago

There's a really simple solution that got buried in the lengthy discussion of git minutiae and the prevailing git usage style of the project: simply add an if-else block to the GH actions script that skips the "PR" test targets if the PR's origin is a branch within the main sasview/sasview repo (such that "push" already covered all needed tests).

In the git minutiae is the reason why that is a bad idea.

The git workflow used by this project means that the code tested on push is a long long way from the merged code. Not testing the PR means no-one knows that the CI fails until after the merge button has been hit. That's a huge regression.

lucas-wilkins commented 2 years ago

My "convert to discussion" finger is itching.

lucas-wilkins commented 2 years ago

Can we just build a list of jobs, and remove duplicates before running?

pkienzle commented 2 years ago

This is an actionable ticket with a well defined resolution state. Keep it open until we no longer have duplicate jobs or we decide that duplicate jobs are acceptable.

On Thu, Nov 10, 2022 at 10:46 AM Lucas Wilkins @.***> wrote:

My "convert to discussion" finger is itching.

— Reply to this email directly, view it on GitHub https://github.com/SasView/sasview/issues/2378#issuecomment-1310490690, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADTS4DL6W27MDGY3KLJBQLWHUKFZANCNFSM6AAAAAAR2TJ3CE . You are receiving this because you authored the thread.Message ID: @.***>

llimeht commented 2 years ago

This is an actionable ticket with a well defined resolution state. Keep it open until we no longer have duplicate jobs or we decide that duplicate jobs are acceptable.

If that's the criterion then we can already close this issue. There are no duplicate jobs run and there never have been. There are different jobs running against different code.

This might sound like a very pedantic point, but actually is the point.

llimeht commented 2 years ago

If the goal is to reduce CI minutes, there are other easy wins out there:

discuss what Python interpreter the next release will go out with (it shouldn't still be Python 3.8) and run the sasview tests only that version and whatever the most recent interpreter release is. (It's important to keep the most recent release or there move to the next version just becomes progressively more painful)
don't build the installers except when getting close to the release, while understanding that probably means more work required to fix them closer to the release
run tests locally before pushing

lucas-wilkins commented 2 years ago

finger itching increases

don't build the installers except when getting close to the release, while understanding that probably means more work required to fix them closer to the release

I'm down with that, checking the installers every time is a real pain - others probably disagree.

run tests locally before pushing

But will we?

@llimeht I'm a little unclear about exactly what the "duplicates" are doing differently from each other. Can you elaborate on this? (apologies if I'm asking you to repeat yourself), and is this difference dependent on / affected by where the request comes from?

pkienzle commented 2 years ago

Is there a hook we can run when we ask to merge/rebase/squash into main? If those tests fail then reject the merge?

This would allow for fewer tests during the code review process with the caveat that accepting the PR will take longer.

No, this is still on topic regarding the actionable, and not a reason to dump it into discussion.

butlerpd commented 2 years ago

LOL ... so it sounds to me like what we need is an issue/discussion to define what a discussion actually is 😄 For the record, as I read through this, this seems to me like what, IMHO, would classify as a discussion since it seems there is a fundamental disagreement on the premise even (that there is duplication of CI runs on PRs)?

On that point however, I am not sure I follow and echo @lucas-wilkins question to @llimeht

llimeht commented 2 years ago

@llimeht I'm a little unclear about exactly what the "duplicates" are doing differently from each other. Can you elaborate on this? (apologies if I'm asking you to repeat yourself)

A concrete example:

A run on push to the branch 2325-… — actions summary. Looking at one of the runs to see what code got tested, (pay particular attention to the git checkout command):

Checking out the ref
  /usr/local/bin/git checkout --progress --force -B 2325-testing-needs-to-be-hermetic-against-user-config refs/remotes/origin/2325-testing-needs-to-be-hermetic-against-user-config
  Switched to a new branch '2325-testing-needs-to-be-hermetic-against-user-config'
  branch '2325-testing-needs-to-be-hermetic-against-user-config' set up to track 'origin/2325-testing-needs-to-be-hermetic-against-user-config'.
/usr/local/bin/git log -1 --format='%H'
'63e53a8680488a0a9991b6293b3142709a2446c1'

that is, the contents of the branch got tested.

A run on pull-request (the PR requesting to merge 2325) at the same time — actions summary

Looking at one of the runs to see what code got tested:

Checking out the ref
  /usr/local/bin/git checkout --progress --force refs/remotes/pull/2328/merge
  HEAD is now at 774d7d9 Merge 63e53a8680488a0a9991b6293b3142709a2446c1 into 70434fb42227775ef61b595df34e1a650f9fbcc9
/usr/local/bin/git log -1 --format='%H'
'774d7d95c8702142b385e68e61eb6c1652194a19'

That's a different set of git commands, checking out a different ref, and ending with a different hash because it's going to test different code. This time, the merged code got tested. If we turned off CI on PRs, then the CI wouldn't get run at all on the merged code until after the merge is done, and that's quite clearly not what we want either.

If the project were to take the rebasing workflow to its extreme and only accept fast-forward merges (to be clear, I don't advocate for that), then these would be identical same code; with the current workflow they are often quite different code and so testing the merged code is important.

and is this difference dependent on / affected by where the request comes from?

The above is always the case, no matter where the push comes from.

I think it's clear that we always need the pull-request to be tested because the merged code absolutely needs to be tested.

For the tests on push, there are two slightly different situations:

Prior to a PR being opened for a branch, the tests on push are incredibly valuable. CI shouldn't run for the first time only after a PR has been made and GitHub starts emailing people that there's a PR that wants attention. We want branches to get tested so that they're good quality and working before people open PRs, otherwise we end up with even more work-in-progress PRs that shouldn't yet even be vying for developer attention.
Once a PR is already open, the tests on push are not offering a huge amount of value, unless the push tests pass and the merge ones fail or vice versa! As demonstrated above, they are not a double run and they are not duplicates, but they are not providing immensely valuable information. We could consider disabling this subset of test on push (but note that is not turning off all tests on push as was proposed earlier). However, I'm yet to see a way of doing so.

lucas-wilkins commented 2 years ago

@llimeht Thanks.

Prior to a PR being opened for a branch, the tests on push are incredibly valuable. CI shouldn't run for the first time only after a PR has been made and GitHub starts emailing people that there's a PR that wants attention. We want branches to get tested so that they're good quality and working before people open PRs, otherwise we end up with even more work-in-progress PRs that shouldn't yet even be vying for developer attention.

I'm not so sure about this. I tend not to push very regularly except for when making updates to PRs. Perhaps I should, but its not like anyone else will be looking at it. I don't know about other people though. I had a related discussion with @pbeaucage at the code camp. If you want to see if the tests pass you can always make a draft PR. If I understand the docs correctly, updates to the draft should trigger the pull-request action.

So, generally, I'm not sure of the value of push CI.

rozyczko commented 2 years ago

Installer build on push is probably not that important. As @lucas-wilkins says, if you need to access the installers, the best way is to open a draft PR. Testing, however, is important but it does not need to be tied to the pyinstaller step, which is the most expensive part of the build. Run setup.py build on every push, run pytest on every push but limit the installer packaging to releases (including draft).

wpotrzebowski commented 2 years ago

Not sure if this is a helpful solution but how about switching off installer builds by default and then (if it is possible) have on request build (for example whne specific label is set)? We used to have simillar mechanism previously when we had CI based on Jenkins and it somehow worked. Not sure if this https://stackoverflow.com/questions/62325286/run-github-actions-when-pull-requests-have-a-specific-label works when you change/set label from already created PR.

Historically we had quite a few of issues that could be caught earlier by testing installers. Therefore I think it is still useful to have this possibility/practice in place.However, I also think our code is more healthy now and better tested and I agree it may be sometimes overkill to download and test installers for each PR.

rozyczko commented 2 years ago

What Wojtek suggests is a good middle ground between restrictive builds and flexibility. The label can be used to control only the push action, since we probably want to keep making installers for all pull request actions. However, it might also be beneficial to allow on-request creation of installers for PRs.

graph TD;
   subgraph 2. Controlled Push builds;
    A[On Pull Request] --> D[build, tests, installer];
    B[On Push] --> C{Build installer? };
    C-->|Yes| D;
    C -->|No| E[Build, test];
   end
    subgraph 1. Controlled PR and Push builds;
       AA[On Pull Request] --> CC[Build, test];
       BB[On Push] --> CC;
       CC --> DD{Build installer?};
       DD --> |Yes| EE[Build installer];
       DD --> |No| FF[Exit];
end

lucas-wilkins commented 1 year ago

Other possibilities are checking the PR text/title for specific keywords or phrases.

Could even have it look at the state of a check box in the PR text via ${{ github.event.pull_request.body }}, I think

krzywon commented 5 months ago

I believe this is now resolved in the main branch. Closing

SasView / sasview

CI is double-running on PRs #2378