mfisher87 opened this pull request 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 71.78%. Comparing base (b7e07bf) to head (73c1b28).
Special thanks to Chuck Daniels at Development Seed (@chuckwondo) for this work!
In this workflow, integration tests will fail when triggered by a person without permissions. The message that is surfaced in the GUI will tell the maintainer what to do to resolve the issue: re-run the tests after reviewing for security. A link to a GitHub blog post on the subject is included.
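For readers unfamiliar with the pattern, here is a minimal sketch of what such a permission gate can look like; the job name, the allowed author associations, and the message text are illustrative assumptions, not the exact contents of this PR's workflow:

```yaml
jobs:
  integration-tests:
    runs-on: ubuntu-latest
    steps:
      # Fail early when the PR author does not have a write-level association,
      # so a maintainer knows to review the changes for security and then
      # re-run the workflow manually.
      - name: Check PR author permission
        if: >-
          github.event_name == 'pull_request_target' &&
          !contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'),
                    github.event.pull_request.author_association)
        run: |
          echo "PR author does not have write access to this repository."
          echo "A maintainer should review this PR for security, then re-run the failed jobs."
          exit 1
      # ... the actual integration test steps would follow here ...
```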
This is awesome! Thanks @chuckwondo and @mfisher87 for bringing it over.
However, I think it returns us to our earlier issue of having integration tests run automatically for every PR that's opened (and also for every commit to those PRs). We only want it to run, ideally, once per PR (unless it's rerun after dealing with a failure).
We only want it to run, ideally, once per PR (unless it's rerun after dealing with a failure).
I love this idea, but I'm truly not sure how to make that work with a good contributor/reviewer experience yet. For all the great things GH Actions enable, Chuck's giant amazing comment in this YAML file illustrates one of its weaknesses (IMO) -- mentally modeling and handling triggers outside the narrow happy path.
We have made some improvements towards reducing the number of runs here: (1) PRs from forks will always fail until re-run manually, so we can fully control the number of runs. (2) PRs from non-forks will trigger a run per push (not per commit), so those of us with "write" access to the repo can reduce the number of runs by pushing less frequently. (3) Concurrency-cancelling is enabled, so in-progress runs will be canceled when a new run is triggered from the same branch.
I know this isn't perfect, but I think we're stepping closer to the balance we're looking for.
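For reference, the concurrency-cancelling behavior mentioned in (3) is expressed with a top-level `concurrency` block along these lines; the exact group expression here is an assumption, not necessarily what this PR uses:

```yaml
# Cancel an in-progress run for the same workflow + branch when a new one starts.
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
  cancel-in-progress: true
```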
We only want it to run, ideally, once per PR (unless it's rerun after dealing with a failure).
We have made some improvements towards reducing the number of runs here: (1) PRs from forks will always fail until re-run manually, so we can fully control the number of runs. (2) PRs from non-forks will trigger a run per push (not per commit), so those of us with "write" access to the repo can reduce the number of runs by pushing less frequently. (3) Concurrency-cancelling is enabled, so in-progress runs will be canceled when a new run is triggered from the same branch.
Just a minor clarification: regardless of whether or not the PR is from a fork, as long as the author has write permission to the original repo (i.e., maintainers), the integration tests will run. In other words, the integration test "gate" is based upon the PR author's permission, not on where the PR originates.
Just a minor clarification: regardless of whether or not the PR is from a fork, as long as the author has write permission to the original repo (i.e., maintainers), the integration tests will run. In other words, the integration test "gate" is based upon the PR author's permission, not on where the PR originates.
Thanks for clarifying! I was oversimplifying :)
Just a minor clarification: regardless of whether or not the PR is from a fork, as long as the author has write permission to the original repo (i.e., maintainers), the integration tests will run. In other words, the integration test "gate" is based upon the PR author's permission, not on where the PR originates.
Herein is one of the ways icepyx is set up differently from earthaccess that makes
so we with "write" access to the repo can reduce the number of runs by pushing less frequently.
a bigger challenge. Many folks who might not meet a traditional definition of "maintainer" have write access to the repo in order to ease some of the challenges that commonly come with contributing to a library (that's part of why there are also so many branch protections). This has been especially helpful for those still learning the ropes of GitHub and an open-source software contribution process. For some folks, simply asking them to "push less frequently" might not be meaningful, or might add burden to a workflow that is not yet comfortable for them. I know myself that sometimes all I have time for is one commit, even if I know there might be several in response to a PR review. Should I then not push my local changes at the end of the day, to avoid triggering an unnecessary test run? That feels like a bad habit to promote.
We could partially address the above (assuming it's technically doable) by creating a group that specifically has permission to trigger integration tests. However, I still think it's a bad idea to push a specific way of working (i.e., how often one pushes their commits) onto those "maintainers" in order to address this problem.
In summary, my suggestion is to have the integration tests triggered:
- for PRs to main (and main will have a branch protection that requires the integration tests to pass)
- for pushes to main and development (so we have a "current status")
- when intentionally triggered either manually or **from within a commit message**, assuming the triggerer has write permissions to the repo.
For some folks, simply asking them to "push less frequently" might not be meaningful, or might add burden to a workflow that is not yet comfortable for them. I know myself that sometimes all I have time for is one commit, even if I know there might be several in response to a PR review. Should I then not push my local changes at the end of the day, to avoid triggering an unnecessary test run? That feels like a bad habit to promote.
I don't think this is something we should be encouraging, especially for learners. I only mentioned it as a lever we have available to us to reduce the number of runs. I support the premise of optimizing for reducing the environmental impact of unnecessary test runs, but I think not having automated integration test runs on PRs is a fairly high cost to pay. Are you also concerned about incurring costs from running the tests too often? I don't think we're in that ballpark, but I could be wrong.
In summary, my suggestion is to have the integration tests triggered:
- for PRs to main (and main will have a branch protection that requires the integration tests to pass)
- for pushes to main and development (so we have a "current status")
- when intentionally triggered either manually or **from within a commit message**, assuming the triggerer has write permissions to the repo.
It sounds like the way things are now, except the bolded part is not supported. I'm not sure how to do that right now.
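For concreteness, that trigger set might map onto an `on:` block roughly like the sketch below (hypothetical, not this PR's workflow); note there is no built-in trigger keyed to commit-message text, which is the unsupported piece:

```yaml
on:
  pull_request:
    branches:
      - "main"          # PRs targeting main (gated by branch protection)
  push:
    branches:
      - "main"
      - "development"   # "current status" runs
  workflow_dispatch:    # manual trigger by someone with write access
```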
but I think not having automated integration test runs on PRs is a fairly high cost to pay.
Agreed, so I'm wondering if we're using terminology the same way. In an ideal world, we intend to set up an integration test suite that uses its own data source (the repo exists, but it never got set up with data and integrated here) and is able to test much of the read-in functionality and other tasks that require data. This set of tests will be independent of any tests that require an Earthdata login and actually try to download data from NSIDC (the "behind_NSIDC_login" pieces). It's the "can we download data" tests I'm most interested in strictly limiting because: (1) that was part of the understanding with NSIDC for initially setting up any automated tests; (2) we run the risk of beginning to skew our own usage statistics if we're constantly ordering and downloading data with CI. This forces additional layers of processing onto whoever at NSIDC is supplying me with usage statistics (since I get aggregate data without any identifying info like usernames).
It sounds like the way things are now, except the bolded part is not supported.
Yes, but this PR is suggesting we change it to run much more frequently once a PR is opened (and removes the current status runs):
```yaml
on:
  pull_request:
    branches:
      - "main"         # Release PRs
      - "development"  # Feature PRs
  pull_request_target:
    branches:
      - "main"         # Release PRs
      - "development"  # Feature PRs
```
but I think not having automated integration test runs on PRs is a fairly high cost to pay.
Agreed, so I'm wondering if we're using terminology the same way. In an ideal world, we intend to set up an integration test suite that uses its own data source (the repo exists, but it never got set up with data and integrated here) and is able to test much of the read-in functionality and other tasks that require data. This set of tests will be independent of any tests that require an Earthdata login and actually try to download data from NSIDC (the "behind_NSIDC_login" pieces). It's the "can we download data" tests I'm most interested in strictly limiting because: (1) that was part of the understanding with NSIDC for initially setting up any automated tests; (2) we run the risk of beginning to skew our own usage statistics if we're constantly ordering and downloading data with CI. This forces additional layers of processing onto whoever at NSIDC is supplying me with usage statistics (since I get aggregate data without any identifying info like usernames).
Thanks, this context really helps. Can you clarify what you mean by the bolded part? Is there a repo you can link us to?
Does the switch to Harmony change this situation? If so, how?
I'm thinking perhaps it makes sense to use pytest's "marks" to indicate which tests are doing costly data downloads and therefore need to be limited. Then we can run some integration tests more frequently without those impacts, and run the full integration test suite less frequently.
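As a rough sketch of what that split could look like (the marker name `nsidc_download`, the install command, the test paths, and the job layout are assumptions for illustration, not icepyx's actual configuration): tests that order or download data would be decorated with `@pytest.mark.nsidc_download` (with the marker registered in the pytest config), and the workflow would select on it:

```yaml
jobs:
  integration-light:
    # Every PR: integration tests that do NOT order or download data from NSIDC.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ".[dev]"   # dependency install step is illustrative
      - run: pytest tests/integration -m "not nsidc_download"

  integration-full:
    # Manual dispatch (or a schedule): the full suite, including download tests.
    if: github.event_name == 'workflow_dispatch'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ".[dev]"
      - run: pytest tests/integration
```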
It sounds like the way things are now, except the bolded part is not supported.
Yes, but this PR is suggesting we change it to run much more frequently once a PR is opened (and removes the current status runs).
Right. Without the full understanding of the context you've provided here, I didn't understand the impact. Now that I have a better understanding, I still think there are some integration tests we can run more frequently like this, and enable PRs from forks to be more thoroughly tested. Then the costly ones could be run more in line with the description you posted above?