SNEWS2 / snewpy

A Python package for working with supernova neutrinos
https://snewpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Skip integration test for draft PRs #227

Closed: JostMigenda closed this pull request 1 year ago

JostMigenda commented 1 year ago

In #128, we changed the GitHub workflow so that the integration test would not run on draft PRs. Unfortunately, the solution we found back then required a developer to trigger the test manually, so it did not run reliably on PRs, which has caused some problems recently.

After looking at the GitHub documentation some more, I think I’ve found a better way to achieve the same thing. I will test in this PR.
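For context, the usual way to express this in GitHub Actions is a job-level `if` condition on `github.event.pull_request.draft`. The sketch below is illustrative only; the workflow name, job name, and test command are placeholders rather than the actual contents of the snewpy workflow file:

```yaml
# Illustrative GitHub Actions excerpt; names and the test command are placeholders.
name: Integration Tests

on:
  pull_request:

jobs:
  build:
    # Skip the slow integration test while the pull request is still a draft.
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run integration tests
        run: pytest python/snewpy/test/  # placeholder command
```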

JostMigenda commented 1 year ago

In 728da55 you can see nicely that the “Integration Tests / build” job is skipped for this draft PR.

JostMigenda commented 1 year ago

Only downside now is that the integration test doesn’t re-run if a draft is marked as ready for review. But that should be easy to fix in the following commit.

Edited to add: There we go! Skipped initially, then ran after the PR was marked as ready for review.

(FWIW, the GitHub documentation for the pull_request event covers both the available activity types and the format of github.event.pull_request.)
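To illustrate the fix for the re-run issue: by default the `pull_request` trigger only fires on `opened`, `synchronize`, and `reopened`, so the `ready_for_review` activity type has to be listed explicitly. A minimal sketch of the trigger block (the real workflow may list additional types):

```yaml
on:
  pull_request:
    # ready_for_review makes the workflow run again once a draft PR is
    # marked as ready for review; the other three are the default types.
    types: [opened, synchronize, reopened, ready_for_review]
```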

github-actions[bot] commented 1 year ago

Result of Benchmark Tests

| Benchmark | Min | Max | Mean |
|-----------|-----|-----|------|
| python/snewpy/test/test_05_snowglobes.py::test_simulation_chain_benchmark | 5.53 | 5.80 | 5.70 ± 0.12 |

JostMigenda commented 1 year ago

Before we review and merge this, one more question: As part of the integration test, we currently run a timing benchmark and publish the results in a comment on the PR. (@Sheshuk originally added that in https://github.com/SNEWS2/snewpy/pull/132/commits/3bb7899c7a01b524d0b1fe499569b8c16d5820c8)

Over the past year, I don’t remember us ever using the results of that benchmark, because they are so unreliable: it is essentially random chance whether the workflow lands on a GitHub runner that happens to be fast or slow. (E.g. in the comment above, if you view the edit history you can see that the mean time for the exact same commit ranged from 3.49 to 5.47.) In practice, we run manual benchmarks whenever we discuss performance improvements anyway. I would therefore suggest we remove this benchmark (and the workflow that publishes its results).

JostMigenda commented 1 year ago

Okay, I’ve removed all the benchmarking code. @mcolomermolla, do you want to take a quick look and post an approving review if you’re happy with this?